@iamjustinhsu iamjustinhsu commented Dec 11, 2025

For my future autoscaler changes, I would like to divide and multiply execution resources. See https://github.com/anyscale/rayturbo/pull/2778 for more info.

a = ExecutionResources(memory=1E-6)
b = ExecutionResources(memory=1E6)
a.multiply(b)

In the example above, a * b returns 0 because the small operand is rounded away before the multiplication. Ideally, the precision should be preserved. My approach is to have _cpu, _gpu, _memory, and _object_store_memory hold the full-precision values, while the public cpu, gpu, memory, and object_store_memory properties return rounded values.
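
The split between private full-precision values and rounded public properties can be sketched as follows. This is a minimal illustration only: `Resources` is a hypothetical single-field stand-in, not the actual ExecutionResources class, and the 5-decimal rounding is an assumption taken from the code comment in this PR.

```python
class Resources:
    """Hypothetical stand-in for ExecutionResources with a single field."""

    def __init__(self, memory: float = 0.0):
        # The private attribute keeps the full-precision value.
        self._memory = memory

    @property
    def memory(self) -> float:
        # The public property rounds on access (5 decimals is an assumption).
        return round(self._memory, 5)

    def __mul__(self, other: "Resources") -> "Resources":
        # Multiply the private values so no precision is lost beforehand.
        return Resources(memory=self._memory * other._memory)


a = Resources(memory=1e-6)
b = Resources(memory=1e6)

# Multiplying the rounded public values drops the small operand entirely.
lossy = a.memory * b.memory

# Multiplying through the private values keeps it; rounding happens only on read.
precise = (a * b).memory
```

Here `lossy` ends up as zero because `a.memory` has already been rounded to zero, while `precise` keeps the product of the unrounded values.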

@iamjustinhsu iamjustinhsu requested a review from a team as a code owner December 11, 2025 00:14
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request makes a solid improvement by separating the internal high-precision resource values from the public, rounded properties. This effectively addresses the precision loss issue described. The addition of __truediv__ and __mul__ is a good extension for the class's functionality.

My main feedback is to add safe handling for division-by-zero in __truediv__ to prevent potential runtime errors. I've also suggested reordering arguments in the new dunder methods for better consistency with the rest of the class. Overall, these are great changes.

Comment on lines 269 to 278
    def __truediv__(self, other: "ExecutionResources") -> "ExecutionResources":
        # NOTE: We add access each resource privately because we want to preserve the
        # decimal precision. The public properties will now on runtime safe_round to
        # 5 decimals.
        return ExecutionResources(
            cpu=self._cpu / other._cpu,
            gpu=self._gpu / other._gpu,
            memory=self._memory / other._memory,
            object_store_memory=self._object_store_memory / other._object_store_memory,
        )

high

This implementation of __truediv__ doesn't handle division by zero, which will raise a ZeroDivisionError if any resource in other is zero. This can lead to unexpected crashes. It's better to define a safe behavior for this case. For resource calculations, a common approach is to treat x / 0 as inf (for x > 0) and 0 / 0 as 0.

I've also reordered the arguments to match the __init__ method for consistency and fixed a minor typo in the comment.

    def __truediv__(self, other: "ExecutionResources") -> "ExecutionResources":
        # NOTE: We access each resource privately because we want to preserve the
        # decimal precision. The public properties will now on runtime safe_round to
        # 5 decimals.
        def _safe_div(a, b):
            if b == 0.0:
                return float("inf") if a != 0.0 else 0.0
            return a / b

        return ExecutionResources(
            cpu=_safe_div(self._cpu, other._cpu),
            gpu=_safe_div(self._gpu, other._gpu),
            object_store_memory=_safe_div(
                self._object_store_memory, other._object_store_memory
            ),
            memory=_safe_div(self._memory, other._memory),
        )
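
To make the behavior of the suggested helper concrete, here is a standalone sketch. The function `safe_div` below is a copy of the `_safe_div` logic from the suggestion, lifted out of the method purely for illustration; the variable names are hypothetical.

```python
def safe_div(a: float, b: float) -> float:
    """Standalone copy of the suggested helper, for illustration only."""
    if b == 0.0:
        return float("inf") if a != 0.0 else 0.0
    return a / b


# Plain float division by zero raises in Python.
try:
    1.0 / 0.0
    raised = False
except ZeroDivisionError:
    raised = True

results = {
    "positive / zero": safe_div(4.0, 0.0),
    "zero / zero": safe_div(0.0, 0.0),
    "normal": safe_div(4.0, 2.0),
    # Only `a != 0.0` is checked, so a negative numerator also maps to +inf.
    "negative / zero": safe_div(-4.0, 0.0),
}
```

Note that, as written, the helper does not distinguish the sign of the numerator, which may or may not be the intended behavior for resource values.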

Comment on lines 280 to 286
    def __mul__(self, other: "ExecutionResources") -> "ExecutionResources":
        return ExecutionResources(
            cpu=self._cpu * other._cpu,
            gpu=self._gpu * other._gpu,
            memory=self._memory * other._memory,
            object_store_memory=self._object_store_memory * other._object_store_memory,
        )

medium

For consistency with the __init__ method and other methods in this class, the keyword arguments for ExecutionResources should be in the order cpu, gpu, object_store_memory, memory. This improves readability and maintainability.

    def __mul__(self, other: "ExecutionResources") -> "ExecutionResources":
        return ExecutionResources(
            cpu=self._cpu * other._cpu,
            gpu=self._gpu * other._gpu,
            object_store_memory=self._object_store_memory * other._object_store_memory,
            memory=self._memory * other._memory,
        )

Signed-off-by: iamjustinhsu <[email protected]>
@iamjustinhsu iamjustinhsu changed the title from "[data] Add more dunder methods to ExecutionResources" to "[data] Add more div/multiply methods to ExecutionResources" Dec 11, 2025
return float("nan")
if a > 0.0:
return float("inf")
return float("-inf")

Bug: Division of NaN by zero returns wrong value

The safe_div function mishandles the case where the numerator is NaN and the denominator is 0.0. Since nan == 0.0 is False and nan > 0.0 is also False, the function falls through to return -inf instead of the mathematically correct nan. Adding an explicit math.isnan check at the start of safe_div would fix this edge case.
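
A minimal sketch of the suggested fix follows. Only the tail of the original `safe_div` is visible in the snippet, so the overall shape of the function here (the `b == 0.0` and `a == 0.0` checks) is an assumption reconstructed from the description, not the PR's actual code.

```python
import math


def safe_div(a: float, b: float) -> float:
    # Suggested fix: propagate a NaN numerator before any comparison,
    # since every comparison against NaN evaluates to False.
    if math.isnan(a):
        return float("nan")
    if b == 0.0:
        # Assumed shape of the original zero-denominator branch.
        if a == 0.0:
            return float("nan")
        if a > 0.0:
            return float("inf")
        return float("-inf")
    return a / b


nan_over_zero = safe_div(float("nan"), 0.0)
```

With the early check in place, `nan_over_zero` is NaN instead of falling through to the negative-infinity branch.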

@ray-gardener ray-gardener bot added the data Ray Data-related issues label Dec 11, 2025
@iamjustinhsu iamjustinhsu deleted the jhsu/make-execution-resources-dividable branch December 11, 2025 18:00