[Ray Core] ray.wait with num_returns=1 is pretty slow #49905

Open
pcmoritz opened this issue Jan 17, 2025 · 0 comments
Labels
bug: Something that is supposed to be working; but isn't
core: Issues that should be addressed in Ray Core
P1: Issue that should be fixed within a few weeks

pcmoritz commented Jan 17, 2025

What happened + What you expected to happen

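I have a feeling that unbatched ray.wait (num_returns=1) is pretty slow. I expect it to be slower than batched ray.wait, but the difference is extreme: in the session below, waiting on 10,000 tasks takes 7.52 s of wall time with num_returns=1, versus 1.58 s with num_returns=5 and 531 ms with num_returns=20. We can probably optimize this.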

In [3]: import ray

In [4]: @ray.remote
   ...: def f(i):
   ...:     return str(i)

In [8]: def g(rest):
   ...:     while True:
   ...:         done, rest = ray.wait(rest, num_returns=1)
   ...:         if len(rest) == 0:
   ...:             return
   ...:         ray.get(done)
   ...: 

In [9]: objs = [f.remote(i) for i in range(10000)]

In [10]: %time g(objs)
CPU times: user 7.05 s, sys: 375 ms, total: 7.42 s
Wall time: 7.52 s

In [11]: def g(rest):
    ...:     while True:
    ...:         done, rest = ray.wait(rest, num_returns=5)
    ...:         if len(rest) == 0:
    ...:             return
    ...:         ray.get(done)
    ...: 

In [12]: objs = [f.remote(i) for i in range(10000)]

In [13]: %time g(objs)
CPU times: user 1.48 s, sys: 99.9 ms, total: 1.58 s
Wall time: 1.58 s

In [14]: def g(rest):
    ...:     while True:
    ...:         done, rest = ray.wait(rest, num_returns=20)
    ...:         if len(rest) == 0:
    ...:             return
    ...:         ray.get(done)
    ...: 

In [15]: objs = [f.remote(i) for i in range(10000)]

In [16]: %time g(objs)
CPU times: user 457 ms, sys: 71 ms, total: 528 ms
Wall time: 531 ms
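
One way to read the timings above: the loop hands the entire remaining list to ray.wait on every iteration, so draining n refs k at a time makes about n/k calls over lists that shrink from n down to k. The sketch below only counts how many refs are passed to ray.wait in total for each batch size; it does not call Ray. That the cost of a call grows with the length of the list is an assumption here, not something confirmed in this issue, and `refs_passed_to_wait` is a hypothetical helper name.

```python
def refs_passed_to_wait(n, k):
    """Total refs handed to ray.wait when draining n refs, k per call.

    Mirrors the loop in the session above: every call receives the whole
    remaining list, and k refs are removed from it each time.
    """
    total = 0
    remaining = n
    while remaining > 0:
        total += remaining              # one wait call over everything left
        remaining -= min(k, remaining)  # k refs come back as "done"
    return total


n = 10_000
baseline = refs_passed_to_wait(n, 1)
for k in (1, 5, 20):
    passed = refs_passed_to_wait(n, k)
    calls = -(-n // k)  # ceiling division: number of wait calls
    print(
        f"num_returns={k:>2}: {calls:>5} calls, "
        f"{passed:>10,} refs passed, {baseline / passed:.1f}x less than k=1"
    )
```

Under that assumption the model predicts roughly 5x and 20x less work for num_returns=5 and num_returns=20. The measured wall-time ratios are about 4.8x (7.52 s / 1.58 s) and 14.2x (7.52 s / 531 ms), so list length per call would explain most of the gap but not all of it.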

Versions / Dependencies

Ray 2.40

Reproduction script

See the IPython session above.
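
For anyone who wants to rerun the comparison outside IPython, here is a minimal sketch of the same loop as one parameterized helper. `drain` and `time_drain` are hypothetical names, and the wait and get functions are passed in so the helper can be exercised without a cluster. It differs from the session in two small ways: it fetches each batch before checking whether anything is left, so the final batch is not skipped, and it caps num_returns at the number of remaining refs, since as far as I know ray.wait rejects a num_returns larger than the list it is given.

```python
import time


def drain(refs, num_returns, wait, get):
    """Fetch every result, num_returns at a time; return the number of wait calls."""
    rest = list(refs)
    calls = 0
    while rest:
        # Cap the batch so a final, smaller batch is still a valid request.
        done, rest = wait(rest, num_returns=min(num_returns, len(rest)))
        get(done)  # fetch before the loop condition is re-checked
        calls += 1
    return calls


def time_drain(make_refs, num_returns, wait, get):
    """Create fresh refs, drain them, and return (wait calls, elapsed seconds)."""
    refs = make_refs()
    start = time.perf_counter()
    calls = drain(refs, num_returns, wait, get)
    return calls, time.perf_counter() - start


# Intended use with Ray (assumes a running Ray runtime; not executed here):
#
#   import ray
#   ray.init()
#
#   @ray.remote
#   def f(i):
#       return str(i)
#
#   for k in (1, 5, 20):
#       make = lambda: [f.remote(i) for i in range(10000)]
#       print(k, time_drain(make, k, ray.wait, ray.get))
```

Submitting a fresh batch of tasks per run matches the session above, where `objs` is rebuilt before each `%time` call.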

pcmoritz added the bug, core, P1, and triage (needs triage, e.g. priority, bug/not-bug, and owning component) labels on Jan 17, 2025
jjyao removed the triage label on Jan 22, 2025