Contributor

@hlinsen commented Oct 7, 2025

In C++, when a new thread is spawned, the CUDA driver by default creates a legacy default stream for each thread, each with a unique ID.
When concurrent mode was called from Python, PDLP and Barrier ran on the same stream.
We now explicitly create a new handle for Barrier with a new stream.
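
To make the distinction concrete, here is a toy model in plain C++ of two solvers sharing a stream through a copied handle versus the second solver receiving a handle with its own stream. It is only a sketch: `Stream`, `StreamPool`, `Handle`, and the three helper functions are hypothetical stand-ins invented for illustration and are not CUDA, RAFT, or RMM APIs. It also assumes that copying a handle shares the underlying stream, which is what the description above implies but which this sketch does not verify against the real libraries.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for illustration only; these are NOT the CUDA,
// RAFT, or RMM types used in the PR.
struct Stream {
  int id;  // unique identifier of the modeled stream
};

// Hands out streams with increasing ids, mimicking explicit stream creation.
class StreamPool {
 public:
  std::shared_ptr<Stream> create() {
    return std::make_shared<Stream>(Stream{next_id_++});
  }

 private:
  int next_id_ = 1;
};

// A handle refers to the stream its work is queued on.
// Assumption: copying a handle shares the stream instead of creating one.
struct Handle {
  std::shared_ptr<Stream> stream;
};

// Before the fix (as modeled): the Barrier handle is a plain copy, so it
// shares the problem's stream with PDLP.
inline Handle copy_handle(const Handle& problem_handle) {
  return problem_handle;
}

// After the fix (as modeled): the Barrier handle gets a freshly created stream.
inline Handle handle_with_new_stream(StreamPool& pool) {
  Handle h{pool.create()};
  assert(h.stream != nullptr);
  return h;
}

// True when work submitted through both handles lands on the same stream.
inline bool shares_stream(const Handle& a, const Handle& b) {
  return a.stream->id == b.stream->id;
}
```

In this model, a handle produced by `copy_handle` shares its stream with the original, while one produced by `handle_with_new_stream` does not, which mirrors the intent of giving Barrier a separate stream from PDLP.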

@hlinsen requested review from a team as code owners October 7, 2025 05:42
@hlinsen added the "non-breaking" (Introduces a non-breaking change) and "bug" (Something isn't working) labels Oct 7, 2025
@hlinsen changed the title from "Explicitely create new stream" to "Explicitely create new stream for Barrier concurrent" Oct 7, 2025
// Otherwise, CUDA API calls to the problem stream may occur in both threads and throw graph
// capture off
auto barrier_handle = raft::handle_t(*op_problem.get_handle_ptr());
detail::problem_t<i_t, f_t> d_barrier_problem(problem);
Contributor

Suggested change
detail::problem_t<i_t, f_t> d_barrier_problem(problem);
auto barrier_stream = rmm::cuda_stream_per_thread;
raft::resource::set_cuda_stream(barrier_handle, barrier_stream);
detail::problem_t<i_t, f_t> d_barrier_problem(problem, barrier_handle);

Contributor Author

I considered that overload, but I don't think it is needed. According to the RMM documentation and the current problem_t implementation, passing a new stream creates a deep copy on that stream.

Collaborator

@rgsl888prabhu left a comment

The CMake changes look good.

Contributor

@chris-maes left a comment

LGTM

@rgsl888prabhu
Collaborator

/merge

@rapids-bot (bot) merged commit fe76279 into NVIDIA:branch-25.10 Oct 7, 2025
168 of 174 checks passed
4 participants