Skip to content

Conversation

@leofang
Copy link
Member

@leofang leofang commented Sep 26, 2025

Description

closes #779.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Sep 26, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@leofang leofang self-assigned this Sep 26, 2025
@leofang leofang added P0 High priority - Must do! cuda.core Everything related to the cuda.core module breaking Breaking changes are introduced labels Sep 26, 2025
@leofang leofang added this to the cuda.core beta 7 milestone Sep 26, 2025
@leofang
Copy link
Member Author

leofang commented Sep 26, 2025

/ok to test 738eed4

@github-actions

This comment has been minimized.

@leofang leofang added the blocked This task is currently blocked by other tasks label Oct 5, 2025
@leofang
Copy link
Member Author

leofang commented Oct 5, 2025

(This PR is now based entirely on #1070 except for the last commit, so that PR needs to be merged first.)

@leofang
Copy link
Member Author

leofang commented Oct 5, 2025

/ok to test 605241e

@leofang leofang requested a review from shwina October 6, 2025 13:13
Copy link
Collaborator

@kkraus14 kkraus14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way we could add some testing for stream ordered allocations / deallocations?

@kkraus14
Copy link
Collaborator

kkraus14 commented Oct 7, 2025

Changes LGTM on top of #1070

@leofang leofang removed the blocked This task is currently blocked by other tasks label Oct 7, 2025
@leofang leofang marked this pull request as ready for review October 7, 2025 19:28
@leofang
Copy link
Member Author

leofang commented Oct 7, 2025

Merged with main!

@leofang
Copy link
Member Author

leofang commented Oct 7, 2025

/ok to test 33345d6

@leofang leofang requested review from kkraus14 and rparolin October 7, 2025 20:01
If the buffer is deallocated without an explicit stream, the allocation stream
is used.
"""
if stream is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we intend to remove this? Based on the deallocate API signature a user would deem it valid to pass None as a stream argument. That None is not checked in the _deallocate function above when the stream object is called with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very interesting question that's worth expanding.. From my perspective mr.deallocate() is not meant to be called by end users because

  1. it was designed to be called by Buffer.close() or its destructor when out of scope
  2. it has such an ugly, not pythonic signature (passing pointer and size)

I don't think currently we have any test that actually exercise calling mr.deallocate() explicitly.

If this can be called by the end users, the contract that we deallocate using the allocation stream would not be upheld, because this API does not take any Buffer argument and so the stream is not available to us.

@leofang leofang merged commit 4cbb627 into NVIDIA:main Oct 7, 2025
71 checks passed
@leofang leofang deleted the dealloc_stream branch October 7, 2025 22:33
@leofang
Copy link
Member Author

leofang commented Oct 7, 2025

Thanks, all! I noticed I forgot to prepare a release note for this PR. Working on it now...

@leofang
Copy link
Member Author

leofang commented Oct 7, 2025

Working on it now...

#1097

@github-actions
Copy link

github-actions bot commented Oct 7, 2025

Doc Preview CI
Preview removed because the pull request was closed or merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Breaking changes are introduced cuda.core Everything related to the cuda.core module P0 High priority - Must do!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Buffer deallocation always happens on default stream

4 participants