separate comm init from getXCClComm #2090
Pull Request Overview
This PR separates communication initialization logic from the communication retrieval process by splitting the getXCCLComm method into two distinct functions.
- Extracted communication lookup logic into a new `getXCCLComm` method that only retrieves existing communicators
- Created a new `initXCCLComm` method that handles the initialization of new communicators
- Updated call sites to first check for existing communicators before initializing new ones
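
The lookup/initialize split described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FakeComm`, `CommCache`, `getComm`, `initComm`, and `getOrInit` are hypothetical names, and the real `getXCCLComm` / `initXCCLComm` signatures and locking may differ.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real communicator type (assumption).
struct FakeComm {
  int id;
};

// Hypothetical cache that keeps lookup and initialization as separate steps.
class CommCache {
 public:
  // Lookup only: returns nullptr when nothing is cached for `key`.
  std::shared_ptr<FakeComm> getComm(const std::string& key) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = comms_.find(key);
    if (it == comms_.end()) {
      return nullptr;
    }
    return it->second;
  }

  // Initialization only: creates and caches a communicator for `key`.
  // If one already exists, it is returned unchanged.
  std::shared_ptr<FakeComm> initComm(const std::string& key) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = comms_.find(key);
    if (it != comms_.end()) {
      return it->second;
    }
    auto comm = std::make_shared<FakeComm>(FakeComm{nextId_++});
    comms_[key] = comm;
    return comm;
  }

 private:
  std::mutex mutex_;
  std::unordered_map<std::string, std::shared_ptr<FakeComm>> comms_;
  int nextId_ = 0;
};

// Call-site pattern: look up first, initialize only on a miss.
inline std::shared_ptr<FakeComm> getOrInit(CommCache& cache,
                                           const std::string& key) {
  std::shared_ptr<FakeComm> comm = cache.getComm(key);
  if (comm == nullptr) {
    comm = cache.initComm(key);
  }
  return comm;
}
```

In this sketch a caller would write `auto comm = getOrInit(cache, key);`. Keeping the lookup free of side effects means a caller that only wants to know whether a communicator already exists can call `getComm` directly without triggering initialization.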
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/xccl/ProcessGroupXCCL.hpp | Added method declaration for new getXCCLComm overload and initXCCLComm method |
| src/xccl/ProcessGroupXCCL.cpp | Implemented the separation of concerns by creating distinct lookup and initialization methods, updated call sites |
      op_id_++;
    - auto comm = getXCCLComm(key, device, opType, p2pRank, isSendRecvSelf);
    + std::shared_ptr<xcclComm_t> comm = getXCCLComm(key);
This seems no different from the previous code, so why separate it into two APIs/steps?
This is just to align with NCCL. To facilitate future feature integration, we can align this part first.
To me updated code is better structured. I am supportive of the change.
Co-authored-by: Dmitry Rogozhkin <[email protected]>
LGTM
No description provided.