Implement @data_distributed Decorator #1098
Conversation
PR updated based on the comments.
```python
def data_distributed(method):
    """This decorator makes a target method of a module capable of being data ...
```
It seems that this decorator can only apply to methods of `Algorithm`? If so, the docstring should clarify this.
It is not just for `Algorithm`, but actually for all `nn.Module`-derived classes. One of the prerequisites is that the module needs to have `self._ddp_activated_rank`, which is stated in the docstring as a contract.
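For illustration, a minimal sketch of that contract on an arbitrary `nn.Module` (the class name, method name, and import path below are assumptions for the example, not part of this PR):

```python
# Minimal sketch of the contract described above (illustrative names only).
import torch
import torch.nn as nn

from alf.utils.distributed import data_distributed  # assumed import path

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Contract: the decorator reads this attribute; a negative value
        # means DDP is not activated and the method runs as-is.
        self._ddp_activated_rank = -1
        self._fc = nn.Linear(4, 2)

    @data_distributed
    def compute(self, x: torch.Tensor) -> torch.Tensor:
        # Runs plainly while _ddp_activated_rank < 0; automatically runs
        # under a DDP wrapper once it is set to a non-negative rank.
        return self._fc(x)
```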
* Implement `@data_distributed`: initial implementation of `data_distributed`
* Update the documentation according to the comments
* Address comments
Motivation
Part of the effort towards #1096.

Originally we only "DDP"-fied the `unroll` of `RLAlgorithm`, but later we found that such DDP wrapping can be made more general. To be more specific, we want a generally applicable, simple, and transparent utility for this. It is almost always the case that we want to wrap a method of a torch Module; therefore, a decorator for module methods is desirable.

Solution
Implemented the `@data_distributed` decorator that can be applied to Module (such as `RLAlgorithm`) methods. It converts the method to be DDP-capable: a DDP-capable method will automatically run under a DDP wrapper if the module it belongs to has DDP activated. A module is DDP-activated if its `_ddp_activated_rank` is set to a non-negative value. For `Algorithm` and its derived classes, this can be done by calling `activate_ddp(rank)`.

Also removed `_UnrollPerformer`, and used `@data_distributed` to wrap `unroll` instead. The code should be much cleaner.

NOTE: this still only enables DDP for on-policy training. Will follow up on off-policy training too, which is also based on this PR.
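A rough usage sketch of the activation flow described above (the worker function, its arguments, and the process-group setup are assumed boilerplate for the example; `activate_ddp` and the decorated `unroll` are per the description):

```python
# Hypothetical usage sketch; only activate_ddp and the decorated unroll are
# from this PR, the rest is standard PyTorch DDP boilerplate.
import torch.distributed as dist

def train_worker(rank: int, world_size: int, algorithm):
    # Standard process-group setup; assumes MASTER_ADDR/MASTER_PORT are set.
    dist.init_process_group('gloo', init_method='env://',
                            rank=rank, world_size=world_size)
    # Setting a non-negative rank makes the module DDP-activated, so every
    # @data_distributed method (e.g. unroll) now runs under a DDP wrapper
    # and synchronizes gradients across processes.
    algorithm.activate_ddp(rank)
    experience = algorithm.unroll(unroll_length=8)  # illustrative call
```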
Testing
Ran `ac_breakout` with both DDP turned on and off. Note that the training curves with and without DDP almost coincide.