Implement @data_distributed Decorator #1098
Conversation
PR updated based on the comments.
```python
def data_distributed(method):
    """This decorator makes a target method of a module capable of being data ...
```
It seems that this decorator can only apply to methods of `Algorithm`? If so, the docstring should clarify this.
It is not just for `Algorithm`, but actually for all `nn.Module`-derived classes. One of the prerequisites is that the module needs to have `self._ddp_activated_rank`, which is stated in the docstring as a contract.
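For illustration, a minimal sketch of that contract on an arbitrary `nn.Module` (the class name, method name, and import path below are assumptions for the example, not part of this PR):

```python
# Minimal sketch of the contract described above (illustrative names only).
import torch
import torch.nn as nn

from alf.utils.distributed import data_distributed  # assumed import path

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Contract: the decorator reads this attribute; a negative value
        # means DDP is not activated and the method runs as-is.
        self._ddp_activated_rank = -1
        self._fc = nn.Linear(4, 2)

    @data_distributed
    def compute(self, x: torch.Tensor) -> torch.Tensor:
        # Runs plainly while _ddp_activated_rank < 0; automatically runs
        # under a DDP wrapper once it is set to a non-negative rank.
        return self._fc(x)
```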
* Implement `@data_distributed`: initial implementation of `data_distributed`
* Update the documentation according to the comments
* Address comments
Motivation
Part of the effort towards #1096.

Originally we only "DDP"-fied the `unroll` of `RLAlgorithm`, but later we found that such DDP wrapping can be made more general. To be more specific, we want a generally applicable, simple, and transparent utility for this. It is almost always the case that we want to wrap a method of a torch Module; therefore, a decorator for module methods is desirable.

Solution
Implemented the `@data_distributed` decorator that can be applied to Module (such as `RLAlgorithm`) methods. It converts the method to be DDP-capable: a DDP-capable method will automatically run under a DDP wrapper if the module it belongs to has DDP activated. A module is DDP-activated if its `_ddp_activated_rank` is set to a non-negative value. For `Algorithm` and its derived classes, this can be done by calling `activate_ddp(rank)`.

Also removed `_UnrollPerformer`, and used `@data_distributed` to wrap `unroll` instead. The code should be much cleaner.

NOTE: this still only enables DDP for on-policy training. Will follow up on off-policy training too, which is also based on this PR.
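A rough usage sketch of the activation flow described above (the worker function, its arguments, and the process-group setup are assumed boilerplate for the example; `activate_ddp` and the decorated `unroll` are per the description):

```python
# Hypothetical usage sketch; only activate_ddp and the decorated unroll are
# from this PR, the rest is standard PyTorch DDP boilerplate.
import torch.distributed as dist

def train_worker(rank: int, world_size: int, algorithm):
    # Standard process-group setup; assumes MASTER_ADDR/MASTER_PORT are set.
    dist.init_process_group('gloo', init_method='env://',
                            rank=rank, world_size=world_size)
    # Setting a non-negative rank makes the module DDP-activated, so every
    # @data_distributed method (e.g. unroll) now runs under a DDP wrapper
    # and synchronizes gradients across processes.
    algorithm.activate_ddp(rank)
    experience = algorithm.unroll(unroll_length=8)  # illustrative call
```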
Testing
Ran `ac_breakout` with both DDP turned on and off. Note that the training curves with and without DDP almost coincide.