Remove the HivemindStrategy #16407

Merged
merged 1 commit into from
Jan 18, 2023

carmocca
Contributor

@carmocca carmocca commented Jan 17, 2023

What does this PR do?

Removes the HivemindStrategy()

The code for this strategy is conveniently self-contained. We will move the implementation to a separate repository (TBD), where support for newer PL versions will be tested.

Related Lightning-Universe/lightning-Hivemind#15 (to be transferred)

Does your PR introduce any breaking changes? If yes, please list them.

Removes the HivemindStrategy()
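
For downstream code that imported the strategy from `pytorch_lightning.strategies`, the breaking change above could be absorbed with a guarded lookup. The following is only a minimal sketch: `resolve_first` is a hypothetical helper, and the second module path, `lightning_hivemind.strategy`, is an assumption, since the destination repository is still TBD in this PR.

```python
import importlib
from typing import Any, Iterable, Optional, Tuple


def resolve_first(candidates: Iterable[Tuple[str, str]]) -> Optional[Any]:
    """Return the first importable attribute from (module, attribute) pairs, or None."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # the package (or submodule) is not installed
        found = getattr(module, attr_name, None)
        if found is not None:
            return found
    return None


CANDIDATES = [
    # Location before this PR (the strategy lived in src/pytorch_lightning/strategies/).
    ("pytorch_lightning.strategies", "HivemindStrategy"),
    # ASSUMPTION: hypothetical import path for the future external package.
    ("lightning_hivemind.strategy", "HivemindStrategy"),
]

if __name__ == "__main__":
    strategy_cls = resolve_first(CANDIDATES)
    if strategy_cls is None:
        print("HivemindStrategy is not available in this environment")
    else:
        print(f"Using {strategy_cls.__module__}.{strategy_cls.__name__}")
```

The lookup order prefers the in-tree location, so environments pinned to an older PL release keep working unchanged.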

cc @justusschock @awaelchli @Borda

@carmocca carmocca added the refactor, breaking change, strategy: hivemind (external), and pl labels Jan 17, 2023
@carmocca carmocca self-assigned this Jan 17, 2023
@carmocca carmocca marked this pull request as ready for review January 17, 2023 17:46
@github-actions
Contributor

github-actions bot commented Jan 17, 2023

⛈️ Required checks status: Has failure 🔴

Warning
This job will need to be re-run to merge your PR. If you do not have write access to the repository, you can ask Lightning-AI/lai-frameworks to re-run it. If you push a new commit, all of CI will re-trigger.

Groups summary

🟢 pytorch_lightning: Tests workflow
| Check ID | Status |
| --- | --- |
| pl-cpu (macOS-11, pytorch, 3.8, 1.11) | success |
| pl-cpu (macOS-11, pytorch, 3.9, 1.12) | success |
| pl-cpu (macOS-11, pytorch, 3.10, 1.13) | success |
| pl-cpu (macOS-11, pytorch, 3.8, 1.10, oldest) | success |
| pl-cpu (ubuntu-20.04, pytorch, 3.8, 1.10) | success |
| pl-cpu (ubuntu-20.04, pytorch, 3.9, 1.11) | success |
| pl-cpu (ubuntu-20.04, pytorch, 3.10, 1.12) | success |
| pl-cpu (ubuntu-20.04, pytorch, 3.10, 1.13) | success |
| pl-cpu (ubuntu-20.04, pytorch, 3.7, 1.10, oldest) | success |
| pl-cpu (windows-2022, pytorch, 3.9, 1.11) | success |
| pl-cpu (windows-2022, pytorch, 3.10, 1.12) | success |
| pl-cpu (windows-2022, pytorch, 3.10, 1.13) | success |
| pl-cpu (windows-2022, pytorch, 3.7, 1.10, oldest) | success |
| pl-cpu (slow, macOS-11, pytorch, 3.7, 1.11) | success |
| pl-cpu (slow, ubuntu-20.04, pytorch, 3.7, 1.11) | success |
| pl-cpu (slow, windows-2022, pytorch, 3.7, 1.11) | success |
| pl-cpu (macOS-11, lightning, 3.8, 1.13) | success |
| pl-cpu (ubuntu-20.04, lightning, 3.8, 1.13) | success |
| pl-cpu (windows-2022, lightning, 3.8, 1.13) | success |

These checks are required after the changes to requirements/pytorch/strategies.txt, src/pytorch_lightning/strategies/__init__.py, src/pytorch_lightning/strategies/hivemind.py, src/pytorch_lightning/utilities/__init__.py, src/pytorch_lightning/utilities/imports.py, tests/tests_pytorch/helpers/runif.py, tests/tests_pytorch/strategies/test_hivemind.py.

🟢 pytorch_lightning: Azure GPU
| Check ID | Status |
| --- | --- |
| pytorch-lightning (GPUs) | success |

These checks are required after the changes to requirements/pytorch/strategies.txt, src/pytorch_lightning/strategies/__init__.py, src/pytorch_lightning/strategies/hivemind.py, src/pytorch_lightning/utilities/__init__.py, src/pytorch_lightning/utilities/imports.py, tests/tests_pytorch/helpers/runif.py, tests/tests_pytorch/strategies/test_hivemind.py.

🟢 pytorch_lightning: Benchmarks
| Check ID | Status |
| --- | --- |
| pytorch-lightning.Benchmark | success |

These checks are required after the changes to requirements/pytorch/strategies.txt.

🟢 pytorch_lightning: Azure HPU
| Check ID | Status |
| --- | --- |
| pytorch-lightning (HPUs) | success |

These checks are required after the changes to requirements/pytorch/strategies.txt, src/pytorch_lightning/strategies/__init__.py, src/pytorch_lightning/strategies/hivemind.py, src/pytorch_lightning/utilities/__init__.py, src/pytorch_lightning/utilities/imports.py, tests/tests_pytorch/helpers/runif.py, tests/tests_pytorch/strategies/test_hivemind.py.

🟢 pytorch_lightning: Azure IPU
| Check ID | Status |
| --- | --- |
| pytorch-lightning (IPUs) | success |

These checks are required after the changes to requirements/pytorch/strategies.txt, src/pytorch_lightning/strategies/__init__.py, src/pytorch_lightning/strategies/hivemind.py, src/pytorch_lightning/utilities/__init__.py, src/pytorch_lightning/utilities/imports.py, tests/tests_pytorch/helpers/runif.py, tests/tests_pytorch/strategies/test_hivemind.py.

🟢 pytorch_lightning: Docs
| Check ID | Status |
| --- | --- |
| make-doctest (pytorch) | success |
| make-html (pytorch) | success |

These checks are required after the changes to src/pytorch_lightning/strategies/__init__.py, src/pytorch_lightning/strategies/hivemind.py, src/pytorch_lightning/utilities/__init__.py, src/pytorch_lightning/utilities/imports.py, docs/source-pytorch/api_references.rst, docs/source-pytorch/common_usecases.rst, docs/source-pytorch/extensions/strategy.rst, docs/source-pytorch/index.rst, docs/source-pytorch/strategies/hivemind.rst, docs/source-pytorch/strategies/hivemind_basic.rst, docs/source-pytorch/strategies/hivemind_expert.rst, docs/source-pytorch/strategies/hivemind_intermediate.rst, requirements/pytorch/strategies.txt.

🔴 pytorch_lightning: Docker
| Check ID | Status |
| --- | --- |
| build-cuda (3.9, 1.10, 11.3.1) | success |
| build-cuda (3.9, 1.11, 11.3.1) | success |
| build-cuda (3.9, 1.12, 11.6.1) | success |
| build-cuda (3.9, 1.13, 11.7.1) | success |
| build-hpu (1.5.0, 1.11.0) | success |
| build-ipu (3.9, 1.10) | success |
| build-NGC | skipped |
| build-pl (3.9, 1.10, 11.3.1) | success |
| build-pl (3.9, 1.11, 11.3.1) | success |
| build-pl (3.9, 1.12, 11.6.1) | success |
| build-pl (3.9, 1.13, 11.7.1) | success |
| build-xla (3.7, 1.12) | success |

These checks are required after the changes to requirements/pytorch/strategies.txt.

🟢 mypy
| Check ID | Status |
| --- | --- |
| mypy | success |

These checks are required after the changes to requirements/pytorch/strategies.txt, src/pytorch_lightning/strategies/__init__.py, src/pytorch_lightning/strategies/hivemind.py, src/pytorch_lightning/utilities/__init__.py, src/pytorch_lightning/utilities/imports.py.

🟢 install
| Check ID | Status |
| --- | --- |
| install-pkg (ubuntu-22.04, app, 3.7) | success |
| install-pkg (ubuntu-22.04, app, 3.10) | success |
| install-pkg (ubuntu-22.04, fabric, 3.7) | success |
| install-pkg (ubuntu-22.04, fabric, 3.10) | success |
| install-pkg (ubuntu-22.04, pytorch, 3.7) | success |
| install-pkg (ubuntu-22.04, pytorch, 3.10) | success |
| install-pkg (ubuntu-22.04, lightning, 3.7) | success |
| install-pkg (ubuntu-22.04, lightning, 3.10) | success |
| install-pkg (ubuntu-22.04, notset, 3.7) | success |
| install-pkg (ubuntu-22.04, notset, 3.10) | success |
| install-pkg (macOS-12, app, 3.7) | success |
| install-pkg (macOS-12, app, 3.10) | success |
| install-pkg (macOS-12, fabric, 3.7) | success |
| install-pkg (macOS-12, fabric, 3.10) | success |
| install-pkg (macOS-12, pytorch, 3.7) | success |
| install-pkg (macOS-12, pytorch, 3.10) | success |
| install-pkg (macOS-12, lightning, 3.7) | success |
| install-pkg (macOS-12, lightning, 3.10) | success |
| install-pkg (macOS-12, notset, 3.7) | success |
| install-pkg (macOS-12, notset, 3.10) | success |
| install-pkg (windows-2022, app, 3.7) | success |
| install-pkg (windows-2022, app, 3.10) | success |
| install-pkg (windows-2022, fabric, 3.7) | success |
| install-pkg (windows-2022, fabric, 3.10) | success |
| install-pkg (windows-2022, pytorch, 3.7) | success |
| install-pkg (windows-2022, pytorch, 3.10) | success |
| install-pkg (windows-2022, lightning, 3.7) | success |
| install-pkg (windows-2022, lightning, 3.10) | success |
| install-pkg (windows-2022, notset, 3.7) | success |
| install-pkg (windows-2022, notset, 3.10) | success |

These checks are required after the changes to src/pytorch_lightning/strategies/__init__.py, src/pytorch_lightning/strategies/hivemind.py, src/pytorch_lightning/utilities/__init__.py, src/pytorch_lightning/utilities/imports.py, requirements/pytorch/strategies.txt.


Thank you for your contribution! 💜

Note
This comment is automatically generated and updates for 60 minutes every 180 seconds. If you have any other questions, contact carmocca for help.

Member

@justusschock justusschock left a comment

I vote for moving it into a separate repository, as it is the only thing we have that supports training on heterogeneous clusters of variable sizes (spot instances).

@mergify mergify bot added the ready label Jan 18, 2023
@awaelchli
Contributor

In addition to @justusschock's suggestion, this could also make a great tutorial on how to build a custom strategy that integrates a library like this, perhaps in the context of Fabric.

@Borda
Member

Borda commented Jan 18, 2023

> I vote for moving it into a separate repository, as it is the only thing we have that supports training on heterogeneous clusters of variable sizes (spot instances).

I think it should be preserved in a separate repo for the time being.

@carmocca
Contributor Author

Marking as a draft until there's a decision

@carmocca carmocca marked this pull request as draft January 18, 2023 16:44
@mergify mergify bot removed the ready label Jan 18, 2023
@lantiga
Collaborator

lantiga commented Jan 18, 2023

I also vote for moving it out (and having it in ecosystem-ci), and I second @awaelchli's suggestion about demonstrating how to maintain a strategy for others who will want or need to do it.

@carmocca carmocca marked this pull request as ready for review January 18, 2023 17:24
@mergify mergify bot added the ready label Jan 18, 2023
@carmocca carmocca enabled auto-merge (squash) January 18, 2023 17:53
@lexierule lexierule merged commit 7a99ae8 into lite/debug Jan 18, 2023
@lexierule lexierule deleted the lite/debug-hivemind branch January 18, 2023 18:10
@carmocca carmocca added this to the 2.0 milestone Jan 19, 2023
carmocca added a commit that referenced this pull request Jan 19, 2023
Remove the collaborative strategy
carmocca added a commit that referenced this pull request Jan 19, 2023
Remove the collaborative strategy
carmocca added a commit that referenced this pull request Jan 19, 2023
Remove the collaborative strategy
lantiga pushed a commit that referenced this pull request Jan 19, 2023
Remove the collaborative strategy
Labels
breaking change, pl, ready, refactor, strategy: hivemind (external)

6 participants