- Release Compatibility Matrix
- General Overview
- Cutting a release branch preparations
- Cutting release branches
- Drafting RCs (Release Candidates) for PyTorch and domain libraries
- Promoting RCs to Stable
- Additional Steps to prepare for release day
- Patch Releases
- Hardware / Software Support in Binary Build Matrix
- Special Topics
Following is the Release Compatibility Matrix for PyTorch releases:
| PyTorch version | Python | Stable CUDA | Experimental CUDA |
| --- | --- | --- | --- |
| 2.0 | >=3.8, <=3.11 | CUDA 11.7, CUDNN 8.5.0.96 | CUDA 11.8, CUDNN 8.7.0.84 |
| 1.13 | >=3.7, <=3.10 | CUDA 11.6, CUDNN 8.3.2.44 | CUDA 11.7, CUDNN 8.5.0.96 |
| 1.12 | >=3.7, <=3.10 | CUDA 11.3, CUDNN 8.3.2.44 | CUDA 11.6, CUDNN 8.3.2.44 |
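As a rough illustration of how the matrix can be consulted programmatically, the sketch below transcribes the three rows into a dictionary and checks a Python version against a row. The names `COMPATIBILITY` and `is_python_supported` are hypothetical helpers for this example and are not part of any PyTorch tooling.

```python
# Data transcribed from the compatibility matrix; Python bounds are inclusive
# (min, max) tuples so that 3.10 compares as newer than 3.8.
COMPATIBILITY = {
    "2.0": {"python": ((3, 8), (3, 11)), "stable_cuda": "11.7", "experimental_cuda": "11.8"},
    "1.13": {"python": ((3, 7), (3, 10)), "stable_cuda": "11.6", "experimental_cuda": "11.7"},
    "1.12": {"python": ((3, 7), (3, 10)), "stable_cuda": "11.3", "experimental_cuda": "11.6"},
}


def is_python_supported(pytorch_version: str, python_version: str) -> bool:
    """Return True if python_version (e.g. "3.10") is inside the matrix row."""
    low, high = COMPATIBILITY[pytorch_version]["python"]
    major, minor = (int(part) for part in python_version.split(".")[:2])
    return low <= (major, minor) <= high


print(is_python_supported("2.0", "3.11"))
print(is_python_supported("2.0", "3.7"))
```

Versions are compared as integer tuples rather than strings or floats, since "3.10" would otherwise sort before "3.8".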
Releasing a new version of PyTorch generally entails 4 major steps:
- Cutting a release branch preparations
- Cutting a release branch and making release branch specific changes
- Drafting RCs (Release Candidates), and merging cherry picks
- Promoting RCs to stable and performing release day tasks
- Q: What is a release branch cut?
- A: When the bulk of the tracked features have been merged into the main branch, the primary release engineer starts the release process by cutting the release branch, that is, by creating a new git branch based off of the current main development branch of PyTorch. This allows PyTorch development flow on main to continue uninterrupted, while the release engineering team focuses on stabilizing the release branch in order to release a series of release candidates (RC). The activities in the release branch include both regression and performance testing as well as polishing new features and fixing release-specific bugs. In general, new features are not added to the release branch after it has been created.
- Q: What is a cherry pick?
- A: A cherry pick is the process of propagating commits from the main branch into the release branch, using git's built-in cherry-pick feature. These commits are typically limited to small fixes or documentation updates to ensure that the release engineering team has sufficient time to complete a thorough round of testing on the release branch. To nominate a fix for cherry-picking, a separate pull request must be created against the respective release branch and then mentioned in the Release Tracker issue (example: pytorch#94937) following the template from the issue description. The comment nominating a particular cherry pick for inclusion in the release should include the committed PR against the main branch, the newly created cherry-pick PR, and the acceptance criteria explaining why the cherry pick is needed in the first place.
The following requirements need to be met prior to the final RC cut:
- Resolve all outstanding issues in the milestones (for example, 1.11.0) before the first RC cut is completed. After the RC cut is completed, the following script should be executed from the builder repo in order to validate the presence of the fixes in the release branch:
python github_analyze.py --repo-path ~/local/pytorch --remote upstream --branch release/1.11 --milestone-id 26 --missing-in-branch
- Validate that all new workflows have been created in PyTorch and the domain libraries included in the release. Validate them against all dimensions of the release matrix, including operating systems (Linux, macOS, Windows), Python versions, CPU architectures (x86 and arm) and accelerator versions (CUDA, ROCm).
- All the nightly jobs for PyTorch and domain libraries should be green. Validate this using the following HUD links:
Release branches are typically cut from the branch viable/strict so as to ensure that tests are passing on the release branch.
There's a convenience script to create release branches from the current viable/strict. Perform the following actions:
- Perform a fresh clone of the pytorch repo using
git clone git@github.com:pytorch/pytorch.git
- Execute the following command from the PyTorch repository root folder:
DRY_RUN=disabled scripts/release/cut-release-branch.sh
This script should create 2 branches:
release/{MAJOR}.{MINOR}
orig/release/{MAJOR}.{MINOR}
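To make the naming pattern concrete, here is a minimal sketch that derives both branch names from a version string. The helper `release_branch_names` is hypothetical and only mirrors the naming shown above; it is not how cut-release-branch.sh is implemented.

```python
def release_branch_names(version: str) -> tuple:
    """Derive the two expected branch names from a version such as "1.11.0"."""
    # Only MAJOR and MINOR take part in the branch name; PATCH is ignored.
    major, minor = version.split(".")[:2]
    branch = f"release/{major}.{minor}"
    return branch, f"orig/{branch}"


print(release_branch_names("1.11.0"))
```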
Note: Release branches for individual domain libraries should be created after the first release candidate build of PyTorch is available in staging channels (which happens about a week after the PyTorch release branch has been created). This is absolutely required to allow sufficient testing time for each domain library. The domain library branch cut is performed by the Domain Library POC.
The builder branch cut should be performed at the same time as the PyTorch core branch cut. The convenience script can also be used for domain libraries as well as pytorch/builder.
NOTE: RELEASE_VERSION only needs to be specified if version.txt is not available in the root directory.
DRY_RUN=disabled GIT_BRANCH_TO_CUT_FROM=main RELEASE_VERSION=1.11 scripts/release/cut-release-branch.sh
These are examples of changes that should be made to release branches so that CI / tooling can function normally on them:
- Update backwards compatibility tests to use RC binaries instead of nightlies
- Example: pytorch#77983 and pytorch#77986
- Release branches should also be created in the pytorch/xla and pytorch/builder repos and pinned in pytorch/pytorch
- Example: pytorch#86290 and pytorch#90506
- Update the branch used in composite actions from trunk to release. For example, this can be done by running:
for i in .github/workflows/*.yml; do sed -i -e "s#@master#@release/2.0#" "$i"; done
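The effect of the sed one-liner above can be sketched in Python on an in-memory string, which is handy for previewing the change before touching files. The function name `pin_actions_to_release` and the action names in the sample are hypothetical. Like the sed expression, which has no `g` flag, it replaces only the first `@master` on each line.

```python
def pin_actions_to_release(workflow_text: str, release_branch: str) -> str:
    """Rewrite the first '@master' ref on every line to '@<release_branch>'."""
    lines = workflow_text.split("\n")
    # count=1 mirrors sed's default of substituting one match per line.
    pinned = [line.replace("@master", f"@{release_branch}", 1) for line in lines]
    return "\n".join(pinned)


# Hypothetical workflow fragment; the action names are placeholders.
sample = "uses: example-org/example-action@master\nuses: example-org/other-action@v3"
print(pin_actions_to_release(sample, "release/2.0"))
```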
These are examples of changes that should be made to the default branch after a release branch is cut:
- Nightly versions should be updated in all version files to the next MINOR release (e.g. 0.9.0 -> 0.10.0) in the default branch:
- Example: pytorch#77984
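The 0.9.0 -> 0.10.0 example is worth spelling out, because the MINOR component is incremented as an integer rather than as a decimal fraction. The sketch below assumes a plain MAJOR.MINOR.PATCH string with no suffix; `next_minor` is a hypothetical helper, not an existing script.

```python
def next_minor(version: str) -> str:
    """Return the next MINOR release for a plain MAJOR.MINOR.PATCH version."""
    major, minor, _patch = (int(part) for part in version.split("."))
    # MINOR is bumped as an integer and PATCH resets to 0.
    return f"{major}.{minor + 1}.0"


print(next_minor("0.9.0"))
```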
Domain library branch cut is done a week after the branch cut for pytorch/pytorch. The branch cut is performed by the Domain Library POC.
After the branch cut is performed, a PyTorch Dev Infra member should be informed of the branch cut, and a domain-library-specific change is required before drafting an RC for this domain library.
Follow these example PRs, which update the version and set the RC upload channel:
- torchvision: pytorch/vision#5400
- torchaudio: pytorch/audio#2210
To draft RCs, a user with the necessary permissions can push a git tag to the main pytorch/pytorch git repository. Please note: exactly the same process is used for each of the domain libraries.
The git tag for a release candidate must use the following format:
v{MAJOR}.{MINOR}.{PATCH}-rc{RC_NUMBER}
An example of this would look like:
v1.12.0-rc1
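Before pushing, a tag name can be checked against this format. The following is a minimal sketch using a regular expression; `RC_TAG` and `parse_rc_tag` are hypothetical names, and the check covers only the pattern stated above, not any additional rules the build tooling may enforce.

```python
import re

# v{MAJOR}.{MINOR}.{PATCH}-rc{RC_NUMBER}, all components numeric.
RC_TAG = re.compile(r"v(\d+)\.(\d+)\.(\d+)-rc(\d+)")


def parse_rc_tag(tag: str):
    """Return the tag's components as a dict, or None if the format is wrong."""
    match = RC_TAG.fullmatch(tag)
    if match is None:
        return None
    major, minor, patch, rc = (int(group) for group in match.groups())
    return {"major": major, "minor": minor, "patch": patch, "rc": rc}


print(parse_rc_tag("v1.12.0-rc1"))
print(parse_rc_tag("1.12.0-rc1"))
```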
You can use the following commands to create the tag from the pytorch core repo (not a fork):
- Checkout and validate the repo history before tagging
git checkout release/1.12
git log --oneline
- Create the tag and push it to GitHub (this will trigger the binary release build)
git tag -f v1.12.0-rc2
git push origin v1.12.0-rc2
Pushing a release candidate should trigger the binary_builds workflow within CircleCI using pytorch/pytorch-probot's trigger-circleci-workflows functionality.
This trigger functionality is configured here: pytorch-circleci-labels.yml
To view the state of the release build, please navigate to HUD and make sure all binary builds are successful.
Release candidates are currently stored in the following places:
- Wheels: https://download.pytorch.org/whl/test/
- Conda: https://anaconda.org/pytorch-test
- Libtorch: https://download.pytorch.org/libtorch/test
Backups are stored in a non-public S3 bucket at s3://pytorch-backup
The release jobs for PyTorch and domain libraries should be green. Validate this using the following HUD links:
Validate that the documentation build has completed and generated an entry corresponding to the release in the docs folder of the pytorch.github.io repository.
Typically, within a release cycle fixes are necessary for regressions, test fixes, etc.
For fixes that are to go into a release after the release branch has been cut, we typically use a cherry pick tracker.
An example of this would look like:
Please also make sure to add milestone target to the PR/issue, especially if it needs to be considered for inclusion into the dot release.
NOTE: The cherry pick process is not an invitation to add new features; it is mainly there to fix regressions.
Promotion of RCs to stable is done with this script:
pytorch/builder:release/promote.sh
Users of that script should take care to update the versions necessary for the specific packages they are attempting to promote.
Promotion should occur in two steps:
- Promote S3 artifacts (wheels, libtorch) and Conda packages
- Promote S3 wheels to PyPI
NOTE: The promotion of wheels to PyPI can only be done once, so take caution when attempting to promote wheels to PyPI (see pypi/warehouse#726 for a discussion on potential draft releases within PyPI).
The following should be prepared for the release day:
The release matrix for the Get Started page needs to be modified. See the following PR as a reference.
After modifying published_versions.json, regenerate the quick-start-module.js file by running the following command:
python3 scripts/gen_quick_start_module.py >assets/quick-start-module.js
Please note: This PR needs to be merged on the release day, so it should be absolutely free of any failures. To test this PR, open another test PR that points to the release candidate location described above under Release Candidate Storage.
This is normally done right after the release is completed. A Google Colab issue needs to be created; see the following PR.
A patch release is a maintenance release of PyTorch that includes fixes for regressions found in a previous minor release. Patch releases typically will bump the patch version from semver (i.e. [major].[minor].[patch]).
Patch releases should be considered if a regression meets the following criteria:
- Does the regression break core functionality (stable / beta features) including functionality in first party domain libraries?
- First party domain libraries:
- Is there not a viable workaround?
- Can the regression be solved simply, or is it insurmountable?
NOTE: Patch releases should only be considered when functionality is broken; documentation does not typically fall within this category.
Main POC: Patch Release Managers, Triage Reviewers
Patch releases should follow these high-level phases. This process starts immediately after the previous release has completed. The patch release process takes around 6-7 weeks to complete.
- Triage is a process in which issues are identified, graded, compared to the Patch Release Criteria and added to the Patch Release milestone. This process normally takes 2-3 weeks after the release completion.
- Patch Release: Go/No Go meeting between PyTorch Releng, PyTorch Core and Project Managers, where potential issues in milestones that could trigger a release are reviewed and the following decisions are made:
- Should the new patch release be created?
- Timeline execution for the patch release
- Cherry picking phase: starts after the decision is made to create a patch release. At this point a new release tracker for the patch release is created, and an announcement is made on official channels (example announcement). The authors of the fixes to regressions will be asked to create their own cherry picks. This process normally takes 2 weeks.
- Building Binaries, Promotion to Stable and testing: after all cherry picks have been merged, Release Managers trigger a new build and produce a new release candidate. An announcement is made on the official channel about the RC availability at this point. This process normally takes 2 weeks.
- General Availability
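As a quick sanity check, the phase durations listed above can be added up and compared with the 6-7 week estimate. The sketch below assumes that the Go/No Go meeting and General Availability add no extra weeks, since no duration is stated for them; `PHASES` and `total_weeks` are illustrative names.

```python
# (phase, minimum weeks, maximum weeks) taken from the phase list.
# Phases with no stated duration are assumed to take 0 weeks.
PHASES = [
    ("Triage", 2, 3),
    ("Go/No Go meeting", 0, 0),
    ("Cherry picking", 2, 2),
    ("Building binaries, promotion to stable and testing", 2, 2),
    ("General Availability", 0, 0),
]


def total_weeks(phases):
    """Return the (minimum, maximum) total duration in weeks."""
    shortest = sum(low for _name, low, _high in phases)
    longest = sum(high for _name, _low, high in phases)
    return shortest, longest


print(total_weeks(PHASES))
```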
Main POC: Triage Reviewers
- Tag issues / pull requests that are candidates for a potential patch release with triage review
- Triage reviewers will then check if the regression / fix identified fits within the above-mentioned Patch Release Criteria
- Triage reviewers will then add the issue / pull request to the related milestone (e.g. 1.9.1) if the regression is found to be within the Patch Release Criteria
For patch releases, an issue tracker needs to be created. For a patch release, we require all cherry-pick changes to have links to either a high-priority GitHub issue or a CI failure from the previous RC. An example of this would look like:
Only the following issues are accepted:
- Fixes to regressions against previous major version (e.g. regressions introduced in 1.13.0 from 1.12.0 are pickable for 1.13.1)
- Low risk critical fixes for: silent correctness, backwards compatibility, crashes, deadlocks, (large) memory leaks
- Fixes to new features being introduced in this release
- Documentation improvements
- Release branch specific changes (e.g. blocking CI fixes, changing version identifiers)
Main POC: Patch Release Managers
- After regressions / fixes have been triaged, Patch Release Managers will work together to build and announce a schedule for the patch release
- NOTE: Ideally this should be ~2-3 weeks after a regression has been identified to allow other regressions to be identified
- Patch Release Managers will work with the authors of the regressions / fixes to cherry pick their change into the related release branch (e.g. release/1.9 for 1.9.1)
- NOTE: Patch release managers should notify authors of the regressions to post cherry picks for their changes. It is up to the authors of the regressions to post a cherry pick. If a cherry pick is not posted, the issue will not be included in the release.
- If cherry picking deadline is missed by cherry pick author, patch release managers will not accept any requests after the fact.
Main POC: Patch Release managers
- Patch Release Managers will follow the process of Drafting RCs (Release Candidates)
- Patch Release Managers will follow the process of Promoting RCs to Stable
PyTorch has a support matrix across a couple of different axes. This section should be used as a decision-making framework to drive hardware / software support decisions.
For versions of Python that we support we follow the NEP 29 policy, which was originally drafted by numpy.
- All minor versions of Python released in the 42 months prior to the project, and at minimum the two latest minor versions.
- All minor versions of numpy released in the 24 months prior to the project, and at minimum the last three minor versions.
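The window rule above can be sketched as a small date calculation: keep every minor version released inside the window, plus a guaranteed minimum number of the newest versions. Everything in the example is illustrative: `months_before` and `supported_versions` are hypothetical helpers, and the release dates (including the project release date of 2023-03-15) are assumptions to be checked against the official release calendars.

```python
from datetime import date


def months_before(day: date, months: int) -> date:
    """Step back a whole number of calendar months (day clamped to 28)."""
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(day.day, 28))


def supported_versions(release_dates, project_release, window_months, min_latest):
    """Versions released inside the window, plus the newest min_latest ones."""
    cutoff = months_before(project_release, window_months)
    ordered = sorted(release_dates, key=release_dates.get)
    always_kept = set(ordered[-min_latest:])
    return [v for v in ordered if release_dates[v] >= cutoff or v in always_kept]


# Assumed release dates for illustration only.
PYTHON_RELEASES = {
    "3.7": date(2018, 6, 27),
    "3.8": date(2019, 10, 14),
    "3.9": date(2020, 10, 5),
    "3.10": date(2021, 10, 4),
    "3.11": date(2022, 10, 24),
}

print(supported_versions(PYTHON_RELEASES, date(2023, 3, 15), 42, 2))
```

With these assumed dates, the 42-month cutoff falls in September 2019, so 3.7 drops out while 3.8 through 3.11 remain.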
For accelerator software like CUDA and ROCm we will typically use the following criteria:
- Support latest 2 minor versions
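A minimal sketch of the "latest 2 minor versions" rule follows; the version list is a made-up input and `latest_minor_versions` is a hypothetical helper. Versions are sorted numerically so that, for example, 11.10 would rank above 11.8.

```python
def latest_minor_versions(versions, count=2):
    """Return the newest `count` versions, ordered oldest to newest."""
    ordered = sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))
    return ordered[-count:]


# Illustrative input, not a statement of what is currently supported.
print(latest_minor_versions(["11.6", "11.8", "11.7"]))
```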
In some instances support for a particular version of software will continue if a need is found. For example, our CUDA 11 binaries do not currently meet the size restrictions for publishing on PyPI so the default version that is published to PyPI is CUDA 10.2.
These special support cases will be handled on a case by case basis and support may be continued if current PyTorch maintainers feel as though there may still be a need to support these particular versions of software.
In the event a submodule cannot be fast forwarded and a patch must be applied, we can take two different approaches:
- (preferred) Fork the repository in question under the pytorch GitHub organization, apply the patches we need there, and then point our submodule at our fork.
- Get the dependency's maintainers to support a release branch for us
Editing submodule remotes can be easily done with: (running from the root of the git repository)
git config --file=.gitmodules -e
An example of this process can be found here: