Installation
------------

.. installation

Pre-requisites
^^^^^^^^^^^^^^^^^^^^
* Linux x86_64
* CUDA 11.8+ for Hopper and CUDA 12.1+ for Ada
* NVIDIA Driver supporting CUDA 11.8 or later
* cuDNN 8.1 or later
* For fused attention: CUDA 12.1 or later, an NVIDIA Driver supporting CUDA 12.1 or later, and cuDNN 8.9 or later.
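
To confirm the driver and CUDA toolkit versions on a machine, a quick check such as the following can be used (illustrative only; any equivalent method works):

.. code-block:: bash

    # Reports the driver version and the highest CUDA version it supports
    nvidia-smi
    # Reports the installed CUDA toolkit version, if nvcc is on the PATH
    nvcc --version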

Docker
^^^^^^^^^^^^^^^^^^^^

The quickest way to get started with Transformer Engine is by using Docker images from the
`NVIDIA GPU Cloud (NGC) Catalog <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch>`_. For example, to use the NGC PyTorch container interactively:

.. code-block:: bash

    docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.10-py3


Where 23.10 is the container version; versions are named by release date, so 23.10 corresponds to the October 2023 release.

pip
^^^^^^^^^^^^^^^^^^^^
To install the latest stable version of Transformer Engine, use ``pip``.
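
A minimal sketch of the command, assuming installation from the ``stable`` branch of the GitHub repository (the installation guide linked below has the authoritative form):

.. code-block:: bash

    # Assumed source: the stable branch of the official GitHub repository
    pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable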

This will automatically detect if any supported deep learning frameworks are installed and build Transformer Engine support for them. To explicitly specify frameworks, set the environment variable NVTE_FRAMEWORK to a comma-separated list (e.g. NVTE_FRAMEWORK=jax,pytorch).
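
For example, a sketch combining the variable with the install command above (same assumption about the package source):

.. code-block:: bash

    # Build support only for JAX and PyTorch, even if other frameworks are present
    NVTE_FRAMEWORK=jax,pytorch pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable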

From source
^^^^^^^^^^^
`See the installation guide <https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/installation.html#installation-from-source>`_.
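
As a rough sketch of that flow (the exact steps and the need for submodules are assumptions here; defer to the guide):

.. code-block:: bash

    # Assumed flow: clone with submodules, then build and install with pip
    git clone --recursive https://github.com/NVIDIA/TransformerEngine.git
    cd TransformerEngine
    pip install .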

Compiling with FlashAttention-2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Transformer Engine release v0.11.0 adds support for FlashAttention-2 in PyTorch for improved performance.

It is a known issue that FlashAttention-2 compilation is resource-intensive and requires a large amount of RAM (see this `bug report <https://github.com/Dao-AILab/flash-attention/issues/358>`_), which may lead to out-of-memory errors during the installation of Transformer Engine. Try setting **MAX_JOBS=1** in the environment to work around the issue. If the errors persist, install a supported version of FlashAttention-1 (v1.0.6 to v1.0.9).
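
For instance, a sketch of a memory-constrained install (same assumed package source as above):

.. code-block:: bash

    # Cap parallel compilation jobs so FlashAttention-2 compilation fits in RAM
    MAX_JOBS=1 pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable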

Note that NGC PyTorch 23.08+ containers include FlashAttention-2.