Commit fb010bf

NeMo 2 Performance instructions
1 parent 0defd7f commit fb010bf

1 file changed

+65
-0
lines changed

@@ -0,0 +1,65 @@
# Performance

This document describes how to measure the performance of the NeMo 2.x framework.

* [NVIDIA NeMo Performance Summary](https://docs.nvidia.com/nemo-framework/user-guide/latest/performance/performance-summary.html)
* [NVIDIA NeMo Performance Scripts](https://github.com/NVIDIA/NeMo/tree/main/scripts/performance/llm)
* [NVIDIA NeMo Compatibility Matrix](https://docs.nvidia.com/nemo-framework/user-guide/latest/softwarecomponentversions.html)

### Create Conda Environment

```bash
conda create -yn nemo python=3.12
conda activate nemo
```

### Install NeMo

Make sure that the NeMo version is compatible with the Docker image according to the [compatibility matrix](https://docs.nvidia.com/nemo-framework/user-guide/latest/softwarecomponentversions.html).

```bash
git clone git@github.com:NVIDIA/NeMo.git
cd NeMo
git checkout v2.5.0rc0
pip install -e '.[all]'
```

Optionally, specify where to store the performance results:

```bash
export NEMORUN_HOME=/fsxl/.../nemo_run
```
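
A minimal sketch of preparing that directory before a run, assuming the launcher reads `NEMORUN_HOME` from the environment at start-up. The helper name `prepare_results_dir` is hypothetical and not part of NeMo; pass it whichever path you chose for the results.

```bash
# Hypothetical helper (not part of NeMo): make sure the results directory
# exists and is writable, then export it as NEMORUN_HOME.
prepare_results_dir() {
    local dir="$1"
    if [ -z "$dir" ]; then
        echo "usage: prepare_results_dir <directory>" >&2
        return 2
    fi
    # Create the directory (and any missing parents) if needed.
    mkdir -p "$dir" || return 1
    if [ ! -w "$dir" ]; then
        echo "error: $dir is not writable" >&2
        return 1
    fi
    export NEMORUN_HOME="$dir"
    echo "NEMORUN_HOME=$NEMORUN_HOME"
}
```

Call it in the same shell that launches the benchmark, so the exported variable is visible to the launcher.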

### Build Docker Image

The Dockerfile should start with `FROM nvcr.io/nvidia/nemo:YY.MM` and then install EFA. Make sure that the Docker image is compatible with the NeMo version according to the [compatibility matrix](https://docs.nvidia.com/nemo-framework/user-guide/latest/softwarecomponentversions.html).

Build the image, then import it into an enroot squashfs file:

```bash
docker build --progress=plain -t aws-nemo:latest -f Dockerfile .
enroot import -o ~/aws-nemo.sqsh dockerd://aws-nemo:latest
```

### Run Performance Test

To enable EFA, export `FI_PROVIDER=efa`. Setting `NCCL_DEBUG=INFO` is optional; it makes NCCL print its initialization details, which helps confirm the network transport in use:

```bash
export FI_PROVIDER=efa
export NCCL_DEBUG=INFO
```
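
One way to confirm that a run used EFA is to search the job log for the provider line printed when `NCCL_DEBUG=INFO` is set. The sketch below assumes that the AWS OFI NCCL plugin logs a line containing `Selected provider is efa` (the exact wording may differ between plugin versions) and that the job output was saved to a file. `log_uses_efa` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed if the given NCCL log mentions the EFA provider.
# Assumption: the AWS OFI NCCL plugin prints "Selected provider is efa"
# (matched case-insensitively); adjust the pattern for your plugin version.
log_uses_efa() {
    local logfile="$1"
    if [ ! -r "$logfile" ]; then
        echo "error: cannot read $logfile" >&2
        return 2
    fi
    # -q: no output, only the exit status; -i: ignore case.
    grep -qi "selected provider is efa" "$logfile"
}
```

For example, `log_uses_efa job.out && echo "EFA in use"`, where `job.out` stands for wherever your scheduler wrote the job output.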

## Recommended model configs

### NVIDIA H100 (also applicable to H200)

`NeMo/scripts/performance/recommended_model_configs/model_configs_h100.csv`

| Model     | #-GPUs | GBS | MBS | Sequence Length | TP | PP | CP | VP | EP | GA |
|-----------|--------|-----|-----|-----------------|----|----|----|----|----|----|
| LLAMA3-8B | 8      | 128 | 1   | 8192            | 1  | 1  | 2  | 1  | 1  | 32 |

GBS and MBS are the global and micro batch sizes; TP, PP, CP, VP, and EP are the tensor, pipeline, context, virtual pipeline, and expert parallel sizes; GA is the number of gradient accumulation steps.
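
The GA column follows from the other columns. Assuming the usual Megatron-style relationship, the data-parallel size is DP = #-GPUs / (TP × PP × CP), and the number of gradient accumulation steps is GA = GBS / (MBS × DP). The sketch below checks this for the LLAMA3-8B row; `grad_accum` is a hypothetical helper, not part of the NeMo scripts.

```bash
# Hypothetical helper: gradient accumulation steps implied by a config row.
# Arguments: num_gpus gbs mbs tp pp cp
grad_accum() {
    local num_gpus="$1" gbs="$2" mbs="$3" tp="$4" pp="$5" cp="$6"
    local model_parallel dp
    model_parallel=$(( tp * pp * cp ))
    if [ $(( num_gpus % model_parallel )) -ne 0 ]; then
        echo "error: #-GPUs must be divisible by TP*PP*CP" >&2
        return 1
    fi
    # Data-parallel size: GPUs left over after model parallelism.
    dp=$(( num_gpus / model_parallel ))
    if [ $(( gbs % (mbs * dp) )) -ne 0 ]; then
        echo "error: GBS must be divisible by MBS*DP" >&2
        return 1
    fi
    echo $(( gbs / (mbs * dp) ))
}

# LLAMA3-8B row: 8 GPUs, GBS 128, MBS 1, TP 1, PP 1, CP 2
grad_accum 8 128 1 1 1 2   # prints 32
```

With CP = 2 on 8 GPUs, DP is 4, so each optimizer step accumulates 128 / (1 × 4) = 32 micro-batches per data-parallel rank, matching the table.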
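
Run the LLAMA3-8B pretraining benchmark from the NeMo repository root:

```bash
# -i points at the squashfs image produced by the enroot import step.
python -m scripts.performance.llm.pretrain_llama3_8b \
    --account "$(whoami)" --partition p5en -i ~/aws-nemo.sqsh \
    --gpu h100 --num_gpus 8 -gb 128 -mb 1 -tp 1 -pp 1 -cp 2 -vp 1 -ep 1
```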
