**Test URL:** [DeepSpeedExamples/training/DeepSpeed-ZenFlow/finetuning](https://github.com/deepspeedai/DeepSpeedExamples/tree/master/training/DeepSpeed-ZenFlow/finetuning) (all following tests use the same URL)
@@ -69,7 +69,7 @@ From this data, DeepSpeed's core binding provides approximately a 15% performanc
The result clearly shows that the improved ZenFlow achieves a 2.59x speedup compared to ZeRO Offload without core binding, and a 2.24x speedup compared to ZeRO Offload with core binding.
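As a quick consistency check, the two speedups above can be divided to recover how much core binding alone helps ZeRO Offload. This is a minimal sketch: the function name `implied_binding_gain` is hypothetical, and it assumes both speedups were measured against the same ZenFlow run.

```python
def implied_binding_gain(speedup_vs_unbound: float, speedup_vs_bound: float) -> float:
    """Ratio of unbound to bound ZeRO Offload step time.

    If T_u / T_zenflow = speedup_vs_unbound and T_b / T_zenflow = speedup_vs_bound,
    then T_u / T_b is simply the quotient of the two speedups.
    """
    return speedup_vs_unbound / speedup_vs_bound


# Speedups quoted in the text above (assumed to share the same ZenFlow baseline).
gain = implied_binding_gain(2.59, 2.24)
print(f"Core binding alone: about {(gain - 1) * 100:.1f}% faster")
```

The quotient comes out close to 1.16, which is in line with the roughly 15% figure attributed to core binding.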
@@ -186,9 +186,9 @@ Since we couldn't run Qwen2.5-3B with ZeRO2 using the same config on two GPUs in
| ZeRO Offload with DeepSpeed core binding | 1365ms | 17.6% |
Based on the tests conducted on 2xA100 GPUs, the practicality metric for ZeRO Offload was 17.6%, while ZenFlow achieved a practicality metric of 42.2%. This result demonstrates that ZenFlow significantly improves the practicality of offloading.
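To put the two practicality figures side by side, the sketch below computes their ratio. It treats the metric as an opaque percentage, makes no assumption about how it is defined, and uses a hypothetical helper name, `relative_improvement`.

```python
def relative_improvement(baseline_pct: float, improved_pct: float) -> float:
    """Return how many times larger the improved metric is than the baseline."""
    if baseline_pct <= 0:
        raise ValueError("baseline percentage must be positive")
    return improved_pct / baseline_pct


# Practicality metrics quoted above for the 2xA100 tests.
ratio = relative_improvement(17.6, 42.2)
print(f"ZenFlow practicality is about {ratio:.2f}x that of ZeRO Offload")
```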