Hi @hljjjmssyh, I think our framework can support your use case. You can try the FSDP backend with vLLMRollout, set tensor_parallel_size=8, and tune gpu_memory_utilization along with the other hyper-parameters.
If you encounter OOM, you can turn on param offload for the reference and reward models, similar to the qwen2_7b example. You can also enable param/grad/optimizer offload on the actor/critic models if you prefer a larger micro-batch size for training.
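As a rough sketch, the settings above would be passed as Hydra-style overrides on the training launch command. The exact config keys below (and the `main_ppo` entry point) are assumptions based on the verl examples, so check them against the qwen2_7b example script before use:

```shell
# Hypothetical launch sketch -- key names mirror the qwen2_7b example but may differ in your version.
python3 -m verl.trainer.main_ppo \
    actor_rollout_ref.rollout.name=vllm \
    actor_rollout_ref.rollout.tensor_model_parallel_size=8 \
    actor_rollout_ref.rollout.gpu_memory_utilization=0.5 \
    actor_rollout_ref.ref.fsdp_config.param_offload=True \
    reward_model.model.fsdp_config.param_offload=True \
    actor_rollout_ref.actor.fsdp_config.param_offload=True \
    actor_rollout_ref.actor.fsdp_config.grad_offload=True \
    actor_rollout_ref.actor.fsdp_config.optimizer_offload=True
```

Lowering `gpu_memory_utilization` leaves more headroom for training-time activations, while the offload flags trade host-device transfer time for GPU memory, which is what makes a larger micro-batch feasible.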
as mentioned in the title.