Mixed double precision for PPO_RNN algorithm #172

lopatovsky · 2024-07-15T12:48:22Z

Mixed precision

Motivation:

Inspired by RLGames, we implemented automatic mixed precision (AMP) to speed up PPO_RNN training, especially for large models.

Sources:

https://pytorch.org/docs/stable/amp.html
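
For readers unfamiliar with the AMP API linked above, here is a minimal sketch of one mixed-precision update step in plain PyTorch. It is not the code from this PR: `TinyRecurrentPolicy`, the tensor shapes, the MSE loss and the clipping value are all hypothetical placeholders, and the sketch only assumes the documented `torch.autocast` and `GradScaler` pattern. Newer PyTorch releases also expose the scaler as `torch.amp.GradScaler`; the older `torch.cuda.amp.GradScaler` name is used here for compatibility.

```python
import torch
import torch.nn as nn

# Fall back to full precision on CPU so the sketch runs anywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"
amp_dtype = torch.float16 if use_amp else torch.bfloat16


class TinyRecurrentPolicy(nn.Module):
    """Hypothetical stand-in for a recurrent policy (LSTM followed by a linear head)."""

    def __init__(self, obs_dim=8, hidden=16, act_dim=2):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out)


model = TinyRecurrentPolicy().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# With enabled=False the scaler is a pass-through, so the same code path
# works with and without mixed precision.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

# Placeholder batch: (batch, sequence, features).
obs = torch.randn(4, 5, 8, device=device)
target = torch.randn(4, 5, 2, device=device)

# Forward pass and loss run under autocast; weights stay in float32.
with torch.autocast(device_type=device, dtype=amp_dtype, enabled=use_amp):
    loss = nn.functional.mse_loss(model(obs), target)

optimizer.zero_grad()
scaler.scale(loss).backward()   # scale the loss to avoid float16 gradient underflow
scaler.unscale_(optimizer)      # unscale before clipping so the threshold is meaningful
nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
scaler.step(optimizer)          # skips the step if gradients contain inf/NaN
scaler.update()
```

Calling `unscale_` before gradient clipping matters for PPO-style updates, because otherwise the clipping threshold would be applied to scaled gradients.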

Speed eval:

Model: one LSTM layer (hidden size 768, sequence length 128) followed by an MLP with units [2048, 1024, 1024, 512].
Training ran with the Isaac Sim simulation in the loop, so the measured time includes simulation. The speedup on the skrl side alone is therefore higher than this test shows.

| Mixed precision | Time (s) | Relative time |
| --- | --- | --- |
| No | 155 | 1x |
| Yes | 105 | 0.677x (about 1.48x faster) |
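
The 0.677x figure is the ratio of the two wall-clock times (105 s / 155 s), so lower is better. The conventional speedup is its inverse. A short check of the arithmetic, using only the two timings reported above:

```python
baseline_s = 155.0  # full precision, from the measurements above
mixed_s = 105.0     # mixed precision, from the measurements above

relative_time = mixed_s / baseline_s  # fraction of the baseline time
speedup = baseline_s / mixed_s        # how many times faster

print(f"relative time: {relative_time:.3f}x")  # relative time: 0.677x
print(f"speedup: {speedup:.2f}x")              # speedup: 1.48x
```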

Quality eval:

We trained a policy for our task with each of the configurations multiple times. We didn’t observe any statistically significant difference in quality of the final results.

lopatovsky changed the base branch from main to develop July 15, 2024 12:50

lopatovsky changed the title ~~Mixed double precision for PPO algorithm~~ Mixed double precision for PPO_RNN algorithm Jul 15, 2024

Add mixed precision option into ppo_rnn algorithm

dcd4faf

lopatovsky force-pushed the ll_mixed_precision_rnn branch from 8d3709e to dcd4faf Compare July 15, 2024 12:55