[Deprecation] Remove OrnsteinUhlenbeckProcessWrapper #2749

Merged (3 commits) on Feb 4, 2025

[ghstack-poisoned]

pytorch-bot bot commented Feb 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2749

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 3, 2025
ghstack-source-id: cd5fd85ebcc473ccbb0198b13d8ed3a908695cda
Pull Request resolved: #2749
facebook-github-bot added the CLA Signed label Feb 3, 2025
[ghstack-poisoned]

github-actions bot commented Feb 3, 2025

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 16. Worsened: 6.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.5385s 0.4525s 2.2098 Ops/s 2.1313 Ops/s $\color{#35bf28}+3.69\%$
test_transformed 0.9862s 0.8957s 1.1165 Ops/s 1.0661 Ops/s $\color{#35bf28}+4.73\%$
test_serial 1.4577s 1.3708s 0.7295 Ops/s 0.7101 Ops/s $\color{#35bf28}+2.73\%$
test_parallel 1.3034s 1.2050s 0.8299 Ops/s 0.8201 Ops/s $\color{#35bf28}+1.19\%$
test_step_mdp_speed[True-True-True-True-True] 70.6120μs 30.5062μs 32.7802 KOps/s 32.0065 KOps/s $\color{#35bf28}+2.42\%$
test_step_mdp_speed[True-True-True-True-False] 63.2180μs 18.1086μs 55.2223 KOps/s 55.2442 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[True-True-True-False-True] 44.2630μs 17.2539μs 57.9580 KOps/s 56.4255 KOps/s $\color{#35bf28}+2.72\%$
test_step_mdp_speed[True-True-True-False-False] 47.0680μs 10.1158μs 98.8552 KOps/s 95.9981 KOps/s $\color{#35bf28}+2.98\%$
test_step_mdp_speed[True-True-False-True-True] 63.3180μs 32.7300μs 30.5530 KOps/s 30.2169 KOps/s $\color{#35bf28}+1.11\%$
test_step_mdp_speed[True-True-False-True-False] 73.4310μs 19.9564μs 50.1093 KOps/s 49.6697 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[True-True-False-False-True] 48.1300μs 19.0790μs 52.4136 KOps/s 50.8819 KOps/s $\color{#35bf28}+3.01\%$
test_step_mdp_speed[True-True-False-False-False] 0.1244ms 12.0967μs 82.6671 KOps/s 82.0027 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-False-True-True-True] 70.8020μs 34.1662μs 29.2687 KOps/s 28.0809 KOps/s $\color{#35bf28}+4.23\%$
test_step_mdp_speed[True-False-True-True-False] 0.5206ms 22.1001μs 45.2486 KOps/s 44.5619 KOps/s $\color{#35bf28}+1.54\%$
test_step_mdp_speed[True-False-True-False-True] 53.7400μs 19.1169μs 52.3098 KOps/s 51.4684 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[True-False-True-False-False] 40.6060μs 12.0255μs 83.1567 KOps/s 82.2810 KOps/s $\color{#35bf28}+1.06\%$
test_step_mdp_speed[True-False-False-True-True] 77.9060μs 36.0077μs 27.7718 KOps/s 27.3029 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-False-False-True-False] 0.1642ms 24.8779μs 40.1964 KOps/s 42.2085 KOps/s $\color{#d91a1a}-4.77\%$
test_step_mdp_speed[True-False-False-False-True] 47.9390μs 20.6861μs 48.3416 KOps/s 46.4797 KOps/s $\color{#35bf28}+4.01\%$
test_step_mdp_speed[True-False-False-False-False] 44.1230μs 13.7167μs 72.9039 KOps/s 71.4158 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[False-True-True-True-True] 68.5370μs 34.3552μs 29.1077 KOps/s 28.1336 KOps/s $\color{#35bf28}+3.46\%$
test_step_mdp_speed[False-True-True-True-False] 66.8650μs 22.0087μs 45.4367 KOps/s 45.1919 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[False-True-True-False-True] 64.7700μs 21.8340μs 45.8002 KOps/s 44.4234 KOps/s $\color{#35bf28}+3.10\%$
test_step_mdp_speed[False-True-True-False-False] 0.1536ms 13.6077μs 73.4881 KOps/s 72.7049 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-True-False-True-True] 0.1397ms 35.8517μs 27.8927 KOps/s 27.0843 KOps/s $\color{#35bf28}+2.98\%$
test_step_mdp_speed[False-True-False-True-False] 76.2450μs 23.4974μs 42.5579 KOps/s 41.7184 KOps/s $\color{#35bf28}+2.01\%$
test_step_mdp_speed[False-True-False-False-True] 2.7047ms 23.6229μs 42.3319 KOps/s 41.5149 KOps/s $\color{#35bf28}+1.97\%$
test_step_mdp_speed[False-True-False-False-False] 64.8810μs 15.2409μs 65.6129 KOps/s 63.9707 KOps/s $\color{#35bf28}+2.57\%$
test_step_mdp_speed[False-False-True-True-True] 97.1210μs 38.4723μs 25.9927 KOps/s 25.7647 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-False-True-True-False] 57.5170μs 25.5034μs 39.2104 KOps/s 39.2347 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[False-False-True-False-True] 0.7294ms 23.6185μs 42.3397 KOps/s 41.8881 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-False-True-False-False] 45.4950μs 15.1905μs 65.8307 KOps/s 64.5491 KOps/s $\color{#35bf28}+1.99\%$
test_step_mdp_speed[False-False-False-True-True] 0.1043ms 39.1352μs 25.5524 KOps/s 25.0745 KOps/s $\color{#35bf28}+1.91\%$
test_step_mdp_speed[False-False-False-True-False] 78.9070μs 27.3074μs 36.6201 KOps/s 36.4402 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[False-False-False-False-True] 58.8390μs 25.1163μs 39.8148 KOps/s 39.3061 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[False-False-False-False-False] 60.4520μs 17.0851μs 58.5305 KOps/s 58.6359 KOps/s $\color{#d91a1a}-0.18\%$
test_values[generalized_advantage_estimate-True-True] 10.1328ms 9.9227ms 100.7788 Ops/s 102.7838 Ops/s $\color{#d91a1a}-1.95\%$
test_values[vec_generalized_advantage_estimate-True-True] 26.1661ms 24.4169ms 40.9553 Ops/s 39.8832 Ops/s $\color{#35bf28}+2.69\%$
test_values[td0_return_estimate-False-False] 0.2566ms 0.1893ms 5.2819 KOps/s 5.4716 KOps/s $\color{#d91a1a}-3.47\%$
test_values[td1_return_estimate-False-False] 27.9832ms 24.7400ms 40.4204 Ops/s 37.3808 Ops/s $\textbf{\color{#35bf28}+8.13\%}$
test_values[vec_td1_return_estimate-False-False] 26.2928ms 24.4875ms 40.8372 Ops/s 38.3569 Ops/s $\textbf{\color{#35bf28}+6.47\%}$
test_values[td_lambda_return_estimate-True-False] 38.1048ms 35.5111ms 28.1602 Ops/s 28.3477 Ops/s $\color{#d91a1a}-0.66\%$
test_values[vec_td_lambda_return_estimate-True-False] 25.2104ms 24.4156ms 40.9574 Ops/s 39.9034 Ops/s $\color{#35bf28}+2.64\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.8689ms 8.3968ms 119.0926 Ops/s 117.9542 Ops/s $\color{#35bf28}+0.97\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3105ms 1.9923ms 501.9364 Ops/s 505.8543 Ops/s $\color{#d91a1a}-0.77\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4842ms 0.3625ms 2.7589 KOps/s 2.6822 KOps/s $\color{#35bf28}+2.86\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 47.4561ms 43.9921ms 22.7314 Ops/s 22.6760 Ops/s $\color{#35bf28}+0.24\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.2260ms 3.4405ms 290.6529 Ops/s 290.5531 Ops/s $\color{#35bf28}+0.03\%$
test_dqn_speed[False-None] 6.1430ms 1.3841ms 722.4960 Ops/s 698.7728 Ops/s $\color{#35bf28}+3.39\%$
test_dqn_speed[False-backward] 1.9234ms 1.8636ms 536.6079 Ops/s 521.4457 Ops/s $\color{#35bf28}+2.91\%$
test_dqn_speed[True-None] 0.7462ms 0.4779ms 2.0924 KOps/s 2.0745 KOps/s $\color{#35bf28}+0.86\%$
test_dqn_speed[True-backward] 0.9649ms 0.9040ms 1.1062 KOps/s 817.9009 Ops/s $\textbf{\color{#35bf28}+35.25\%}$
test_dqn_speed[reduce-overhead-None] 0.6257ms 0.4912ms 2.0357 KOps/s 2.0685 KOps/s $\color{#d91a1a}-1.59\%$
test_dqn_speed[reduce-overhead-backward] 0.9726ms 0.9057ms 1.1041 KOps/s 1.0882 KOps/s $\color{#35bf28}+1.47\%$
test_ddpg_speed[False-None] 3.5682ms 2.8628ms 349.3094 Ops/s 344.5965 Ops/s $\color{#35bf28}+1.37\%$
test_ddpg_speed[False-backward] 4.1678ms 3.9800ms 251.2584 Ops/s 248.3179 Ops/s $\color{#35bf28}+1.18\%$
test_ddpg_speed[True-None] 1.4268ms 1.2168ms 821.8342 Ops/s 814.5801 Ops/s $\color{#35bf28}+0.89\%$
test_ddpg_speed[True-backward] 2.2309ms 2.1105ms 473.8105 Ops/s 469.5398 Ops/s $\color{#35bf28}+0.91\%$
test_ddpg_speed[reduce-overhead-None] 1.9624ms 1.2168ms 821.8368 Ops/s 814.2116 Ops/s $\color{#35bf28}+0.94\%$
test_ddpg_speed[reduce-overhead-backward] 2.2694ms 2.1385ms 467.6244 Ops/s 467.5232 Ops/s $\color{#35bf28}+0.02\%$
test_sac_speed[False-None] 9.6699ms 7.9449ms 125.8661 Ops/s 121.6284 Ops/s $\color{#35bf28}+3.48\%$
test_sac_speed[False-backward] 11.8226ms 10.6427ms 93.9613 Ops/s 91.7044 Ops/s $\color{#35bf28}+2.46\%$
test_sac_speed[True-None] 2.6099ms 2.0725ms 482.5055 Ops/s 476.5743 Ops/s $\color{#35bf28}+1.24\%$
test_sac_speed[True-backward] 3.8391ms 3.7400ms 267.3820 Ops/s 265.6439 Ops/s $\color{#35bf28}+0.65\%$
test_sac_speed[reduce-overhead-None] 2.3908ms 2.0608ms 485.2601 Ops/s 460.9088 Ops/s $\textbf{\color{#35bf28}+5.28\%}$
test_sac_speed[reduce-overhead-backward] 3.8765ms 3.7183ms 268.9397 Ops/s 265.3098 Ops/s $\color{#35bf28}+1.37\%$
test_redq_speed[False-None] 14.9835ms 12.9105ms 77.4562 Ops/s 64.7067 Ops/s $\textbf{\color{#35bf28}+19.70\%}$
test_redq_speed[False-backward] 24.2143ms 22.1678ms 45.1104 Ops/s 42.2535 Ops/s $\textbf{\color{#35bf28}+6.76\%}$
test_redq_speed[True-None] 6.5935ms 5.4461ms 183.6180 Ops/s 198.3841 Ops/s $\textbf{\color{#d91a1a}-7.44\%}$
test_redq_speed[True-backward] 12.9472ms 12.2472ms 81.6515 Ops/s 76.6501 Ops/s $\textbf{\color{#35bf28}+6.52\%}$
test_redq_speed[reduce-overhead-None] 5.8210ms 4.8792ms 204.9521 Ops/s 192.9037 Ops/s $\textbf{\color{#35bf28}+6.25\%}$
test_redq_speed[reduce-overhead-backward] 13.4400ms 12.7811ms 78.2405 Ops/s 76.5418 Ops/s $\color{#35bf28}+2.22\%$
test_redq_deprec_speed[False-None] 14.0616ms 12.9785ms 77.0504 Ops/s 73.5156 Ops/s $\color{#35bf28}+4.81\%$
test_redq_deprec_speed[False-backward] 21.1233ms 18.8059ms 53.1747 Ops/s 52.5187 Ops/s $\color{#35bf28}+1.25\%$
test_redq_deprec_speed[True-None] 4.3734ms 3.8021ms 263.0130 Ops/s 253.5225 Ops/s $\color{#35bf28}+3.74\%$
test_redq_deprec_speed[True-backward] 9.6193ms 8.2629ms 121.0223 Ops/s 117.1130 Ops/s $\color{#35bf28}+3.34\%$
test_redq_deprec_speed[reduce-overhead-None] 4.4747ms 3.8043ms 262.8626 Ops/s 244.6851 Ops/s $\textbf{\color{#35bf28}+7.43\%}$
test_redq_deprec_speed[reduce-overhead-backward] 9.6293ms 8.3282ms 120.0734 Ops/s 115.8333 Ops/s $\color{#35bf28}+3.66\%$
test_td3_speed[False-None] 8.2528ms 7.8632ms 127.1742 Ops/s 119.2893 Ops/s $\textbf{\color{#35bf28}+6.61\%}$
test_td3_speed[False-backward] 11.5364ms 10.2402ms 97.6545 Ops/s 93.7840 Ops/s $\color{#35bf28}+4.13\%$
test_td3_speed[True-None] 1.9639ms 1.7698ms 565.0463 Ops/s 543.6380 Ops/s $\color{#35bf28}+3.94\%$
test_td3_speed[True-backward] 3.3750ms 3.3283ms 300.4527 Ops/s 269.3413 Ops/s $\textbf{\color{#35bf28}+11.55\%}$
test_td3_speed[reduce-overhead-None] 2.0144ms 1.7629ms 567.2613 Ops/s 537.0168 Ops/s $\textbf{\color{#35bf28}+5.63\%}$
test_td3_speed[reduce-overhead-backward] 3.7028ms 3.3587ms 297.7338 Ops/s 279.9952 Ops/s $\textbf{\color{#35bf28}+6.34\%}$
test_cql_speed[False-None] 39.3703ms 36.3751ms 27.4913 Ops/s 26.1032 Ops/s $\textbf{\color{#35bf28}+5.32\%}$
test_cql_speed[False-backward] 47.8842ms 45.8519ms 21.8094 Ops/s 20.5503 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_cql_speed[True-None] 17.1848ms 16.0471ms 62.3165 Ops/s 60.4715 Ops/s $\color{#35bf28}+3.05\%$
test_cql_speed[True-backward] 24.3877ms 23.0368ms 43.4087 Ops/s 42.3547 Ops/s $\color{#35bf28}+2.49\%$
test_cql_speed[reduce-overhead-None] 16.8465ms 16.1810ms 61.8007 Ops/s 60.4613 Ops/s $\color{#35bf28}+2.22\%$
test_cql_speed[reduce-overhead-backward] 24.9759ms 23.3070ms 42.9055 Ops/s 43.0582 Ops/s $\color{#d91a1a}-0.35\%$
test_a2c_speed[False-None] 8.5634ms 7.1383ms 140.0895 Ops/s 136.4124 Ops/s $\color{#35bf28}+2.70\%$
test_a2c_speed[False-backward] 15.1896ms 14.4836ms 69.0437 Ops/s 67.6404 Ops/s $\color{#35bf28}+2.07\%$
test_a2c_speed[True-None] 4.3477ms 3.6952ms 270.6242 Ops/s 263.8664 Ops/s $\color{#35bf28}+2.56\%$
test_a2c_speed[True-backward] 11.1108ms 10.2229ms 97.8196 Ops/s 96.3820 Ops/s $\color{#35bf28}+1.49\%$
test_a2c_speed[reduce-overhead-None] 4.5052ms 3.7059ms 269.8429 Ops/s 265.2695 Ops/s $\color{#35bf28}+1.72\%$
test_a2c_speed[reduce-overhead-backward] 10.8659ms 10.4754ms 95.4619 Ops/s 97.5742 Ops/s $\color{#d91a1a}-2.16\%$
test_ppo_speed[False-None] 8.2111ms 7.4366ms 134.4698 Ops/s 132.5061 Ops/s $\color{#35bf28}+1.48\%$
test_ppo_speed[False-backward] 15.8939ms 14.9259ms 66.9975 Ops/s 67.1482 Ops/s $\color{#d91a1a}-0.22\%$
test_ppo_speed[True-None] 4.7826ms 4.1279ms 242.2545 Ops/s 243.9470 Ops/s $\color{#d91a1a}-0.69\%$
test_ppo_speed[True-backward] 10.2648ms 10.0193ms 99.8075 Ops/s 98.7622 Ops/s $\color{#35bf28}+1.06\%$
test_ppo_speed[reduce-overhead-None] 4.5722ms 4.1298ms 242.1448 Ops/s 243.3822 Ops/s $\color{#d91a1a}-0.51\%$
test_ppo_speed[reduce-overhead-backward] 10.6466ms 10.1899ms 98.1365 Ops/s 98.7980 Ops/s $\color{#d91a1a}-0.67\%$
test_reinforce_speed[False-None] 7.3189ms 6.6154ms 151.1614 Ops/s 150.7494 Ops/s $\color{#35bf28}+0.27\%$
test_reinforce_speed[False-backward] 10.5656ms 9.9311ms 100.6933 Ops/s 100.5482 Ops/s $\color{#35bf28}+0.14\%$
test_reinforce_speed[True-None] 4.2592ms 3.1259ms 319.9033 Ops/s 322.4100 Ops/s $\color{#d91a1a}-0.78\%$
test_reinforce_speed[True-backward] 9.9864ms 9.3105ms 107.4053 Ops/s 111.2209 Ops/s $\color{#d91a1a}-3.43\%$
test_reinforce_speed[reduce-overhead-None] 3.9815ms 3.1419ms 318.2778 Ops/s 326.7220 Ops/s $\color{#d91a1a}-2.58\%$
test_reinforce_speed[reduce-overhead-backward] 9.2618ms 9.0088ms 111.0032 Ops/s 109.5046 Ops/s $\color{#35bf28}+1.37\%$
test_iql_speed[False-None] 32.9179ms 32.2048ms 31.0512 Ops/s 29.9798 Ops/s $\color{#35bf28}+3.57\%$
test_iql_speed[False-backward] 46.8196ms 45.2421ms 22.1033 Ops/s 21.0644 Ops/s $\color{#35bf28}+4.93\%$
test_iql_speed[True-None] 11.9510ms 11.3024ms 88.4767 Ops/s 86.7891 Ops/s $\color{#35bf28}+1.94\%$
test_iql_speed[True-backward] 24.4010ms 23.0825ms 43.3229 Ops/s 44.0341 Ops/s $\color{#d91a1a}-1.62\%$
test_iql_speed[reduce-overhead-None] 11.8887ms 11.4106ms 87.6380 Ops/s 86.7799 Ops/s $\color{#35bf28}+0.99\%$
test_iql_speed[reduce-overhead-backward] 27.4304ms 22.9106ms 43.6478 Ops/s 44.6794 Ops/s $\color{#d91a1a}-2.31\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.0799ms 5.0542ms 197.8565 Ops/s 203.5750 Ops/s $\color{#d91a1a}-2.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9961ms 0.5289ms 1.8907 KOps/s 1.9018 KOps/s $\color{#d91a1a}-0.59\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.2038ms 0.5221ms 1.9155 KOps/s 2.0385 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.4812ms 4.8403ms 206.6004 Ops/s 206.2278 Ops/s $\color{#35bf28}+0.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1862ms 0.5186ms 1.9281 KOps/s 1.9844 KOps/s $\color{#d91a1a}-2.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6971ms 0.4899ms 2.0410 KOps/s 2.0768 KOps/s $\color{#d91a1a}-1.72\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4386ms 1.6476ms 606.9585 Ops/s 604.0803 Ops/s $\color{#35bf28}+0.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3528ms 1.5738ms 635.4071 Ops/s 640.5159 Ops/s $\color{#d91a1a}-0.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.7708ms 4.9261ms 203.0017 Ops/s 208.4854 Ops/s $\color{#d91a1a}-2.63\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0289ms 0.6454ms 1.5493 KOps/s 1.5323 KOps/s $\color{#35bf28}+1.11\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.1084ms 0.6246ms 1.6011 KOps/s 1.5426 KOps/s $\color{#35bf28}+3.80\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.0702ms 4.7940ms 208.5958 Ops/s 211.6494 Ops/s $\color{#d91a1a}-1.44\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.6287ms 0.5196ms 1.9247 KOps/s 1.9426 KOps/s $\color{#d91a1a}-0.92\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6997ms 0.4917ms 2.0338 KOps/s 2.0014 KOps/s $\color{#35bf28}+1.62\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.9987ms 4.6468ms 215.2016 Ops/s 218.4767 Ops/s $\color{#d91a1a}-1.50\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0526ms 0.5096ms 1.9623 KOps/s 1.9688 KOps/s $\color{#d91a1a}-0.33\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7557ms 0.4792ms 2.0869 KOps/s 2.0794 KOps/s $\color{#35bf28}+0.36\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.3703ms 4.7439ms 210.7987 Ops/s 209.7680 Ops/s $\color{#35bf28}+0.49\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.4569s 1.2973ms 770.8517 Ops/s 1.5157 KOps/s $\textbf{\color{#d91a1a}-49.14\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8316ms 0.6222ms 1.6071 KOps/s 1.5987 KOps/s $\color{#35bf28}+0.53\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 8.5258ms 4.2504ms 235.2695 Ops/s 235.9699 Ops/s $\color{#d91a1a}-0.30\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.4709ms 2.1768ms 459.3984 Ops/s 457.1004 Ops/s $\color{#35bf28}+0.50\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.1948ms 1.4085ms 709.9685 Ops/s 802.5138 Ops/s $\textbf{\color{#d91a1a}-11.53\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.6108ms 4.2171ms 237.1270 Ops/s 35.5100 Ops/s $\textbf{\color{#35bf28}+567.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4079s 10.4500ms 95.6934 Ops/s 429.0369 Ops/s $\textbf{\color{#d91a1a}-77.70\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.4113ms 1.3727ms 728.4774 Ops/s 800.3504 Ops/s $\textbf{\color{#d91a1a}-8.98\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.7036ms 4.4086ms 226.8274 Ops/s 219.5903 Ops/s $\color{#35bf28}+3.30\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.8933ms 2.5347ms 394.5295 Ops/s 401.4174 Ops/s $\color{#d91a1a}-1.72\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.8021ms 1.5228ms 656.6788 Ops/s 638.8805 Ops/s $\color{#35bf28}+2.79\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.8962ms 11.6162ms 86.0870 Ops/s 82.5250 Ops/s $\color{#35bf28}+4.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.4736ms 14.2967ms 69.9464 Ops/s 68.3682 Ops/s $\color{#35bf28}+2.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.8737ms 20.5245ms 48.7222 Ops/s 47.6444 Ops/s $\color{#35bf28}+2.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.5643ms 14.5165ms 68.8873 Ops/s 67.2573 Ops/s $\color{#35bf28}+2.42\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.7557ms 20.6895ms 48.3338 Ops/s 47.8319 Ops/s $\color{#35bf28}+1.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.0661ms 15.7672ms 63.4227 Ops/s 62.0860 Ops/s $\color{#35bf28}+2.15\%$
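
The Change column can be reproduced from the two Ops columns. The sketch below assumes the change is the relative difference between the PR's Ops and the Ops on repo HEAD. It also assumes that a run counts as improved or worsened only beyond ±5%. That threshold is inferred from the table: exactly 16 rows are bolded as gains, 6 are bolded as losses, and all 22 exceed 5% in magnitude. It is not a documented setting of the benchmark bot. The function names are illustrative only.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred, not documented."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the table above, in Ops/s.
rows = {
    "test_transformed": (1.1165, 1.0661),
    "test_dqn_speed[True-backward]": (1106.2, 817.9009),  # 1.1062 KOps/s
    "test_redq_speed[True-None]": (183.6180, 198.3841),
}

for name, (ops_pr, ops_head) in rows.items():
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table prints rounded values, recomputed percentages can differ from the listed ones in the last digit.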

github-actions bot commented Feb 3, 2025

Result of GPU Benchmark Tests

Name Max Mean Ops
test_simple 0.8501s 0.7535s 1.3271 Ops/s
test_transformed 1.4342s 1.3412s 0.7456 Ops/s
test_serial 2.2233s 2.1792s 0.4589 Ops/s
test_parallel 1.9022s 1.8449s 0.5420 Ops/s
test_step_mdp_speed[True-True-True-True-True] 0.1876ms 40.1645μs 24.8976 KOps/s
test_step_mdp_speed[True-True-True-True-False] 50.2710μs 24.1680μs 41.3769 KOps/s
test_step_mdp_speed[True-True-True-False-True] 49.3600μs 22.6329μs 44.1835 KOps/s
test_step_mdp_speed[True-True-True-False-False] 42.1110μs 13.1378μs 76.1163 KOps/s
test_step_mdp_speed[True-True-False-True-True] 80.0710μs 42.7514μs 23.3910 KOps/s
test_step_mdp_speed[True-True-False-True-False] 65.4710μs 26.3790μs 37.9089 KOps/s
test_step_mdp_speed[True-True-False-False-True] 59.0210μs 24.7887μs 40.3409 KOps/s
test_step_mdp_speed[True-True-False-False-False] 59.0810μs 15.4137μs 64.8775 KOps/s
test_step_mdp_speed[True-False-True-True-True] 77.8210μs 44.7144μs 22.3642 KOps/s
test_step_mdp_speed[True-False-True-True-False] 68.3210μs 28.1493μs 35.5248 KOps/s
test_step_mdp_speed[True-False-True-False-True] 56.9210μs 25.0029μs 39.9954 KOps/s
test_step_mdp_speed[True-False-True-False-False] 39.5600μs 15.5379μs 64.3590 KOps/s
test_step_mdp_speed[True-False-False-True-True] 91.2120μs 47.3355μs 21.1258 KOps/s
test_step_mdp_speed[True-False-False-True-False] 63.7010μs 30.4866μs 32.8013 KOps/s
test_step_mdp_speed[True-False-False-False-True] 62.4110μs 27.2882μs 36.6459 KOps/s
test_step_mdp_speed[True-False-False-False-False] 45.0600μs 17.9444μs 55.7277 KOps/s
test_step_mdp_speed[False-True-True-True-True] 72.2310μs 45.8395μs 21.8152 KOps/s
test_step_mdp_speed[False-True-True-True-False] 73.2310μs 28.2334μs 35.4190 KOps/s
test_step_mdp_speed[False-True-True-False-True] 2.5106ms 29.5230μs 33.8719 KOps/s
test_step_mdp_speed[False-True-True-False-False] 45.1710μs 17.6866μs 56.5398 KOps/s
test_step_mdp_speed[False-True-False-True-True] 83.6710μs 47.7303μs 20.9511 KOps/s
test_step_mdp_speed[False-True-False-True-False] 62.6310μs 30.9753μs 32.2838 KOps/s
test_step_mdp_speed[False-True-False-False-True] 64.7110μs 31.0996μs 32.1548 KOps/s
test_step_mdp_speed[False-True-False-False-False] 45.7210μs 19.5508μs 51.1487 KOps/s
test_step_mdp_speed[False-False-True-True-True] 83.3110μs 50.4698μs 19.8138 KOps/s
test_step_mdp_speed[False-False-True-True-False] 76.5110μs 32.9274μs 30.3698 KOps/s
test_step_mdp_speed[False-False-True-False-True] 58.6600μs 30.5821μs 32.6988 KOps/s
test_step_mdp_speed[False-False-True-False-False] 47.3600μs 19.4648μs 51.3749 KOps/s
test_step_mdp_speed[False-False-False-True-True] 79.5510μs 51.7981μs 19.3057 KOps/s
test_step_mdp_speed[False-False-False-True-False] 64.5910μs 35.6685μs 28.0359 KOps/s
test_step_mdp_speed[False-False-False-False-True] 68.5910μs 32.6241μs 30.6522 KOps/s
test_step_mdp_speed[False-False-False-False-False] 60.1000μs 21.9427μs 45.5733 KOps/s
test_values[generalized_advantage_estimate-True-True] 24.8643ms 24.1868ms 41.3449 Ops/s
test_values[vec_generalized_advantage_estimate-True-True] 0.1061s 3.0152ms 331.6555 Ops/s
test_values[td0_return_estimate-False-False] 0.1230ms 80.0962μs 12.4850 KOps/s
test_values[td1_return_estimate-False-False] 56.1889ms 55.2808ms 18.0895 Ops/s
test_values[vec_td1_return_estimate-False-False] 1.3154ms 1.0765ms 928.9388 Ops/s
test_values[td_lambda_return_estimate-True-False] 89.5508ms 88.0863ms 11.3525 Ops/s
test_values[vec_td_lambda_return_estimate-True-False] 1.4272ms 1.0772ms 928.3648 Ops/s
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.0374ms 24.2113ms 41.3030 Ops/s
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0151ms 0.7448ms 1.3426 KOps/s
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7461ms 0.6699ms 1.4928 KOps/s
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6689ms 1.4851ms 673.3768 Ops/s
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7446ms 0.6904ms 1.4483 KOps/s
test_dqn_speed[False-None] 1.6680ms 1.5385ms 649.9922 Ops/s
test_dqn_speed[False-backward] 2.3557ms 2.1271ms 470.1188 Ops/s
test_dqn_speed[True-None] 0.7198ms 0.5661ms 1.7664 KOps/s
test_dqn_speed[True-backward] 1.2170ms 1.1341ms 881.7697 Ops/s
test_dqn_speed[reduce-overhead-None] 0.9954ms 0.5831ms 1.7149 KOps/s
test_dqn_speed[reduce-overhead-backward] 1.0490ms 1.0033ms 996.7433 Ops/s
test_ddpg_speed[False-None] 3.3331ms 2.8988ms 344.9726 Ops/s
test_ddpg_speed[False-backward] 4.5948ms 4.1343ms 241.8766 Ops/s
test_ddpg_speed[True-None] 1.9110ms 1.3932ms 717.7738 Ops/s
test_ddpg_speed[True-backward] 2.5515ms 2.4456ms 408.9020 Ops/s
test_ddpg_speed[reduce-overhead-None] 1.4610ms 1.3724ms 728.6342 Ops/s
test_ddpg_speed[reduce-overhead-backward] 1.9449ms 1.9043ms 525.1285 Ops/s
test_sac_speed[False-None] 8.4076ms 8.0315ms 124.5100 Ops/s
test_sac_speed[False-backward] 11.5700ms 10.9853ms 91.0308 Ops/s
test_sac_speed[True-None] 2.0357ms 1.8858ms 530.2686 Ops/s
test_sac_speed[True-backward] 3.8203ms 3.5966ms 278.0386 Ops/s
test_sac_speed[reduce-overhead-None] 20.9919ms 12.0453ms 83.0203 Ops/s
test_sac_speed[reduce-overhead-backward] 1.6703ms 1.6172ms 618.3589 Ops/s
test_redq_speed[False-None] 8.3051ms 7.5497ms 132.4550 Ops/s
test_redq_speed[False-backward] 11.4623ms 11.0743ms 90.2996 Ops/s
test_redq_speed[True-None] 2.3987ms 2.3141ms 432.1255 Ops/s
test_redq_speed[True-backward] 4.4380ms 4.0240ms 248.5104 Ops/s
test_redq_speed[reduce-overhead-None] 2.5233ms 2.3570ms 424.2719 Ops/s
test_redq_speed[reduce-overhead-backward] 4.4721ms 4.0426ms 247.3666 Ops/s
test_redq_deprec_speed[False-None] 9.3281ms 9.0078ms 111.0146 Ops/s
test_redq_deprec_speed[False-backward] 12.1511ms 11.8443ms 84.4286 Ops/s
test_redq_deprec_speed[True-None] 2.7221ms 2.6672ms 374.9237 Ops/s
test_redq_deprec_speed[True-backward] 4.7308ms 4.3459ms 230.1024 Ops/s
test_redq_deprec_speed[reduce-overhead-None] 2.7947ms 2.6524ms 377.0103 Ops/s
test_redq_deprec_speed[reduce-overhead-backward] 4.8047ms 4.3633ms 229.1836 Ops/s
test_td3_speed[False-None] 8.3425ms 7.9728ms 125.4260 Ops/s
test_td3_speed[False-backward] 10.9349ms 10.1320ms 98.6967 Ops/s
test_td3_speed[True-None] 1.8174ms 1.7180ms 582.0873 Ops/s
test_td3_speed[True-backward] 3.2481ms 3.2127ms 311.2657 Ops/s
test_td3_speed[reduce-overhead-None] 54.8325ms 26.7370ms 37.4014 Ops/s
test_td3_speed[reduce-overhead-backward] 1.3835ms 1.3474ms 742.1538 Ops/s
test_cql_speed[False-None] 17.2454ms 16.7077ms 59.8526 Ops/s
test_cql_speed[False-backward] 22.6238ms 21.6323ms 46.2271 Ops/s
test_cql_speed[True-None] 3.4591ms 3.3239ms 300.8503 Ops/s
test_cql_speed[True-backward] 6.0773ms 5.6249ms 177.7796 Ops/s
test_cql_speed[reduce-overhead-None] 21.3821ms 13.2342ms 75.5619 Ops/s
test_cql_speed[reduce-overhead-backward] 1.9813ms 1.8313ms 546.0702 Ops/s
test_a2c_speed[False-None] 3.3678ms 3.1707ms 315.3834 Ops/s
test_a2c_speed[False-backward] 6.5537ms 5.9535ms 167.9687 Ops/s
test_a2c_speed[True-None] 1.4327ms 1.3642ms 733.0490 Ops/s
test_a2c_speed[True-backward] 2.9613ms 2.9222ms 342.2043 Ops/s
test_a2c_speed[reduce-overhead-None] 15.8757ms 9.0758ms 110.1832 Ops/s
test_a2c_speed[reduce-overhead-backward] 1.5480ms 1.4599ms 684.9804 Ops/s
test_ppo_speed[False-None] 3.7793ms 3.6827ms 271.5377 Ops/s
test_ppo_speed[False-backward] 7.1959ms 6.7194ms 148.8223 Ops/s
test_ppo_speed[True-None] 1.5673ms 1.4314ms 698.6226 Ops/s
test_ppo_speed[True-backward] 3.2092ms 3.0750ms 325.2014 Ops/s
test_ppo_speed[reduce-overhead-None] 1.0721ms 0.9789ms 1.0216 KOps/s
test_ppo_speed[reduce-overhead-backward] 1.5493ms 1.4110ms 708.6921 Ops/s
test_reinforce_speed[False-None] 2.5041ms 2.3025ms 434.3057 Ops/s
test_reinforce_speed[False-backward] 3.8451ms 3.2878ms 304.1565 Ops/s
test_reinforce_speed[True-None] 1.4220ms 1.3153ms 760.2802 Ops/s
test_reinforce_speed[True-backward] 3.0718ms 2.9742ms 336.2221 Ops/s
test_reinforce_speed[reduce-overhead-None] 17.5482ms 9.6337ms 103.8026 Ops/s
test_reinforce_speed[reduce-overhead-backward] 1.5562ms 1.4890ms 671.5825 Ops/s
test_iql_speed[False-None] 9.6593ms 9.1577ms 109.1977 Ops/s
test_iql_speed[False-backward] 13.1730ms 12.6548ms 79.0211 Ops/s
test_iql_speed[True-None] 2.4815ms 2.2893ms 436.8089 Ops/s
test_iql_speed[True-backward] 5.4131ms 4.8879ms 204.5853 Ops/s
test_iql_speed[reduce-overhead-None] 18.5836ms 11.1950ms 89.3259 Ops/s
test_iql_speed[reduce-overhead-backward] 1.9633ms 1.9001ms 526.2850 Ops/s
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9884ms 6.3716ms 156.9460 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5417ms 0.2719ms 3.6777 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5646ms 0.2501ms 3.9981 KOps/s
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3722ms 6.0749ms 164.6115 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1323ms 0.3402ms 2.9398 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5905ms 0.3240ms 3.0862 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5713ms 1.3912ms 718.8108 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4924ms 1.3097ms 763.5063 Ops/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3775ms 6.2442ms 160.1489 Ops/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2761ms 0.4038ms 2.4765 KOps/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7615ms 0.4022ms 2.4865 KOps/s
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3552ms 6.1514ms 162.5658 Ops/s
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0870ms 0.3848ms 2.5987 KOps/s
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5208ms 0.3563ms 2.8068 KOps/s
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.3926ms 6.0823ms 164.4109 Ops/s
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7386ms 0.3333ms 3.0003 KOps/s
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5859ms 0.3137ms 3.1875 KOps/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7318ms 6.2599ms 159.7461 Ops/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1763ms 0.4759ms 2.1012 KOps/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7018ms 0.4295ms 2.3282 KOps/s
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0759ms 5.4402ms 183.8161 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.6976ms 2.0848ms 479.6580 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.1888ms 1.2118ms 825.2181 Ops/s
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.0599ms 5.5194ms 181.1786 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.5712ms 2.0142ms 496.4800 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 10.3240ms 1.2867ms 777.2010 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4859s 15.3003ms 65.3581 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.1300ms 2.1521ms 464.6581 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.4917ms 1.2086ms 827.3859 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4399ms 13.2357ms 75.5533 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.9568ms 16.3487ms 61.1668 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.7811ms 18.0539ms 55.3898 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.8339ms 16.9752ms 58.9094 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 19.4165ms 18.3884ms 54.3821 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.3019ms 18.5178ms 54.0021 Ops/s
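
The Ops column appears to be the reciprocal of the Mean column once the time unit is converted to seconds. The sketch below checks that reading against a few rows of the GPU table. The helper name and the unit table are illustrative assumptions, not part of the benchmark tooling.

```python
# Seconds per unit, for the three suffixes that appear in the tables.
UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "μs": 1e-6}


def ops_per_second(mean: str) -> float:
    """Convert a Mean entry such as '24.1868ms' into operations per second."""
    # Check the two-character suffixes first so 'ms' is not read as 's'.
    for unit in ("ms", "μs", "s"):
        if mean.endswith(unit):
            seconds = float(mean[: -len(unit)]) * UNIT_SECONDS[unit]
            return 1.0 / seconds
    raise ValueError(f"unrecognized time unit in {mean!r}")


# Mean values copied from the GPU table above.
for name, mean in [
    ("test_simple", "0.7535s"),
    ("test_values[generalized_advantage_estimate-True-True]", "24.1868ms"),
    ("test_step_mdp_speed[True-True-True-True-True]", "40.1645μs"),
]:
    print(f"{name}: {ops_per_second(mean):.4f} Ops/s")
```

Values of 1000 Ops/s and above are listed in the table as KOps/s, so divide by 1000 before comparing them.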

[ghstack-poisoned]
vmoens added the bc breaking and Deprecation labels Feb 4, 2025
vmoens merged commit f54d7ad into gh/vmoens/92/base Feb 4, 2025
34 of 52 checks passed
vmoens added a commit that referenced this pull request Feb 4, 2025
ghstack-source-id: 401fdfaca2e27122d5a67fc7177e1015047f0098
Pull Request resolved: #2749
vmoens deleted the gh/vmoens/92/head branch February 4, 2025 08:34
Labels: bc breaking, CLA Signed, Deprecation