[Deprecation] Change the default MLP depth #2746

Merged (4 commits) on Feb 4, 2025


[ghstack-poisoned]
@vmoens mentioned this pull request Feb 3, 2025

pytorch-bot bot commented Feb 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2746

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 3, 2025
ghstack-source-id: 53fab8efaad48cf2913a5a0290f2249b6b032fde
Pull Request resolved: #2746
@facebook-github-bot added the CLA Signed label Feb 3, 2025
[ghstack-poisoned]

github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}2$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_simple 0.5280s 0.4439s 2.2528 Ops/s 2.1799 Ops/s $\color{#35bf28}+3.34\%$
test_transformed 1.0173s 0.9258s 1.0801 Ops/s 1.0723 Ops/s $\color{#35bf28}+0.73\%$
test_serial 1.4534s 1.3643s 0.7330 Ops/s 0.7174 Ops/s $\color{#35bf28}+2.17\%$
test_parallel 1.3143s 1.2189s 0.8204 Ops/s 0.8158 Ops/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-True-True-True-True] 0.1655ms 29.9914μs 33.3429 KOps/s 32.6156 KOps/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[True-True-True-True-False] 50.0130μs 17.5364μs 57.0241 KOps/s 55.4948 KOps/s $\color{#35bf28}+2.76\%$
test_step_mdp_speed[True-True-True-False-True] 51.3660μs 16.9959μs 58.8377 KOps/s 57.6948 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[True-True-True-False-False] 37.2800μs 9.9084μs 100.9245 KOps/s 98.9375 KOps/s $\color{#35bf28}+2.01\%$
test_step_mdp_speed[True-True-False-True-True] 70.5720μs 32.5244μs 30.7462 KOps/s 29.0365 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_step_mdp_speed[True-True-False-True-False] 60.1820μs 19.5699μs 51.0988 KOps/s 50.1858 KOps/s $\color{#35bf28}+1.82\%$
test_step_mdp_speed[True-True-False-False-True] 49.6420μs 19.0700μs 52.4383 KOps/s 51.7877 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[True-True-False-False-False] 41.5880μs 11.7569μs 85.0565 KOps/s 82.6168 KOps/s $\color{#35bf28}+2.95\%$
test_step_mdp_speed[True-False-True-True-True] 82.1730μs 34.2204μs 29.2223 KOps/s 29.0028 KOps/s $\color{#35bf28}+0.76\%$
test_step_mdp_speed[True-False-True-True-False] 53.7200μs 21.5138μs 46.4818 KOps/s 45.8692 KOps/s $\color{#35bf28}+1.34\%$
test_step_mdp_speed[True-False-True-False-True] 57.9480μs 18.8831μs 52.9573 KOps/s 51.8319 KOps/s $\color{#35bf28}+2.17\%$
test_step_mdp_speed[True-False-True-False-False] 59.8920μs 11.7616μs 85.0224 KOps/s 83.5236 KOps/s $\color{#35bf28}+1.79\%$
test_step_mdp_speed[True-False-False-True-True] 80.4410μs 35.6673μs 28.0369 KOps/s 27.7916 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[True-False-False-True-False] 0.6447ms 23.0582μs 43.3685 KOps/s 42.3805 KOps/s $\color{#35bf28}+2.33\%$
test_step_mdp_speed[True-False-False-False-True] 65.1010μs 20.5345μs 48.6986 KOps/s 47.4756 KOps/s $\color{#35bf28}+2.58\%$
test_step_mdp_speed[True-False-False-False-False] 46.1670μs 13.5191μs 73.9696 KOps/s 72.6732 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[False-True-True-True-True] 0.1264ms 33.6292μs 29.7360 KOps/s 29.0542 KOps/s $\color{#35bf28}+2.35\%$
test_step_mdp_speed[False-True-True-True-False] 53.1100μs 21.3978μs 46.7337 KOps/s 45.8231 KOps/s $\color{#35bf28}+1.99\%$
test_step_mdp_speed[False-True-True-False-True] 0.1594ms 21.9834μs 45.4889 KOps/s 46.4460 KOps/s $\color{#d91a1a}-2.06\%$
test_step_mdp_speed[False-True-True-False-False] 0.3134ms 13.8996μs 71.9447 KOps/s 75.0548 KOps/s $\color{#d91a1a}-4.14\%$
test_step_mdp_speed[False-True-False-True-True] 85.0790μs 35.8795μs 27.8711 KOps/s 27.7941 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-True-False-True-False] 58.9000μs 23.1994μs 43.1046 KOps/s 42.5182 KOps/s $\color{#35bf28}+1.38\%$
test_step_mdp_speed[False-True-False-False-True] 2.6474ms 23.7772μs 42.0571 KOps/s 42.5536 KOps/s $\color{#d91a1a}-1.17\%$
test_step_mdp_speed[False-True-False-False-False] 46.6070μs 15.0700μs 66.3570 KOps/s 65.9263 KOps/s $\color{#35bf28}+0.65\%$
test_step_mdp_speed[False-False-True-True-True] 75.3410μs 37.5269μs 26.6475 KOps/s 26.2573 KOps/s $\color{#35bf28}+1.49\%$
test_step_mdp_speed[False-False-True-True-False] 58.6790μs 24.9660μs 40.0545 KOps/s 39.5307 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[False-False-True-False-True] 77.5630μs 23.0054μs 43.4681 KOps/s 43.3955 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[False-False-True-False-False] 0.2251ms 14.9179μs 67.0334 KOps/s 66.4515 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-False-False-True-True] 0.1891ms 38.8786μs 25.7211 KOps/s 25.4837 KOps/s $\color{#35bf28}+0.93\%$
test_step_mdp_speed[False-False-False-True-False] 64.4200μs 26.2448μs 38.1028 KOps/s 37.1110 KOps/s $\color{#35bf28}+2.67\%$
test_step_mdp_speed[False-False-False-False-True] 0.5996ms 24.7775μs 40.3592 KOps/s 39.9231 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[False-False-False-False-False] 54.7920μs 16.5443μs 60.4439 KOps/s 59.4701 KOps/s $\color{#35bf28}+1.64\%$
test_values[generalized_advantage_estimate-True-True] 11.7169ms 9.7091ms 102.9960 Ops/s 99.1395 Ops/s $\color{#35bf28}+3.89\%$
test_values[vec_generalized_advantage_estimate-True-True] 26.1420ms 24.1510ms 41.4062 Ops/s 41.4314 Ops/s $\color{#d91a1a}-0.06\%$
test_values[td0_return_estimate-False-False] 0.2376ms 0.1807ms 5.5327 KOps/s 5.6015 KOps/s $\color{#d91a1a}-1.23\%$
test_values[td1_return_estimate-False-False] 24.5168ms 23.7819ms 42.0488 Ops/s 41.0858 Ops/s $\color{#35bf28}+2.34\%$
test_values[vec_td1_return_estimate-False-False] 27.1422ms 24.3556ms 41.0583 Ops/s 41.0730 Ops/s $\color{#d91a1a}-0.04\%$
test_values[td_lambda_return_estimate-True-False] 36.5843ms 34.3522ms 29.1102 Ops/s 28.7496 Ops/s $\color{#35bf28}+1.25\%$
test_values[vec_td_lambda_return_estimate-True-False] 26.0367ms 24.1875ms 41.3437 Ops/s 41.0905 Ops/s $\color{#35bf28}+0.62\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.5260ms 8.2886ms 120.6476 Ops/s 116.8496 Ops/s $\color{#35bf28}+3.25\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 3.6355ms 1.7867ms 559.7001 Ops/s 514.1188 Ops/s $\textbf{\color{#35bf28}+8.87\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4651ms 0.3574ms 2.7982 KOps/s 2.6731 KOps/s $\color{#35bf28}+4.68\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 43.9224ms 42.6397ms 23.4523 Ops/s 24.1964 Ops/s $\color{#d91a1a}-3.08\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.2648ms 3.4167ms 292.6803 Ops/s 288.8489 Ops/s $\color{#35bf28}+1.33\%$
test_dqn_speed[False-None] 1.9165ms 1.3951ms 716.7755 Ops/s 698.4939 Ops/s $\color{#35bf28}+2.62\%$
test_dqn_speed[False-backward] 1.9777ms 1.9007ms 526.1193 Ops/s 503.8316 Ops/s $\color{#35bf28}+4.42\%$
test_dqn_speed[True-None] 0.7291ms 0.4778ms 2.0928 KOps/s 2.0023 KOps/s $\color{#35bf28}+4.52\%$
test_dqn_speed[True-backward] 0.9356ms 0.8929ms 1.1199 KOps/s 830.9177 Ops/s $\textbf{\color{#35bf28}+34.78\%}$
test_dqn_speed[reduce-overhead-None] 0.6453ms 0.4752ms 2.1045 KOps/s 2.0183 KOps/s $\color{#35bf28}+4.28\%$
test_dqn_speed[reduce-overhead-backward] 0.9298ms 0.8915ms 1.1217 KOps/s 1.0349 KOps/s $\textbf{\color{#35bf28}+8.39\%}$
test_ddpg_speed[False-None] 3.2292ms 2.8882ms 346.2402 Ops/s 340.3107 Ops/s $\color{#35bf28}+1.74\%$
test_ddpg_speed[False-backward] 4.3300ms 4.0339ms 247.8993 Ops/s 241.7019 Ops/s $\color{#35bf28}+2.56\%$
test_ddpg_speed[True-None] 1.5435ms 1.2150ms 823.0322 Ops/s 794.2631 Ops/s $\color{#35bf28}+3.62\%$
test_ddpg_speed[True-backward] 2.1745ms 2.0950ms 477.3179 Ops/s 400.6880 Ops/s $\textbf{\color{#35bf28}+19.12\%}$
test_ddpg_speed[reduce-overhead-None] 1.9204ms 1.2157ms 822.5720 Ops/s 775.5283 Ops/s $\textbf{\color{#35bf28}+6.07\%}$
test_ddpg_speed[reduce-overhead-backward] 2.1466ms 2.0903ms 478.4057 Ops/s 456.1890 Ops/s $\color{#35bf28}+4.87\%$
test_sac_speed[False-None] 8.3279ms 7.9558ms 125.6946 Ops/s 122.1588 Ops/s $\color{#35bf28}+2.89\%$
test_sac_speed[False-backward] 12.4651ms 10.8074ms 92.5290 Ops/s 90.8825 Ops/s $\color{#35bf28}+1.81\%$
test_sac_speed[True-None] 2.4140ms 2.0637ms 484.5665 Ops/s 471.5867 Ops/s $\color{#35bf28}+2.75\%$
test_sac_speed[True-backward] 3.9868ms 3.7361ms 267.6622 Ops/s 261.1873 Ops/s $\color{#35bf28}+2.48\%$
test_sac_speed[reduce-overhead-None] 2.4632ms 2.0659ms 484.0447 Ops/s 472.7076 Ops/s $\color{#35bf28}+2.40\%$
test_sac_speed[reduce-overhead-backward] 3.8249ms 3.7241ms 268.5191 Ops/s 261.9923 Ops/s $\color{#35bf28}+2.49\%$
test_redq_speed[False-None] 13.6463ms 12.8230ms 77.9851 Ops/s 76.9247 Ops/s $\color{#35bf28}+1.38\%$
test_redq_speed[False-backward] 23.7997ms 22.0615ms 45.3279 Ops/s 44.4636 Ops/s $\color{#35bf28}+1.94\%$
test_redq_speed[True-None] 5.5534ms 4.8068ms 208.0395 Ops/s 202.9860 Ops/s $\color{#35bf28}+2.49\%$
test_redq_speed[True-backward] 13.8749ms 12.2918ms 81.3553 Ops/s 80.0048 Ops/s $\color{#35bf28}+1.69\%$
test_redq_speed[reduce-overhead-None] 5.7973ms 4.8115ms 207.8336 Ops/s 208.3236 Ops/s $\color{#d91a1a}-0.24\%$
test_redq_speed[reduce-overhead-backward] 14.5690ms 12.6534ms 79.0304 Ops/s 76.1614 Ops/s $\color{#35bf28}+3.77\%$
test_redq_deprec_speed[False-None] 14.8654ms 12.5704ms 79.5522 Ops/s 76.6292 Ops/s $\color{#35bf28}+3.81\%$
test_redq_deprec_speed[False-backward] 20.0676ms 18.3179ms 54.5915 Ops/s 53.0778 Ops/s $\color{#35bf28}+2.85\%$
test_redq_deprec_speed[True-None] 6.0960ms 3.8107ms 262.4182 Ops/s 255.4546 Ops/s $\color{#35bf28}+2.73\%$
test_redq_deprec_speed[True-backward] 9.0407ms 8.3834ms 119.2835 Ops/s 114.1360 Ops/s $\color{#35bf28}+4.51\%$
test_redq_deprec_speed[reduce-overhead-None] 4.4095ms 3.8071ms 262.6697 Ops/s 252.4065 Ops/s $\color{#35bf28}+4.07\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.9587ms 8.1391ms 122.8633 Ops/s 119.1489 Ops/s $\color{#35bf28}+3.12\%$
test_td3_speed[False-None] 9.5419ms 7.9540ms 125.7234 Ops/s 121.1564 Ops/s $\color{#35bf28}+3.77\%$
test_td3_speed[False-backward] 10.7921ms 10.3335ms 96.7727 Ops/s 92.2754 Ops/s $\color{#35bf28}+4.87\%$
test_td3_speed[True-None] 1.8792ms 1.7574ms 569.0124 Ops/s 544.2488 Ops/s $\color{#35bf28}+4.55\%$
test_td3_speed[True-backward] 3.4516ms 3.3424ms 299.1848 Ops/s 279.0565 Ops/s $\textbf{\color{#35bf28}+7.21\%}$
test_td3_speed[reduce-overhead-None] 1.8629ms 1.7472ms 572.3387 Ops/s 543.2105 Ops/s $\textbf{\color{#35bf28}+5.36\%}$
test_td3_speed[reduce-overhead-backward] 4.1236ms 3.3751ms 296.2889 Ops/s 288.6666 Ops/s $\color{#35bf28}+2.64\%$
test_cql_speed[False-None] 39.5914ms 36.5832ms 27.3349 Ops/s 27.1884 Ops/s $\color{#35bf28}+0.54\%$
test_cql_speed[False-backward] 47.1335ms 45.7901ms 21.8388 Ops/s 20.9917 Ops/s $\color{#35bf28}+4.04\%$
test_cql_speed[True-None] 17.2627ms 15.8777ms 62.9814 Ops/s 62.0674 Ops/s $\color{#35bf28}+1.47\%$
test_cql_speed[True-backward] 23.3352ms 22.7375ms 43.9802 Ops/s 43.2844 Ops/s $\color{#35bf28}+1.61\%$
test_cql_speed[reduce-overhead-None] 17.1142ms 16.0871ms 62.1615 Ops/s 61.6096 Ops/s $\color{#35bf28}+0.90\%$
test_cql_speed[reduce-overhead-backward] 24.3706ms 23.1383ms 43.2184 Ops/s 43.0063 Ops/s $\color{#35bf28}+0.49\%$
test_a2c_speed[False-None] 8.0580ms 7.1305ms 140.2417 Ops/s 135.7557 Ops/s $\color{#35bf28}+3.30\%$
test_a2c_speed[False-backward] 15.0655ms 14.2445ms 70.2027 Ops/s 69.3051 Ops/s $\color{#35bf28}+1.30\%$
test_a2c_speed[True-None] 4.0450ms 3.6496ms 274.0008 Ops/s 259.7354 Ops/s $\textbf{\color{#35bf28}+5.49\%}$
test_a2c_speed[True-backward] 10.5106ms 10.1057ms 98.9538 Ops/s 97.0754 Ops/s $\color{#35bf28}+1.94\%$
test_a2c_speed[reduce-overhead-None] 4.0505ms 3.6533ms 273.7243 Ops/s 265.1551 Ops/s $\color{#35bf28}+3.23\%$
test_a2c_speed[reduce-overhead-backward] 10.9001ms 10.3193ms 96.9062 Ops/s 96.1463 Ops/s $\color{#35bf28}+0.79\%$
test_ppo_speed[False-None] 9.0001ms 7.4308ms 134.5759 Ops/s 131.3987 Ops/s $\color{#35bf28}+2.42\%$
test_ppo_speed[False-backward] 15.7500ms 14.6472ms 68.2725 Ops/s 66.1037 Ops/s $\color{#35bf28}+3.28\%$
test_ppo_speed[True-None] 4.8266ms 4.0438ms 247.2946 Ops/s 240.2979 Ops/s $\color{#35bf28}+2.91\%$
test_ppo_speed[True-backward] 12.3657ms 10.3257ms 96.8457 Ops/s 99.0864 Ops/s $\color{#d91a1a}-2.26\%$
test_ppo_speed[reduce-overhead-None] 4.4640ms 4.0386ms 247.6077 Ops/s 243.8481 Ops/s $\color{#35bf28}+1.54\%$
test_ppo_speed[reduce-overhead-backward] 10.2808ms 9.9446ms 100.5572 Ops/s 98.7774 Ops/s $\color{#35bf28}+1.80\%$
test_reinforce_speed[False-None] 7.4256ms 6.5097ms 153.6164 Ops/s 149.6307 Ops/s $\color{#35bf28}+2.66\%$
test_reinforce_speed[False-backward] 11.3177ms 9.7633ms 102.4242 Ops/s 100.9192 Ops/s $\color{#35bf28}+1.49\%$
test_reinforce_speed[True-None] 3.8785ms 3.0164ms 331.5167 Ops/s 322.1971 Ops/s $\color{#35bf28}+2.89\%$
test_reinforce_speed[True-backward] 9.5710ms 9.0309ms 110.7309 Ops/s 110.3272 Ops/s $\color{#35bf28}+0.37\%$
test_reinforce_speed[reduce-overhead-None] 3.3644ms 3.0160ms 331.5694 Ops/s 321.7610 Ops/s $\color{#35bf28}+3.05\%$
test_reinforce_speed[reduce-overhead-backward] 9.5110ms 9.0667ms 110.2934 Ops/s 109.2052 Ops/s $\color{#35bf28}+1.00\%$
test_iql_speed[False-None] 33.6480ms 32.1840ms 31.0713 Ops/s 30.0109 Ops/s $\color{#35bf28}+3.53\%$
test_iql_speed[False-backward] 47.2554ms 45.3167ms 22.0669 Ops/s 21.5844 Ops/s $\color{#35bf28}+2.24\%$
test_iql_speed[True-None] 11.9615ms 11.0578ms 90.4335 Ops/s 88.1389 Ops/s $\color{#35bf28}+2.60\%$
test_iql_speed[True-backward] 26.6551ms 22.2969ms 44.8493 Ops/s 43.6829 Ops/s $\color{#35bf28}+2.67\%$
test_iql_speed[reduce-overhead-None] 12.4862ms 11.1689ms 89.5346 Ops/s 87.1793 Ops/s $\color{#35bf28}+2.70\%$
test_iql_speed[reduce-overhead-backward] 23.4801ms 22.3000ms 44.8431 Ops/s 44.8130 Ops/s $\color{#35bf28}+0.07\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.4833ms 4.8224ms 207.3670 Ops/s 204.5229 Ops/s $\color{#35bf28}+1.39\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7268ms 0.5065ms 1.9745 KOps/s 1.9255 KOps/s $\color{#35bf28}+2.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8145ms 0.4871ms 2.0528 KOps/s 2.0083 KOps/s $\color{#35bf28}+2.22\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.1262ms 4.5765ms 218.5089 Ops/s 212.5965 Ops/s $\color{#35bf28}+2.78\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.4538ms 0.5022ms 1.9911 KOps/s 1.9586 KOps/s $\color{#35bf28}+1.66\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.3448ms 0.4869ms 2.0540 KOps/s 2.0704 KOps/s $\color{#d91a1a}-0.80\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9065ms 1.6256ms 615.1599 Ops/s 597.2508 Ops/s $\color{#35bf28}+3.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3157ms 1.5501ms 645.1055 Ops/s 631.4891 Ops/s $\color{#35bf28}+2.16\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.0605ms 4.7509ms 210.4844 Ops/s 207.0203 Ops/s $\color{#35bf28}+1.67\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0439ms 0.6454ms 1.5495 KOps/s 1.4984 KOps/s $\color{#35bf28}+3.41\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9761ms 0.6262ms 1.5969 KOps/s 1.5825 KOps/s $\color{#35bf28}+0.91\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1698ms 4.6983ms 212.8441 Ops/s 214.2304 Ops/s $\color{#d91a1a}-0.65\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.8796ms 0.5165ms 1.9359 KOps/s 1.8840 KOps/s $\color{#35bf28}+2.76\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8513ms 0.5013ms 1.9948 KOps/s 2.0062 KOps/s $\color{#d91a1a}-0.57\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.8200ms 4.5374ms 220.3883 Ops/s 215.7838 Ops/s $\color{#35bf28}+2.13\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5546ms 0.5033ms 1.9868 KOps/s 1.9703 KOps/s $\color{#35bf28}+0.84\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8852ms 0.5278ms 1.8946 KOps/s 2.0030 KOps/s $\textbf{\color{#d91a1a}-5.41\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.1687ms 4.7101ms 212.3101 Ops/s 211.2691 Ops/s $\color{#35bf28}+0.49\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3666ms 0.6412ms 1.5595 KOps/s 1.5042 KOps/s $\color{#35bf28}+3.68\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0093ms 0.6234ms 1.6041 KOps/s 1.5564 KOps/s $\color{#35bf28}+3.07\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0647ms 4.2429ms 235.6889 Ops/s 242.2353 Ops/s $\color{#d91a1a}-2.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.0895ms 2.2799ms 438.6126 Ops/s 433.6568 Ops/s $\color{#35bf28}+1.14\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.4390ms 1.3102ms 763.2527 Ops/s 748.1930 Ops/s $\color{#35bf28}+2.01\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.3849ms 4.2135ms 237.3328 Ops/s 34.3246 Ops/s $\textbf{\color{#35bf28}+591.44\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.7454ms 2.2711ms 440.3157 Ops/s 370.2112 Ops/s $\textbf{\color{#35bf28}+18.94\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.6371ms 1.1466ms 872.1320 Ops/s 713.5305 Ops/s $\textbf{\color{#35bf28}+22.23\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4138s 12.6576ms 79.0041 Ops/s 219.5505 Ops/s $\textbf{\color{#d91a1a}-64.02\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.0270ms 2.4691ms 405.0090 Ops/s 381.7472 Ops/s $\textbf{\color{#35bf28}+6.09\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.0659ms 1.4205ms 703.9530 Ops/s 657.1504 Ops/s $\textbf{\color{#35bf28}+7.12\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.5431ms 11.1905ms 89.3615 Ops/s 80.1816 Ops/s $\textbf{\color{#35bf28}+11.45\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.0643ms 13.8630ms 72.1344 Ops/s 66.9868 Ops/s $\textbf{\color{#35bf28}+7.68\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.2160ms 20.0493ms 49.8770 Ops/s 46.9961 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.1227ms 14.1288ms 70.7774 Ops/s 67.6391 Ops/s $\color{#35bf28}+4.64\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.6463ms 20.0134ms 49.9666 Ops/s 47.7237 Ops/s $\color{#35bf28}+4.70\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.4962ms 15.3055ms 65.3358 Ops/s 61.4233 Ops/s $\textbf{\color{#35bf28}+6.37\%}$
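
The Change column above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column, after converting both to the same unit. The following is a minimal sketch under that assumption; the helper names `to_ops_per_sec` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
def to_ops_per_sec(value: str) -> float:
    """Parse a throughput string such as '2.2528 Ops/s' or '1.1199 KOps/s'."""
    number, unit = value.split()
    # Assumption: only these two units occur in the table.
    scale = {"Ops/s": 1.0, "KOps/s": 1e3}[unit]
    return float(number) * scale


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    new, base = to_ops_per_sec(ops), to_ops_per_sec(ops_head)
    return (new - base) / base * 100.0


# Rows taken from the CPU table: test_simple, test_dqn_speed[True-backward],
# and test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400].
print(f"{relative_change('2.2528 Ops/s', '2.1799 Ops/s'):+.2f}%")    # +3.34%
print(f"{relative_change('1.1199 KOps/s', '830.9177 Ops/s'):+.2f}%")  # +34.78%
print(f"{relative_change('79.0041 Ops/s', '219.5505 Ops/s'):+.2f}%")  # -64.02%
```

The three printed values match the Change entries reported for those rows, which is consistent with the assumed definition. Note that mixed units (KOps/s against Ops/s) must be normalized before comparing.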


github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}7$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_simple 0.8313s 0.7480s 1.3369 Ops/s 1.3669 Ops/s $\color{#d91a1a}-2.20\%$
test_transformed 1.3825s 1.2947s 0.7724 Ops/s 0.7621 Ops/s $\color{#35bf28}+1.35\%$
test_serial 2.1508s 2.1459s 0.4660 Ops/s 0.4585 Ops/s $\color{#35bf28}+1.64\%$
test_parallel 1.8696s 1.8203s 0.5493 Ops/s 0.5322 Ops/s $\color{#35bf28}+3.22\%$
test_step_mdp_speed[True-True-True-True-True] 0.2158ms 40.8513μs 24.4790 KOps/s 24.6317 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[True-True-True-True-False] 51.9210μs 23.5879μs 42.3946 KOps/s 42.1282 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[True-True-True-False-True] 52.8410μs 22.6484μs 44.1532 KOps/s 45.2601 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[True-True-True-False-False] 37.6000μs 12.8871μs 77.5968 KOps/s 74.2533 KOps/s $\color{#35bf28}+4.50\%$
test_step_mdp_speed[True-True-False-True-True] 72.0120μs 42.8286μs 23.3489 KOps/s 23.5745 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[True-True-False-True-False] 61.9010μs 25.6369μs 39.0062 KOps/s 38.5056 KOps/s $\color{#35bf28}+1.30\%$
test_step_mdp_speed[True-True-False-False-True] 0.1402ms 24.7162μs 40.4593 KOps/s 40.2722 KOps/s $\color{#35bf28}+0.46\%$
test_step_mdp_speed[True-True-False-False-False] 53.6510μs 15.4834μs 64.5854 KOps/s 64.7959 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-False-True-True-True] 78.4920μs 46.0662μs 21.7079 KOps/s 22.2276 KOps/s $\color{#d91a1a}-2.34\%$
test_step_mdp_speed[True-False-True-True-False] 54.3400μs 28.3842μs 35.2308 KOps/s 35.4610 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[True-False-True-False-True] 55.8510μs 24.4754μs 40.8573 KOps/s 40.8205 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[True-False-True-False-False] 48.4810μs 15.3364μs 65.2043 KOps/s 64.9022 KOps/s $\color{#35bf28}+0.47\%$
test_step_mdp_speed[True-False-False-True-True] 0.1012ms 47.0611μs 21.2490 KOps/s 21.6194 KOps/s $\color{#d91a1a}-1.71\%$
test_step_mdp_speed[True-False-False-True-False] 67.8910μs 29.6858μs 33.6862 KOps/s 32.6961 KOps/s $\color{#35bf28}+3.03\%$
test_step_mdp_speed[True-False-False-False-True] 53.9210μs 27.1369μs 36.8502 KOps/s 36.0909 KOps/s $\color{#35bf28}+2.10\%$
test_step_mdp_speed[True-False-False-False-False] 46.9000μs 17.6242μs 56.7403 KOps/s 56.7305 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[False-True-True-True-True] 73.3110μs 45.1208μs 22.1627 KOps/s 22.3314 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[False-True-True-True-False] 65.8710μs 27.4348μs 36.4501 KOps/s 35.1379 KOps/s $\color{#35bf28}+3.73\%$
test_step_mdp_speed[False-True-True-False-True] 2.7716ms 28.4458μs 35.1545 KOps/s 35.1137 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[False-True-True-False-False] 59.3110μs 16.8970μs 59.1820 KOps/s 60.7344 KOps/s $\color{#d91a1a}-2.56\%$
test_step_mdp_speed[False-True-False-True-True] 0.1050ms 46.6542μs 21.4343 KOps/s 21.4737 KOps/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[False-True-False-True-False] 66.8310μs 30.5545μs 32.7284 KOps/s 32.7221 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[False-True-False-False-True] 60.4710μs 30.8185μs 32.4480 KOps/s 31.9805 KOps/s $\color{#35bf28}+1.46\%$
test_step_mdp_speed[False-True-False-False-False] 45.9200μs 19.5345μs 51.1916 KOps/s 51.6054 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[False-False-True-True-True] 95.8120μs 50.2864μs 19.8861 KOps/s 20.2501 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[False-False-True-True-False] 69.3710μs 33.3610μs 29.9751 KOps/s 30.0859 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[False-False-True-False-True] 65.2010μs 30.4548μs 32.8355 KOps/s 32.5578 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[False-False-True-False-False] 44.4110μs 19.3971μs 51.5540 KOps/s 52.0613 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[False-False-False-True-True] 83.7720μs 51.4386μs 19.4407 KOps/s 19.5312 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[False-False-False-True-False] 70.2010μs 35.4118μs 28.2391 KOps/s 28.2965 KOps/s $\color{#d91a1a}-0.20\%$
test_step_mdp_speed[False-False-False-False-True] 62.8810μs 32.7256μs 30.5571 KOps/s 29.8374 KOps/s $\color{#35bf28}+2.41\%$
test_step_mdp_speed[False-False-False-False-False] 48.2710μs 21.7581μs 45.9598 KOps/s 45.7265 KOps/s $\color{#35bf28}+0.51\%$
test_values[generalized_advantage_estimate-True-True] 25.9608ms 25.3776ms 39.4048 Ops/s 38.4132 Ops/s $\color{#35bf28}+2.58\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1194s 3.2904ms 303.9176 Ops/s 306.9582 Ops/s $\color{#d91a1a}-0.99\%$
test_values[td0_return_estimate-False-False] 0.1093ms 80.3949μs 12.4386 KOps/s 12.0756 KOps/s $\color{#35bf28}+3.01\%$
test_values[td1_return_estimate-False-False] 58.5333ms 57.5459ms 17.3774 Ops/s 17.1088 Ops/s $\color{#35bf28}+1.57\%$
test_values[vec_td1_return_estimate-False-False] 1.3560ms 1.0937ms 914.3564 Ops/s 908.8864 Ops/s $\color{#35bf28}+0.60\%$
test_values[td_lambda_return_estimate-True-False] 92.0695ms 89.3434ms 11.1928 Ops/s 10.9705 Ops/s $\color{#35bf28}+2.03\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4413ms 1.0974ms 911.2411 Ops/s 914.3553 Ops/s $\color{#d91a1a}-0.34\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.5585ms 25.2365ms 39.6252 Ops/s 38.7987 Ops/s $\color{#35bf28}+2.13\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0328ms 0.7659ms 1.3056 KOps/s 1.2908 KOps/s $\color{#35bf28}+1.15\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7804ms 0.6787ms 1.4734 KOps/s 1.4600 KOps/s $\color{#35bf28}+0.92\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5427ms 1.4961ms 668.4108 Ops/s 665.2179 Ops/s $\color{#35bf28}+0.48\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7403ms 0.6948ms 1.4392 KOps/s 1.4241 KOps/s $\color{#35bf28}+1.06\%$
test_dqn_speed[False-None] 1.5679ms 1.4901ms 671.0787 Ops/s 654.4733 Ops/s $\color{#35bf28}+2.54\%$
test_dqn_speed[False-backward] 2.1507ms 2.0936ms 477.6460 Ops/s 466.2362 Ops/s $\color{#35bf28}+2.45\%$
test_dqn_speed[True-None] 0.6950ms 0.5548ms 1.8024 KOps/s 1.7389 KOps/s $\color{#35bf28}+3.66\%$
test_dqn_speed[True-backward] 1.2766ms 1.1235ms 890.0777 Ops/s 871.3882 Ops/s $\color{#35bf28}+2.14\%$
test_dqn_speed[reduce-overhead-None] 0.7215ms 0.5863ms 1.7056 KOps/s 1.7312 KOps/s $\color{#d91a1a}-1.48\%$
test_dqn_speed[reduce-overhead-backward] 0.9985ms 0.9629ms 1.0386 KOps/s 997.2581 Ops/s $\color{#35bf28}+4.14\%$
test_ddpg_speed[False-None] 3.4161ms 2.8722ms 348.1699 Ops/s 347.8137 Ops/s $\color{#35bf28}+0.10\%$
test_ddpg_speed[False-backward] 4.5960ms 4.1236ms 242.5087 Ops/s 239.3410 Ops/s $\color{#35bf28}+1.32\%$
test_ddpg_speed[True-None] 1.4241ms 1.3342ms 749.4868 Ops/s 733.2068 Ops/s $\color{#35bf28}+2.22\%$
test_ddpg_speed[True-backward] 2.4817ms 2.4355ms 410.6006 Ops/s 400.2167 Ops/s $\color{#35bf28}+2.59\%$
test_ddpg_speed[reduce-overhead-None] 1.4332ms 1.3493ms 741.1068 Ops/s 725.1153 Ops/s $\color{#35bf28}+2.21\%$
test_ddpg_speed[reduce-overhead-backward] 1.9273ms 1.8862ms 530.1650 Ops/s 520.4545 Ops/s $\color{#35bf28}+1.87\%$
test_sac_speed[False-None] 8.3530ms 7.9217ms 126.2352 Ops/s 123.0238 Ops/s $\color{#35bf28}+2.61\%$
test_sac_speed[False-backward] 11.3483ms 10.8199ms 92.4226 Ops/s 89.8443 Ops/s $\color{#35bf28}+2.87\%$
test_sac_speed[True-None] 1.9297ms 1.8354ms 544.8460 Ops/s 529.2542 Ops/s $\color{#35bf28}+2.95\%$
test_sac_speed[True-backward] 3.6352ms 3.5299ms 283.2924 Ops/s 267.1342 Ops/s $\textbf{\color{#35bf28}+6.05\%}$
test_sac_speed[reduce-overhead-None] 22.2982ms 12.0928ms 82.6940 Ops/s 81.9744 Ops/s $\color{#35bf28}+0.88\%$
test_sac_speed[reduce-overhead-backward] 1.6861ms 1.6213ms 616.7710 Ops/s 595.9398 Ops/s $\color{#35bf28}+3.50\%$
test_redq_speed[False-None] 7.8275ms 7.4350ms 134.4984 Ops/s 129.2854 Ops/s $\color{#35bf28}+4.03\%$
test_redq_speed[False-backward] 11.8315ms 11.2907ms 88.5687 Ops/s 84.8067 Ops/s $\color{#35bf28}+4.44\%$
test_redq_speed[True-None] 2.4139ms 2.2968ms 435.3827 Ops/s 420.0668 Ops/s $\color{#35bf28}+3.65\%$
test_redq_speed[True-backward] 4.1523ms 4.0307ms 248.0977 Ops/s 240.9656 Ops/s $\color{#35bf28}+2.96\%$
test_redq_speed[reduce-overhead-None] 2.3912ms 2.3172ms 431.5577 Ops/s 418.0719 Ops/s $\color{#35bf28}+3.23\%$
test_redq_speed[reduce-overhead-backward] 4.5284ms 4.0785ms 245.1863 Ops/s 237.4742 Ops/s $\color{#35bf28}+3.25\%$
test_redq_deprec_speed[False-None] 9.3338ms 9.0054ms 111.0443 Ops/s 108.4883 Ops/s $\color{#35bf28}+2.36\%$
test_redq_deprec_speed[False-backward] 12.5253ms 12.0143ms 83.2339 Ops/s 81.8270 Ops/s $\color{#35bf28}+1.72\%$
test_redq_deprec_speed[True-None] 2.8831ms 2.6459ms 377.9382 Ops/s 362.9744 Ops/s $\color{#35bf28}+4.12\%$
test_redq_deprec_speed[True-backward] 4.7914ms 4.4166ms 226.4178 Ops/s 215.7881 Ops/s $\color{#35bf28}+4.93\%$
test_redq_deprec_speed[reduce-overhead-None] 2.7194ms 2.6294ms 380.3118 Ops/s 368.3892 Ops/s $\color{#35bf28}+3.24\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.8898ms 4.4305ms 225.7091 Ops/s 215.2887 Ops/s $\color{#35bf28}+4.84\%$
test_td3_speed[False-None] 8.3466ms 8.2136ms 121.7492 Ops/s 123.9633 Ops/s $\color{#d91a1a}-1.79\%$
test_td3_speed[False-backward] 11.1691ms 10.5518ms 94.7708 Ops/s 93.7880 Ops/s $\color{#35bf28}+1.05\%$
test_td3_speed[True-None] 1.7078ms 1.6488ms 606.4856 Ops/s 558.9998 Ops/s $\textbf{\color{#35bf28}+8.49\%}$
test_td3_speed[True-backward] 3.2979ms 3.2061ms 311.9082 Ops/s 289.1305 Ops/s $\textbf{\color{#35bf28}+7.88\%}$
test_td3_speed[reduce-overhead-None] 54.9399ms 26.5453ms 37.6715 Ops/s 36.0149 Ops/s $\color{#35bf28}+4.60\%$
test_td3_speed[reduce-overhead-backward] 1.3979ms 1.3497ms 740.9109 Ops/s 648.3019 Ops/s $\textbf{\color{#35bf28}+14.28\%}$
test_cql_speed[False-None] 17.0544ms 16.5833ms 60.3017 Ops/s 58.9568 Ops/s $\color{#35bf28}+2.28\%$
test_cql_speed[False-backward] 22.2732ms 21.8098ms 45.8509 Ops/s 44.3979 Ops/s $\color{#35bf28}+3.27\%$
test_cql_speed[True-None] 3.3103ms 3.2566ms 307.0720 Ops/s 297.5984 Ops/s $\color{#35bf28}+3.18\%$
test_cql_speed[True-backward] 5.8911ms 5.7068ms 175.2300 Ops/s 175.1730 Ops/s $\color{#35bf28}+0.03\%$
test_cql_speed[reduce-overhead-None] 20.8661ms 13.0775ms 76.4674 Ops/s 57.2035 Ops/s $\textbf{\color{#35bf28}+33.68\%}$
test_cql_speed[reduce-overhead-backward] 1.9993ms 1.8353ms 544.8633 Ops/s 527.6312 Ops/s $\color{#35bf28}+3.27\%$
test_a2c_speed[False-None] 3.3174ms 3.1516ms 317.2990 Ops/s 301.2256 Ops/s $\textbf{\color{#35bf28}+5.34\%}$
test_a2c_speed[False-backward] 6.6840ms 6.0793ms 164.4934 Ops/s 159.8600 Ops/s $\color{#35bf28}+2.90\%$
test_a2c_speed[True-None] 1.5165ms 1.3486ms 741.5163 Ops/s 731.0600 Ops/s $\color{#35bf28}+1.43\%$
test_a2c_speed[True-backward] 2.9507ms 2.8775ms 347.5265 Ops/s 332.2948 Ops/s $\color{#35bf28}+4.58\%$
test_a2c_speed[reduce-overhead-None] 16.1288ms 9.1548ms 109.2327 Ops/s 111.0715 Ops/s $\color{#d91a1a}-1.66\%$
test_a2c_speed[reduce-overhead-backward] 1.5387ms 1.4554ms 687.1017 Ops/s 669.8356 Ops/s $\color{#35bf28}+2.58\%$
test_ppo_speed[False-None] 3.9017ms 3.6720ms 272.3306 Ops/s 260.5119 Ops/s $\color{#35bf28}+4.54\%$
test_ppo_speed[False-backward] 7.2726ms 6.8127ms 146.7838 Ops/s 141.7420 Ops/s $\color{#35bf28}+3.56\%$
test_ppo_speed[True-None] 1.5753ms 1.4041ms 712.1880 Ops/s 693.5880 Ops/s $\color{#35bf28}+2.68\%$
test_ppo_speed[True-backward] 3.2568ms 3.1960ms 312.8886 Ops/s 301.6015 Ops/s $\color{#35bf28}+3.74\%$
test_ppo_speed[reduce-overhead-None] 1.0484ms 0.9666ms 1.0346 KOps/s 1.0253 KOps/s $\color{#35bf28}+0.90\%$
test_ppo_speed[reduce-overhead-backward] 1.6090ms 1.5574ms 642.0778 Ops/s 679.8477 Ops/s $\textbf{\color{#d91a1a}-5.56\%}$
test_reinforce_speed[False-None] 2.3777ms 2.2496ms 444.5248 Ops/s 434.2108 Ops/s $\color{#35bf28}+2.38\%$
test_reinforce_speed[False-backward] 3.8283ms 3.4109ms 293.1778 Ops/s 297.8737 Ops/s $\color{#d91a1a}-1.58\%$
test_reinforce_speed[True-None] 1.5178ms 1.2889ms 775.8628 Ops/s 751.1774 Ops/s $\color{#35bf28}+3.29\%$
test_reinforce_speed[True-backward] 3.0503ms 2.9460ms 339.4452 Ops/s 335.1255 Ops/s $\color{#35bf28}+1.29\%$
test_reinforce_speed[reduce-overhead-None] 18.6274ms 10.1370ms 98.6483 Ops/s 100.3807 Ops/s $\color{#d91a1a}-1.73\%$
test_reinforce_speed[reduce-overhead-backward] 1.5500ms 1.4729ms 678.9275 Ops/s 654.0134 Ops/s $\color{#35bf28}+3.81\%$
test_iql_speed[False-None] 9.6092ms 9.1349ms 109.4709 Ops/s 106.2904 Ops/s $\color{#35bf28}+2.99\%$
test_iql_speed[False-backward] 13.4284ms 12.8274ms 77.9583 Ops/s 75.7448 Ops/s $\color{#35bf28}+2.92\%$
test_iql_speed[True-None] 2.4140ms 2.2452ms 445.3857 Ops/s 425.3412 Ops/s $\color{#35bf28}+4.71\%$
test_iql_speed[True-backward] 5.3573ms 4.9124ms 203.5655 Ops/s 196.4033 Ops/s $\color{#35bf28}+3.65\%$
test_iql_speed[reduce-overhead-None] 19.0835ms 11.2918ms 88.5597 Ops/s 89.7635 Ops/s $\color{#d91a1a}-1.34\%$
test_iql_speed[reduce-overhead-backward] 2.1470ms 2.0646ms 484.3582 Ops/s 495.3969 Ops/s $\color{#d91a1a}-2.23\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9984ms 6.3467ms 157.5616 Ops/s 154.9988 Ops/s $\color{#35bf28}+1.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5194ms 0.2635ms 3.7950 KOps/s 2.7855 KOps/s $\textbf{\color{#35bf28}+36.24\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6545ms 0.2545ms 3.9287 KOps/s 2.9084 KOps/s $\textbf{\color{#35bf28}+35.08\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3817ms 6.0590ms 165.0448 Ops/s 163.0012 Ops/s $\color{#35bf28}+1.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0550ms 0.2855ms 3.5021 KOps/s 3.7099 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5701ms 0.2617ms 3.8207 KOps/s 2.9502 KOps/s $\textbf{\color{#35bf28}+29.51\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4656ms 1.2702ms 787.2819 Ops/s 785.5400 Ops/s $\color{#35bf28}+0.22\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4282ms 1.1778ms 849.0230 Ops/s 754.4857 Ops/s $\textbf{\color{#35bf28}+12.53\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4018ms 6.2637ms 159.6505 Ops/s 157.5502 Ops/s $\color{#35bf28}+1.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8343ms 0.4983ms 2.0068 KOps/s 2.4463 KOps/s $\textbf{\color{#d91a1a}-17.97\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7093ms 0.4527ms 2.2092 KOps/s 2.5519 KOps/s $\textbf{\color{#d91a1a}-13.43\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2697ms 6.1367ms 162.9541 Ops/s 160.7170 Ops/s $\color{#35bf28}+1.39\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8513ms 0.2965ms 3.3725 KOps/s 3.0150 KOps/s $\textbf{\color{#35bf28}+11.86\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5491ms 0.3232ms 3.0942 KOps/s 3.2341 KOps/s $\color{#d91a1a}-4.32\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.6501ms 6.0278ms 165.8968 Ops/s 162.4359 Ops/s $\color{#35bf28}+2.13\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9411ms 0.2585ms 3.8689 KOps/s 3.1076 KOps/s $\textbf{\color{#35bf28}+24.50\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4664ms 0.2394ms 4.1775 KOps/s 3.4560 KOps/s $\textbf{\color{#35bf28}+20.88\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3074ms 6.1781ms 161.8624 Ops/s 155.9192 Ops/s $\color{#35bf28}+3.81\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9296ms 0.4048ms 2.4702 KOps/s 2.2000 KOps/s $\textbf{\color{#35bf28}+12.28\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7629ms 0.3818ms 2.6191 KOps/s 2.3846 KOps/s $\textbf{\color{#35bf28}+9.84\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1914ms 5.5383ms 180.5601 Ops/s 178.6314 Ops/s $\color{#35bf28}+1.08\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.4506ms 2.1081ms 474.3607 Ops/s 434.1587 Ops/s $\textbf{\color{#35bf28}+9.26\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0747ms 1.2709ms 786.8460 Ops/s 836.5097 Ops/s $\textbf{\color{#d91a1a}-5.94\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.2975ms 5.6600ms 176.6770 Ops/s 179.3680 Ops/s $\color{#d91a1a}-1.50\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.1102ms 2.1457ms 466.0571 Ops/s 428.8653 Ops/s $\textbf{\color{#35bf28}+8.67\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.9649ms 1.2257ms 815.8766 Ops/s 865.1947 Ops/s $\textbf{\color{#d91a1a}-5.70\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4878s 15.4176ms 64.8611 Ops/s 31.3804 Ops/s $\textbf{\color{#35bf28}+106.69\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.9183ms 2.2133ms 451.8093 Ops/s 468.7738 Ops/s $\color{#d91a1a}-3.62\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 9.2510ms 1.4521ms 688.6760 Ops/s 732.4435 Ops/s $\textbf{\color{#d91a1a}-5.98\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6970ms 13.2123ms 75.6868 Ops/s 74.3286 Ops/s $\color{#35bf28}+1.83\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.2355ms 16.7234ms 59.7966 Ops/s 58.7539 Ops/s $\color{#35bf28}+1.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.4723ms 17.7457ms 56.3515 Ops/s 55.3842 Ops/s $\color{#35bf28}+1.75\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.6106ms 16.9577ms 58.9702 Ops/s 56.7995 Ops/s $\color{#35bf28}+3.82\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.7605ms 17.5252ms 57.0607 Ops/s 54.7905 Ops/s $\color{#35bf28}+4.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.5295ms 18.4054ms 54.3320 Ops/s 52.6586 Ops/s $\color{#35bf28}+3.18\%$
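
For readers checking the table by hand: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below is an assumption inferred from the rows above, not the benchmark bot's actual code, and the name `format_change` is hypothetical. Both throughput values must be in the same unit (Ops/s or KOps/s) for the ratio to be meaningful.

```python
def format_change(ops: float, head_ops: float) -> str:
    """Return the relative throughput change of `ops` versus `head_ops`.

    Assumed formula (inferred from the table, not taken from the bot's source):
        change = (ops / head_ops - 1) * 100
    """
    if head_ops <= 0:
        raise ValueError("head_ops must be positive")
    change = (ops / head_ops - 1.0) * 100.0
    # Signed, two decimals, matching the way the table prints the column.
    return f"{change:+.2f}%"


# Values copied from rows in the table above.
print(format_change(45.8509, 44.3979))    # test_cql_speed[False-backward]
print(format_change(642.0778, 679.8477))  # test_ppo_speed[reduce-overhead-backward]
```

A positive value means the PR branch ran more operations per second than repo HEAD; a negative value means fewer.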

@vmoens added the bc breaking (backward compatibility breaking change) and Deprecation labels on Feb 4, 2025
@vmoens merged commit a5bee75 into gh/vmoens/89/base on Feb 4, 2025
27 of 50 checks passed
vmoens added a commit that referenced this pull request Feb 4, 2025
ghstack-source-id: bd34b8e9112c4fc3a30bd095e3ac073a7d0b5469
Pull Request resolved: #2746
@vmoens deleted the gh/vmoens/89/head branch on February 4, 2025 08:34