[CI] Upgrade to v0.7 #2745

Merged: 4 commits, Feb 4, 2025


[ghstack-poisoned]

pytorch-bot bot commented Feb 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2745

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the "CLA Signed" label on Feb 3, 2025
[ghstack-poisoned]

github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}34$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.5254s 0.4439s 2.2527 Ops/s 2.0946 Ops/s $\textbf{\color{#35bf28}+7.54\%}$
test_transformed 0.9760s 0.8907s 1.1228 Ops/s 1.0649 Ops/s $\textbf{\color{#35bf28}+5.44\%}$
test_serial 1.4481s 1.3672s 0.7314 Ops/s 0.7095 Ops/s $\color{#35bf28}+3.09\%$
test_parallel 1.2839s 1.2000s 0.8333 Ops/s 0.8020 Ops/s $\color{#35bf28}+3.91\%$
test_step_mdp_speed[True-True-True-True-True] 0.2331ms 29.8086μs 33.5473 KOps/s 32.3138 KOps/s $\color{#35bf28}+3.82\%$
test_step_mdp_speed[True-True-True-True-False] 50.9380μs 17.6439μs 56.6768 KOps/s 54.5078 KOps/s $\color{#35bf28}+3.98\%$
test_step_mdp_speed[True-True-True-False-True] 66.1810μs 16.7991μs 59.5271 KOps/s 57.2948 KOps/s $\color{#35bf28}+3.90\%$
test_step_mdp_speed[True-True-True-False-False] 32.8310μs 9.9810μs 100.1900 KOps/s 96.4715 KOps/s $\color{#35bf28}+3.85\%$
test_step_mdp_speed[True-True-False-True-True] 85.8600μs 31.6874μs 31.5583 KOps/s 28.9157 KOps/s $\textbf{\color{#35bf28}+9.14\%}$
test_step_mdp_speed[True-True-False-True-False] 59.6510μs 19.4688μs 51.3643 KOps/s 49.6763 KOps/s $\color{#35bf28}+3.40\%$
test_step_mdp_speed[True-True-False-False-True] 68.0080μs 18.6435μs 53.6380 KOps/s 51.7566 KOps/s $\color{#35bf28}+3.64\%$
test_step_mdp_speed[True-True-False-False-False] 30.4070μs 11.7296μs 85.2543 KOps/s 81.5029 KOps/s $\color{#35bf28}+4.60\%$
test_step_mdp_speed[True-False-True-True-True] 86.6530μs 33.7230μs 29.6533 KOps/s 28.6977 KOps/s $\color{#35bf28}+3.33\%$
test_step_mdp_speed[True-False-True-True-False] 77.2950μs 21.2165μs 47.1331 KOps/s 45.5691 KOps/s $\color{#35bf28}+3.43\%$
test_step_mdp_speed[True-False-True-False-True] 42.3300μs 18.5496μs 53.9094 KOps/s 52.1089 KOps/s $\color{#35bf28}+3.46\%$
test_step_mdp_speed[True-False-True-False-False] 73.6910μs 11.7073μs 85.4165 KOps/s 81.8709 KOps/s $\color{#35bf28}+4.33\%$
test_step_mdp_speed[True-False-False-True-True] 81.4630μs 34.9661μs 28.5991 KOps/s 27.6366 KOps/s $\color{#35bf28}+3.48\%$
test_step_mdp_speed[True-False-False-True-False] 57.8390μs 23.0503μs 43.3834 KOps/s 42.2249 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[True-False-False-False-True] 64.5780μs 20.3304μs 49.1874 KOps/s 48.0165 KOps/s $\color{#35bf28}+2.44\%$
test_step_mdp_speed[True-False-False-False-False] 37.1500μs 13.4668μs 74.2566 KOps/s 67.8143 KOps/s $\textbf{\color{#35bf28}+9.50\%}$
test_step_mdp_speed[False-True-True-True-True] 0.5668ms 33.4321μs 29.9114 KOps/s 28.9448 KOps/s $\color{#35bf28}+3.34\%$
test_step_mdp_speed[False-True-True-True-False] 75.0000μs 21.2770μs 46.9991 KOps/s 44.7254 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_step_mdp_speed[False-True-True-False-True] 95.0580μs 21.5165μs 46.4760 KOps/s 45.3278 KOps/s $\color{#35bf28}+2.53\%$
test_step_mdp_speed[False-True-True-False-False] 61.5960μs 13.1711μs 75.9237 KOps/s 72.7917 KOps/s $\color{#35bf28}+4.30\%$
test_step_mdp_speed[False-True-False-True-True] 72.7570μs 35.0865μs 28.5010 KOps/s 27.8267 KOps/s $\color{#35bf28}+2.42\%$
test_step_mdp_speed[False-True-False-True-False] 79.5320μs 22.9140μs 43.6414 KOps/s 42.5486 KOps/s $\color{#35bf28}+2.57\%$
test_step_mdp_speed[False-True-False-False-True] 2.6331ms 23.0079μs 43.4632 KOps/s 42.0478 KOps/s $\color{#35bf28}+3.37\%$
test_step_mdp_speed[False-True-False-False-False] 44.7940μs 14.9112μs 67.0635 KOps/s 64.9472 KOps/s $\color{#35bf28}+3.26\%$
test_step_mdp_speed[False-False-True-True-True] 89.1240μs 36.8230μs 27.1570 KOps/s 26.2100 KOps/s $\color{#35bf28}+3.61\%$
test_step_mdp_speed[False-False-True-True-False] 76.6660μs 24.5448μs 40.7418 KOps/s 39.0616 KOps/s $\color{#35bf28}+4.30\%$
test_step_mdp_speed[False-False-True-False-True] 55.8350μs 23.0955μs 43.2984 KOps/s 42.1151 KOps/s $\color{#35bf28}+2.81\%$
test_step_mdp_speed[False-False-True-False-False] 40.7660μs 14.7815μs 67.6521 KOps/s 64.6575 KOps/s $\color{#35bf28}+4.63\%$
test_step_mdp_speed[False-False-False-True-True] 0.1228ms 38.3462μs 26.0782 KOps/s 24.3947 KOps/s $\textbf{\color{#35bf28}+6.90\%}$
test_step_mdp_speed[False-False-False-True-False] 74.6200μs 26.5782μs 37.6248 KOps/s 36.7921 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[False-False-False-False-True] 74.5890μs 24.7014μs 40.4835 KOps/s 39.3922 KOps/s $\color{#35bf28}+2.77\%$
test_step_mdp_speed[False-False-False-False-False] 44.2920μs 16.4572μs 60.7638 KOps/s 58.7172 KOps/s $\color{#35bf28}+3.49\%$
test_values[generalized_advantage_estimate-True-True] 11.0591ms 9.6501ms 103.6261 Ops/s 100.3304 Ops/s $\color{#35bf28}+3.28\%$
test_values[vec_generalized_advantage_estimate-True-True] 26.1179ms 23.8272ms 41.9689 Ops/s 38.4989 Ops/s $\textbf{\color{#35bf28}+9.01\%}$
test_values[td0_return_estimate-False-False] 0.2353ms 0.1735ms 5.7629 KOps/s 5.2575 KOps/s $\textbf{\color{#35bf28}+9.61\%}$
test_values[td1_return_estimate-False-False] 27.2499ms 23.9680ms 41.7223 Ops/s 40.8059 Ops/s $\color{#35bf28}+2.25\%$
test_values[vec_td1_return_estimate-False-False] 25.8753ms 24.0679ms 41.5491 Ops/s 37.9660 Ops/s $\textbf{\color{#35bf28}+9.44\%}$
test_values[td_lambda_return_estimate-True-False] 35.8623ms 34.1269ms 29.3024 Ops/s 27.8058 Ops/s $\textbf{\color{#35bf28}+5.38\%}$
test_values[vec_td_lambda_return_estimate-True-False] 25.4094ms 23.9362ms 41.7777 Ops/s 38.4223 Ops/s $\textbf{\color{#35bf28}+8.73\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.9268ms 8.4023ms 119.0152 Ops/s 116.5101 Ops/s $\color{#35bf28}+2.15\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3526ms 1.9712ms 507.3180 Ops/s 526.4818 Ops/s $\color{#d91a1a}-3.64\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5029ms 0.3487ms 2.8682 KOps/s 2.6437 KOps/s $\textbf{\color{#35bf28}+8.49\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.2844ms 41.1170ms 24.3209 Ops/s 22.3359 Ops/s $\textbf{\color{#35bf28}+8.89\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.2837ms 3.4162ms 292.7233 Ops/s 284.8988 Ops/s $\color{#35bf28}+2.75\%$
test_dqn_speed[False-None] 5.8654ms 1.3562ms 737.3630 Ops/s 703.9991 Ops/s $\color{#35bf28}+4.74\%$
test_dqn_speed[False-backward] 1.8876ms 1.8154ms 550.8413 Ops/s 529.6458 Ops/s $\color{#35bf28}+4.00\%$
test_dqn_speed[True-None] 0.7429ms 0.4708ms 2.1240 KOps/s 2.0569 KOps/s $\color{#35bf28}+3.26\%$
test_dqn_speed[True-backward] 0.9509ms 0.8856ms 1.1292 KOps/s 809.6939 Ops/s $\textbf{\color{#35bf28}+39.46\%}$
test_dqn_speed[reduce-overhead-None] 0.7526ms 0.4800ms 2.0834 KOps/s 2.0434 KOps/s $\color{#35bf28}+1.96\%$
test_dqn_speed[reduce-overhead-backward] 0.9467ms 0.8905ms 1.1229 KOps/s 1.0008 KOps/s $\textbf{\color{#35bf28}+12.21\%}$
test_ddpg_speed[False-None] 3.2311ms 2.8137ms 355.4027 Ops/s 343.1125 Ops/s $\color{#35bf28}+3.58\%$
test_ddpg_speed[False-backward] 4.0459ms 3.9277ms 254.5989 Ops/s 247.2694 Ops/s $\color{#35bf28}+2.96\%$
test_ddpg_speed[True-None] 1.6360ms 1.2064ms 828.9080 Ops/s 806.1881 Ops/s $\color{#35bf28}+2.82\%$
test_ddpg_speed[True-backward] 2.1946ms 2.0992ms 476.3811 Ops/s 464.2253 Ops/s $\color{#35bf28}+2.62\%$
test_ddpg_speed[reduce-overhead-None] 1.9567ms 1.2176ms 821.3210 Ops/s 793.7415 Ops/s $\color{#35bf28}+3.47\%$
test_ddpg_speed[reduce-overhead-backward] 2.1515ms 2.0848ms 479.6635 Ops/s 452.0493 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_sac_speed[False-None] 8.7628ms 7.7445ms 129.1233 Ops/s 124.6113 Ops/s $\color{#35bf28}+3.62\%$
test_sac_speed[False-backward] 12.6482ms 10.8970ms 91.7683 Ops/s 92.4972 Ops/s $\color{#d91a1a}-0.79\%$
test_sac_speed[True-None] 2.7308ms 2.0964ms 476.9981 Ops/s 468.3258 Ops/s $\color{#35bf28}+1.85\%$
test_sac_speed[True-backward] 4.0289ms 3.7471ms 266.8729 Ops/s 259.6995 Ops/s $\color{#35bf28}+2.76\%$
test_sac_speed[reduce-overhead-None] 2.8794ms 2.1663ms 461.6201 Ops/s 469.7475 Ops/s $\color{#d91a1a}-1.73\%$
test_sac_speed[reduce-overhead-backward] 4.0426ms 3.7523ms 266.5006 Ops/s 259.7143 Ops/s $\color{#35bf28}+2.61\%$
test_redq_speed[False-None] 14.2300ms 12.5819ms 79.4791 Ops/s 74.8614 Ops/s $\textbf{\color{#35bf28}+6.17\%}$
test_redq_speed[False-backward] 22.1609ms 21.6592ms 46.1698 Ops/s 43.9790 Ops/s $\color{#35bf28}+4.98\%$
test_redq_speed[True-None] 5.7303ms 4.7588ms 210.1390 Ops/s 171.6166 Ops/s $\textbf{\color{#35bf28}+22.45\%}$
test_redq_speed[True-backward] 12.8819ms 12.1668ms 82.1908 Ops/s 74.1534 Ops/s $\textbf{\color{#35bf28}+10.84\%}$
test_redq_speed[reduce-overhead-None] 5.5738ms 4.7664ms 209.8026 Ops/s 181.6959 Ops/s $\textbf{\color{#35bf28}+15.47\%}$
test_redq_speed[reduce-overhead-backward] 12.7561ms 12.0306ms 83.1217 Ops/s 81.4517 Ops/s $\color{#35bf28}+2.05\%$
test_redq_deprec_speed[False-None] 14.6017ms 12.4499ms 80.3221 Ops/s 76.8356 Ops/s $\color{#35bf28}+4.54\%$
test_redq_deprec_speed[False-backward] 20.0356ms 18.3099ms 54.6152 Ops/s 52.7918 Ops/s $\color{#35bf28}+3.45\%$
test_redq_deprec_speed[True-None] 6.8652ms 3.8080ms 262.6045 Ops/s 260.9034 Ops/s $\color{#35bf28}+0.65\%$
test_redq_deprec_speed[True-backward] 10.1473ms 8.8451ms 113.0565 Ops/s 121.1085 Ops/s $\textbf{\color{#d91a1a}-6.65\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.7376ms 3.8182ms 261.9016 Ops/s 259.4054 Ops/s $\color{#35bf28}+0.96\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.6181ms 8.3238ms 120.1368 Ops/s 110.2838 Ops/s $\textbf{\color{#35bf28}+8.93\%}$
test_td3_speed[False-None] 8.3528ms 7.9312ms 126.0841 Ops/s 121.6332 Ops/s $\color{#35bf28}+3.66\%$
test_td3_speed[False-backward] 11.2704ms 10.3454ms 96.6612 Ops/s 94.9874 Ops/s $\color{#35bf28}+1.76\%$
test_td3_speed[True-None] 1.9592ms 1.7630ms 567.2116 Ops/s 554.5484 Ops/s $\color{#35bf28}+2.28\%$
test_td3_speed[True-backward] 3.5727ms 3.3669ms 297.0100 Ops/s 279.0011 Ops/s $\textbf{\color{#35bf28}+6.45\%}$
test_td3_speed[reduce-overhead-None] 1.9332ms 1.7469ms 572.4483 Ops/s 551.7789 Ops/s $\color{#35bf28}+3.75\%$
test_td3_speed[reduce-overhead-backward] 3.4269ms 3.3231ms 300.9247 Ops/s 280.9194 Ops/s $\textbf{\color{#35bf28}+7.12\%}$
test_cql_speed[False-None] 37.0378ms 35.5116ms 28.1598 Ops/s 26.4720 Ops/s $\textbf{\color{#35bf28}+6.38\%}$
test_cql_speed[False-backward] 46.9705ms 45.4388ms 22.0076 Ops/s 21.0305 Ops/s $\color{#35bf28}+4.65\%$
test_cql_speed[True-None] 17.0303ms 15.7712ms 63.4066 Ops/s 61.6603 Ops/s $\color{#35bf28}+2.83\%$
test_cql_speed[True-backward] 24.7522ms 23.2098ms 43.0852 Ops/s 42.1721 Ops/s $\color{#35bf28}+2.17\%$
test_cql_speed[reduce-overhead-None] 17.4957ms 16.1880ms 61.7743 Ops/s 60.5579 Ops/s $\color{#35bf28}+2.01\%$
test_cql_speed[reduce-overhead-backward] 24.0691ms 22.6713ms 44.1086 Ops/s 43.4712 Ops/s $\color{#35bf28}+1.47\%$
test_a2c_speed[False-None] 8.6445ms 7.1774ms 139.3263 Ops/s 137.1193 Ops/s $\color{#35bf28}+1.61\%$
test_a2c_speed[False-backward] 15.6498ms 14.3767ms 69.5570 Ops/s 66.6147 Ops/s $\color{#35bf28}+4.42\%$
test_a2c_speed[True-None] 4.0068ms 3.7040ms 269.9758 Ops/s 258.7114 Ops/s $\color{#35bf28}+4.35\%$
test_a2c_speed[True-backward] 11.6051ms 11.0556ms 90.4519 Ops/s 97.7384 Ops/s $\textbf{\color{#d91a1a}-7.46\%}$
test_a2c_speed[reduce-overhead-None] 4.1137ms 3.7581ms 266.0952 Ops/s 261.9654 Ops/s $\color{#35bf28}+1.58\%$
test_a2c_speed[reduce-overhead-backward] 11.2432ms 10.8584ms 92.0945 Ops/s 96.6996 Ops/s $\color{#d91a1a}-4.76\%$
test_ppo_speed[False-None] 9.2786ms 7.8286ms 127.7369 Ops/s 131.8385 Ops/s $\color{#d91a1a}-3.11\%$
test_ppo_speed[False-backward] 15.9632ms 15.3838ms 65.0033 Ops/s 66.9888 Ops/s $\color{#d91a1a}-2.96\%$
test_ppo_speed[True-None] 5.1557ms 4.3423ms 230.2939 Ops/s 243.1451 Ops/s $\textbf{\color{#d91a1a}-5.29\%}$
test_ppo_speed[True-backward] 11.2540ms 10.6523ms 93.8762 Ops/s 99.6600 Ops/s $\textbf{\color{#d91a1a}-5.80\%}$
test_ppo_speed[reduce-overhead-None] 5.0165ms 4.1595ms 240.4137 Ops/s 242.5578 Ops/s $\color{#d91a1a}-0.88\%$
test_ppo_speed[reduce-overhead-backward] 11.0867ms 10.7143ms 93.3336 Ops/s 99.0237 Ops/s $\textbf{\color{#d91a1a}-5.75\%}$
test_reinforce_speed[False-None] 7.2915ms 6.7012ms 149.2277 Ops/s 143.3273 Ops/s $\color{#35bf28}+4.12\%$
test_reinforce_speed[False-backward] 10.8435ms 10.1698ms 98.3305 Ops/s 96.7823 Ops/s $\color{#35bf28}+1.60\%$
test_reinforce_speed[True-None] 3.5721ms 3.2264ms 309.9446 Ops/s 322.2755 Ops/s $\color{#d91a1a}-3.83\%$
test_reinforce_speed[True-backward] 10.5935ms 9.6180ms 103.9712 Ops/s 108.4169 Ops/s $\color{#d91a1a}-4.10\%$
test_reinforce_speed[reduce-overhead-None] 3.5531ms 3.1483ms 317.6280 Ops/s 319.6585 Ops/s $\color{#d91a1a}-0.64\%$
test_reinforce_speed[reduce-overhead-backward] 10.6530ms 9.5949ms 104.2220 Ops/s 109.4651 Ops/s $\color{#d91a1a}-4.79\%$
test_iql_speed[False-None] 0.3027s 42.4631ms 23.5498 Ops/s 30.2751 Ops/s $\textbf{\color{#d91a1a}-22.21\%}$
test_iql_speed[False-backward] 49.1893ms 46.6239ms 21.4482 Ops/s 21.3710 Ops/s $\color{#35bf28}+0.36\%$
test_iql_speed[True-None] 12.3190ms 11.7071ms 85.4185 Ops/s 87.8542 Ops/s $\color{#d91a1a}-2.77\%$
test_iql_speed[True-backward] 24.1112ms 23.3707ms 42.7887 Ops/s 44.1111 Ops/s $\color{#d91a1a}-3.00\%$
test_iql_speed[reduce-overhead-None] 12.8476ms 11.5811ms 86.3472 Ops/s 86.7950 Ops/s $\color{#d91a1a}-0.52\%$
test_iql_speed[reduce-overhead-backward] 23.1820ms 21.9583ms 45.5410 Ops/s 45.1294 Ops/s $\color{#35bf28}+0.91\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9563ms 4.8139ms 207.7317 Ops/s 202.0935 Ops/s $\color{#35bf28}+2.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8252ms 0.5091ms 1.9643 KOps/s 1.9316 KOps/s $\color{#35bf28}+1.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.3381ms 0.5317ms 1.8806 KOps/s 2.0286 KOps/s $\textbf{\color{#d91a1a}-7.29\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.5961ms 4.5318ms 220.6618 Ops/s 212.7086 Ops/s $\color{#35bf28}+3.74\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1563ms 0.4996ms 2.0017 KOps/s 1.9529 KOps/s $\color{#35bf28}+2.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7879ms 0.4807ms 2.0803 KOps/s 2.0762 KOps/s $\color{#35bf28}+0.20\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3772ms 1.6207ms 617.0226 Ops/s 598.2505 Ops/s $\color{#35bf28}+3.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.9732ms 1.5318ms 652.8306 Ops/s 627.5678 Ops/s $\color{#35bf28}+4.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.0669ms 4.6995ms 212.7907 Ops/s 204.0973 Ops/s $\color{#35bf28}+4.26\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0422ms 0.6416ms 1.5585 KOps/s 1.4867 KOps/s $\color{#35bf28}+4.83\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.2441ms 0.6204ms 1.6119 KOps/s 1.5400 KOps/s $\color{#35bf28}+4.67\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.5914ms 4.6178ms 216.5533 Ops/s 198.6210 Ops/s $\textbf{\color{#35bf28}+9.03\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8601ms 0.5140ms 1.9455 KOps/s 1.8584 KOps/s $\color{#35bf28}+4.69\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6616ms 0.4812ms 2.0781 KOps/s 1.9912 KOps/s $\color{#35bf28}+4.37\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1031ms 4.5104ms 221.7098 Ops/s 202.3634 Ops/s $\textbf{\color{#35bf28}+9.56\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8310ms 0.4934ms 2.0267 KOps/s 1.9147 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8019ms 0.4836ms 2.0679 KOps/s 2.0934 KOps/s $\color{#d91a1a}-1.22\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.2941ms 4.6956ms 212.9676 Ops/s 207.7615 Ops/s $\color{#35bf28}+2.51\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9869ms 0.6716ms 1.4890 KOps/s 1.5099 KOps/s $\color{#d91a1a}-1.38\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9920ms 0.6439ms 1.5529 KOps/s 1.5717 KOps/s $\color{#d91a1a}-1.19\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.6055ms 4.1073ms 243.4676 Ops/s 246.3267 Ops/s $\color{#d91a1a}-1.16\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.9627ms 2.3369ms 427.9265 Ops/s 427.3754 Ops/s $\color{#35bf28}+0.13\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.7042ms 1.2257ms 815.8375 Ops/s 734.3005 Ops/s $\textbf{\color{#35bf28}+11.10\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.5406ms 4.2192ms 237.0142 Ops/s 35.4066 Ops/s $\textbf{\color{#35bf28}+569.41\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.3676ms 2.2727ms 439.9967 Ops/s 417.7665 Ops/s $\textbf{\color{#35bf28}+5.32\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.0959ms 1.3308ms 751.4453 Ops/s 724.7209 Ops/s $\color{#35bf28}+3.69\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4033s 12.3187ms 81.1772 Ops/s 225.0357 Ops/s $\textbf{\color{#d91a1a}-63.93\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.8349ms 2.4911ms 401.4239 Ops/s 405.6970 Ops/s $\color{#d91a1a}-1.05\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.9457ms 1.4436ms 692.6951 Ops/s 701.6698 Ops/s $\color{#d91a1a}-1.28\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.6765ms 11.2385ms 88.9799 Ops/s 79.1863 Ops/s $\textbf{\color{#35bf28}+12.37\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.6979ms 13.9734ms 71.5647 Ops/s 69.9064 Ops/s $\color{#35bf28}+2.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.7985ms 20.1802ms 49.5536 Ops/s 47.2498 Ops/s $\color{#35bf28}+4.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.1071ms 14.0779ms 71.0331 Ops/s 66.9293 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.5144ms 20.0805ms 49.7994 Ops/s 47.1308 Ops/s $\textbf{\color{#35bf28}+5.66\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.0622ms 15.5139ms 64.4584 Ops/s 60.4313 Ops/s $\textbf{\color{#35bf28}+6.66\%}$
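
For readers checking the rows above by hand: the Ops column appears to be the reciprocal of the Mean column, and Change appears to be the relative difference between Ops and Ops on Repo HEAD. The sketch below reproduces two rows under that assumption. The function names are hypothetical and are not part of the benchmark tooling, and because the table prints rounded values, the recomputed figures can differ from the printed ones in the last digit.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime, in operations per second."""
    return 1.0 / mean_seconds


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


# test_simple: Mean 0.4439s, Ops 2.2527 Ops/s, HEAD 2.0946 Ops/s, Change +7.54%
simple_ops = ops_per_second(0.4439)
simple_change = percent_change(2.2527, 2.0946)

# test_dqn_speed[True-backward]: 1.1292 KOps/s versus 809.6939 Ops/s, Change +39.46%.
# The two columns use different units here, so convert KOps/s to Ops/s first.
dqn_change = percent_change(1.1292 * 1000.0, 809.6939)

print(f"test_simple: {simple_ops:.4f} Ops/s, {simple_change:+.2f}%")
print(f"test_dqn_speed[True-backward]: {dqn_change:+.2f}%")
```

Note the unit switch between Ops/s and KOps/s within a single row: comparing the raw numbers without converting would give a meaningless percentage.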


github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}37$. Worsened: $\large\color{#d91a1a}4$.
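
The Improved and Worsened counts seem to tally only the entries rendered in bold, which in these tables are the changes larger than 5% in magnitude. In the CPU results, for instance, exactly eight entries are bold red, matching "Worsened: 8". The sketch below applies that rule to a few rows from the GPU table. The 5% threshold and the `classify` helper are assumptions inferred from the formatting, not taken from the benchmark workflow itself.

```python
from collections import Counter

# Assumed threshold, inferred from which entries are bold in the tables.
THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a change as improved, worsened, or unchanged (within threshold)."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# A handful of Change values copied from the GPU table.
gpu_changes = {
    "test_simple": -3.03,
    "test_dqn_speed[True-backward]": 12.77,
    "test_step_mdp_speed[False-True-False-False-True]": -5.29,
    "test_td3_speed[True-backward]": 4.99,
    "test_cql_speed[False-backward]": 5.00,
    "test_step_mdp_speed[False-False-False-True-False]": -4.97,
    "test_values[vec_generalized_advantage_estimate-True-True]": -6.60,
}

summary = Counter(classify(change) for change in gpu_changes.values())
print(dict(summary))
```

The comparison is strict, so the `+5.00%` entry for `test_cql_speed[False-backward]` counts as unchanged, consistent with it not being bold in the table.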

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.8441s 0.7522s 1.3295 Ops/s 1.3711 Ops/s $\color{#d91a1a}-3.03\%$
test_transformed 1.4311s 1.3426s 0.7448 Ops/s 0.7286 Ops/s $\color{#35bf28}+2.23\%$
test_serial 2.1717s 2.1687s 0.4611 Ops/s 0.4568 Ops/s $\color{#35bf28}+0.94\%$
test_parallel 1.8584s 1.8343s 0.5452 Ops/s 0.5372 Ops/s $\color{#35bf28}+1.49\%$
test_step_mdp_speed[True-True-True-True-True] 0.1578ms 40.1093μs 24.9319 KOps/s 24.8080 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[True-True-True-True-False] 0.1467ms 23.1848μs 43.1317 KOps/s 42.4949 KOps/s $\color{#35bf28}+1.50\%$
test_step_mdp_speed[True-True-True-False-True] 0.1524ms 22.3886μs 44.6655 KOps/s 44.5037 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-True-True-False-False] 41.4510μs 12.9463μs 77.2421 KOps/s 76.2462 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[True-True-False-True-True] 0.1270ms 42.7007μs 23.4188 KOps/s 23.4292 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[True-True-False-True-False] 73.7010μs 26.1632μs 38.2216 KOps/s 38.5127 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[True-True-False-False-True] 55.9010μs 25.1648μs 39.7381 KOps/s 40.2439 KOps/s $\color{#d91a1a}-1.26\%$
test_step_mdp_speed[True-True-False-False-False] 60.3010μs 15.5898μs 64.1445 KOps/s 64.3263 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[True-False-True-True-True] 92.6520μs 45.4731μs 21.9910 KOps/s 22.7594 KOps/s $\color{#d91a1a}-3.38\%$
test_step_mdp_speed[True-False-True-True-False] 76.6910μs 28.2056μs 35.4539 KOps/s 35.4536 KOps/s $+0.00\%$
test_step_mdp_speed[True-False-True-False-True] 0.1584ms 24.8292μs 40.2752 KOps/s 40.8079 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[True-False-True-False-False] 41.9910μs 15.6874μs 63.7456 KOps/s 64.3849 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[True-False-False-True-True] 0.2219ms 47.7369μs 20.9482 KOps/s 21.1391 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[True-False-False-True-False] 86.0410μs 31.2461μs 32.0040 KOps/s 32.7154 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[True-False-False-False-True] 0.2029ms 27.4993μs 36.3646 KOps/s 37.4978 KOps/s $\color{#d91a1a}-3.02\%$
test_step_mdp_speed[True-False-False-False-False] 0.2057ms 17.7899μs 56.2118 KOps/s 58.6353 KOps/s $\color{#d91a1a}-4.13\%$
test_step_mdp_speed[False-True-True-True-True] 0.2240ms 45.3431μs 22.0541 KOps/s 22.3072 KOps/s $\color{#d91a1a}-1.13\%$
test_step_mdp_speed[False-True-True-True-False] 0.2123ms 28.6262μs 34.9330 KOps/s 35.3330 KOps/s $\color{#d91a1a}-1.13\%$
test_step_mdp_speed[False-True-True-False-True] 2.7151ms 29.3032μs 34.1260 KOps/s 34.6669 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[False-True-True-False-False] 0.1450ms 17.2253μs 58.0540 KOps/s 58.4387 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-True-False-True-True] 81.1810μs 47.5370μs 21.0362 KOps/s 21.1279 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[False-True-False-True-False] 60.8410μs 31.1018μs 32.1524 KOps/s 32.7699 KOps/s $\color{#d91a1a}-1.88\%$
test_step_mdp_speed[False-True-False-False-True] 63.1010μs 31.4361μs 31.8106 KOps/s 33.5872 KOps/s $\textbf{\color{#d91a1a}-5.29\%}$
test_step_mdp_speed[False-True-False-False-False] 54.8110μs 19.6666μs 50.8477 KOps/s 51.8305 KOps/s $\color{#d91a1a}-1.90\%$
test_step_mdp_speed[False-False-True-True-True] 91.7120μs 50.1473μs 19.9413 KOps/s 20.0299 KOps/s $\color{#d91a1a}-0.44\%$
test_step_mdp_speed[False-False-True-True-False] 82.1220μs 33.0346μs 30.2713 KOps/s 30.0581 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[False-False-True-False-True] 74.7910μs 31.2432μs 32.0070 KOps/s 32.4285 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-False-True-False-False] 68.4020μs 19.8060μs 50.4898 KOps/s 52.8270 KOps/s $\color{#d91a1a}-4.42\%$
test_step_mdp_speed[False-False-False-True-True] 0.1026ms 52.0842μs 19.1997 KOps/s 19.7494 KOps/s $\color{#d91a1a}-2.78\%$
test_step_mdp_speed[False-False-False-True-False] 68.0410μs 35.5357μs 28.1407 KOps/s 29.6127 KOps/s $\color{#d91a1a}-4.97\%$
test_step_mdp_speed[False-False-False-False-True] 84.0420μs 33.0409μs 30.2655 KOps/s 31.1308 KOps/s $\color{#d91a1a}-2.78\%$
test_step_mdp_speed[False-False-False-False-False] 63.0710μs 21.8631μs 45.7392 KOps/s 46.1339 KOps/s $\color{#d91a1a}-0.86\%$
test_values[generalized_advantage_estimate-True-True] 25.4062ms 25.0982ms 39.8435 Ops/s 39.9513 Ops/s $\color{#d91a1a}-0.27\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1244s 3.3920ms 294.8148 Ops/s 315.6380 Ops/s $\textbf{\color{#d91a1a}-6.60\%}$
test_values[td0_return_estimate-False-False] 0.1130ms 80.7630μs 12.3819 KOps/s 12.1674 KOps/s $\color{#35bf28}+1.76\%$
test_values[td1_return_estimate-False-False] 56.4206ms 55.9503ms 17.8730 Ops/s 17.9254 Ops/s $\color{#d91a1a}-0.29\%$
test_values[vec_td1_return_estimate-False-False] 1.3115ms 1.0888ms 918.4617 Ops/s 918.3990 Ops/s $+0.01\%$
test_values[td_lambda_return_estimate-True-False] 88.9123ms 88.5336ms 11.2951 Ops/s 11.3272 Ops/s $\color{#d91a1a}-0.28\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4558ms 1.1001ms 909.0062 Ops/s 923.0489 Ops/s $\color{#d91a1a}-1.52\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.3682ms 25.0790ms 39.8740 Ops/s 40.0986 Ops/s $\color{#d91a1a}-0.56\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0290ms 0.7605ms 1.3149 KOps/s 1.3112 KOps/s $\color{#35bf28}+0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8505ms 0.6764ms 1.4785 KOps/s 1.4850 KOps/s $\color{#d91a1a}-0.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6810ms 1.4969ms 668.0589 Ops/s 671.6138 Ops/s $\color{#d91a1a}-0.53\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8462ms 0.6911ms 1.4470 KOps/s 1.4552 KOps/s $\color{#d91a1a}-0.56\%$
test_dqn_speed[False-None] 1.8226ms 1.5335ms 652.0901 Ops/s 642.7138 Ops/s $\color{#35bf28}+1.46\%$
test_dqn_speed[False-backward] 2.2903ms 2.1600ms 462.9669 Ops/s 461.6137 Ops/s $\color{#35bf28}+0.29\%$
test_dqn_speed[True-None] 0.7381ms 0.5573ms 1.7943 KOps/s 1.7449 KOps/s $\color{#35bf28}+2.83\%$
test_dqn_speed[True-backward] 1.1780ms 1.1235ms 890.0877 Ops/s 789.3191 Ops/s $\textbf{\color{#35bf28}+12.77\%}$
test_dqn_speed[reduce-overhead-None] 0.7666ms 0.5752ms 1.7386 KOps/s 1.7451 KOps/s $\color{#d91a1a}-0.37\%$
test_dqn_speed[reduce-overhead-backward] 1.1313ms 0.9672ms 1.0340 KOps/s 926.9376 Ops/s $\textbf{\color{#35bf28}+11.55\%}$
test_ddpg_speed[False-None] 3.2077ms 2.9046ms 344.2815 Ops/s 340.1494 Ops/s $\color{#35bf28}+1.21\%$
test_ddpg_speed[False-backward] 4.3923ms 4.1500ms 240.9639 Ops/s 231.8789 Ops/s $\color{#35bf28}+3.92\%$
test_ddpg_speed[True-None] 1.7502ms 1.4060ms 711.2306 Ops/s 740.3675 Ops/s $\color{#d91a1a}-3.94\%$
test_ddpg_speed[True-backward] 2.6197ms 2.4417ms 409.5447 Ops/s 381.4025 Ops/s $\textbf{\color{#35bf28}+7.38\%}$
test_ddpg_speed[reduce-overhead-None] 1.5888ms 1.3576ms 736.5688 Ops/s 731.0774 Ops/s $\color{#35bf28}+0.75\%$
test_ddpg_speed[reduce-overhead-backward] 2.0834ms 1.9011ms 526.0030 Ops/s 487.6945 Ops/s $\textbf{\color{#35bf28}+7.86\%}$
test_sac_speed[False-None] 8.4529ms 8.0808ms 123.7503 Ops/s 121.2782 Ops/s $\color{#35bf28}+2.04\%$
test_sac_speed[False-backward] 11.4511ms 10.9994ms 90.9138 Ops/s 87.8690 Ops/s $\color{#35bf28}+3.47\%$
test_sac_speed[True-None] 2.1319ms 1.8466ms 541.5237 Ops/s 532.5124 Ops/s $\color{#35bf28}+1.69\%$
test_sac_speed[True-backward] 3.7413ms 3.5677ms 280.2894 Ops/s 271.4665 Ops/s $\color{#35bf28}+3.25\%$
test_sac_speed[reduce-overhead-None] 24.8811ms 12.1243ms 82.4789 Ops/s 79.3284 Ops/s $\color{#35bf28}+3.97\%$
test_sac_speed[reduce-overhead-backward] 1.7572ms 1.6237ms 615.8939 Ops/s 541.0583 Ops/s $\textbf{\color{#35bf28}+13.83\%}$
test_redq_speed[False-None] 8.0579ms 7.5997ms 131.5844 Ops/s 129.3524 Ops/s $\color{#35bf28}+1.73\%$
test_redq_speed[False-backward] 12.0682ms 11.4528ms 87.3149 Ops/s 83.1796 Ops/s $\color{#35bf28}+4.97\%$
test_redq_speed[True-None] 2.6528ms 2.3200ms 431.0325 Ops/s 425.7757 Ops/s $\color{#35bf28}+1.23\%$
test_redq_speed[True-backward] 4.3450ms 4.0435ms 247.3101 Ops/s 232.3616 Ops/s $\textbf{\color{#35bf28}+6.43\%}$
test_redq_speed[reduce-overhead-None] 2.6254ms 2.3272ms 429.7081 Ops/s 421.5120 Ops/s $\color{#35bf28}+1.94\%$
test_redq_speed[reduce-overhead-backward] 4.3174ms 4.0673ms 245.8620 Ops/s 242.3963 Ops/s $\color{#35bf28}+1.43\%$
test_redq_deprec_speed[False-None] 9.5440ms 9.1351ms 109.4677 Ops/s 107.0025 Ops/s $\color{#35bf28}+2.30\%$
test_redq_deprec_speed[False-backward] 12.5658ms 12.0632ms 82.8964 Ops/s 81.4273 Ops/s $\color{#35bf28}+1.80\%$
test_redq_deprec_speed[True-None] 3.0459ms 2.6624ms 375.5994 Ops/s 361.6306 Ops/s $\color{#35bf28}+3.86\%$
test_redq_deprec_speed[True-backward] 4.7480ms 4.3368ms 230.5849 Ops/s 215.8841 Ops/s $\textbf{\color{#35bf28}+6.81\%}$
test_redq_deprec_speed[reduce-overhead-None] 2.9576ms 2.6490ms 377.4999 Ops/s 360.7866 Ops/s $\color{#35bf28}+4.63\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.7703ms 4.3465ms 230.0723 Ops/s 219.8779 Ops/s $\color{#35bf28}+4.64\%$
test_td3_speed[False-None] 8.4628ms 8.1279ms 123.0335 Ops/s 122.6422 Ops/s $\color{#35bf28}+0.32\%$
test_td3_speed[False-backward] 11.1396ms 10.3282ms 96.8223 Ops/s 93.1890 Ops/s $\color{#35bf28}+3.90\%$
test_td3_speed[True-None] 1.6921ms 1.6611ms 601.9976 Ops/s 580.8377 Ops/s $\color{#35bf28}+3.64\%$
test_td3_speed[True-backward] 3.3656ms 3.2109ms 311.4393 Ops/s 296.6429 Ops/s $\color{#35bf28}+4.99\%$
test_td3_speed[reduce-overhead-None] 55.2658ms 26.6855ms 37.4736 Ops/s 35.2657 Ops/s $\textbf{\color{#35bf28}+6.26\%}$
test_td3_speed[reduce-overhead-backward] 1.4041ms 1.3475ms 742.1043 Ops/s 668.0368 Ops/s $\textbf{\color{#35bf28}+11.09\%}$
test_cql_speed[False-None] 17.4065ms 16.8908ms 59.2038 Ops/s 58.0534 Ops/s $\color{#35bf28}+1.98\%$
test_cql_speed[False-backward] 22.6340ms 22.0232ms 45.4067 Ops/s 43.2460 Ops/s $\color{#35bf28}+5.00\%$
test_cql_speed[True-None] 3.4563ms 3.2876ms 304.1721 Ops/s 299.3649 Ops/s $\color{#35bf28}+1.61\%$
test_cql_speed[True-backward] 5.8375ms 5.5144ms 181.3450 Ops/s 175.4611 Ops/s $\color{#35bf28}+3.35\%$
test_cql_speed[reduce-overhead-None] 21.4314ms 13.3672ms 74.8102 Ops/s 56.5445 Ops/s $\textbf{\color{#35bf28}+32.30\%}$
test_cql_speed[reduce-overhead-backward] 2.1428ms 1.8365ms 544.5243 Ops/s 488.9572 Ops/s $\textbf{\color{#35bf28}+11.36\%}$
test_a2c_speed[False-None] 3.5345ms 3.2076ms 311.7624 Ops/s 307.1936 Ops/s $\color{#35bf28}+1.49\%$
test_a2c_speed[False-backward] 6.7213ms 6.1048ms 163.8052 Ops/s 155.0553 Ops/s $\textbf{\color{#35bf28}+5.64\%}$
test_a2c_speed[True-None] 1.5700ms 1.3528ms 739.2147 Ops/s 729.3161 Ops/s $\color{#35bf28}+1.36\%$
test_a2c_speed[True-backward] 3.0317ms 2.9033ms 344.4400 Ops/s 322.8105 Ops/s $\textbf{\color{#35bf28}+6.70\%}$
test_a2c_speed[reduce-overhead-None] 16.0462ms 9.0827ms 110.0993 Ops/s 107.3969 Ops/s $\color{#35bf28}+2.52\%$
test_a2c_speed[reduce-overhead-backward] 1.5934ms 1.4706ms 680.0154 Ops/s 609.0101 Ops/s $\textbf{\color{#35bf28}+11.66\%}$
test_ppo_speed[False-None] 4.0261ms 3.7191ms 268.8847 Ops/s 264.8621 Ops/s $\color{#35bf28}+1.52\%$
test_ppo_speed[False-backward] 7.2792ms 6.8477ms 146.0353 Ops/s 140.5904 Ops/s $\color{#35bf28}+3.87\%$
test_ppo_speed[True-None] 1.6413ms 1.4254ms 701.5789 Ops/s 695.8521 Ops/s $\color{#35bf28}+0.82\%$
test_ppo_speed[True-backward] 3.3318ms 3.0651ms 326.2576 Ops/s 305.8726 Ops/s $\textbf{\color{#35bf28}+6.66\%}$
test_ppo_speed[reduce-overhead-None] 1.1848ms 0.9792ms 1.0212 KOps/s 1.0126 KOps/s $\color{#35bf28}+0.85\%$
test_ppo_speed[reduce-overhead-backward] 1.5660ms 1.4180ms 705.2074 Ops/s 615.6676 Ops/s $\textbf{\color{#35bf28}+14.54\%}$
test_reinforce_speed[False-None] 2.4911ms 2.3040ms 434.0237 Ops/s 428.9504 Ops/s $\color{#35bf28}+1.18\%$
test_reinforce_speed[False-backward] 3.9461ms 3.3106ms 302.0587 Ops/s 289.6333 Ops/s $\color{#35bf28}+4.29\%$
test_reinforce_speed[True-None] 1.6061ms 1.3019ms 768.1033 Ops/s 750.1849 Ops/s $\color{#35bf28}+2.39\%$
test_reinforce_speed[True-backward] 3.0780ms 2.9356ms 340.6495 Ops/s 324.6402 Ops/s $\color{#35bf28}+4.93\%$
test_reinforce_speed[reduce-overhead-None] 17.9722ms 9.9949ms 100.0508 Ops/s 97.3606 Ops/s $\color{#35bf28}+2.76\%$
test_reinforce_speed[reduce-overhead-backward] 1.6262ms 1.4802ms 675.5790 Ops/s 598.7922 Ops/s $\textbf{\color{#35bf28}+12.82\%}$
test_iql_speed[False-None] 10.0442ms 9.2868ms 107.6792 Ops/s 105.1115 Ops/s $\color{#35bf28}+2.44\%$
test_iql_speed[False-backward] 13.4925ms 12.9125ms 77.4446 Ops/s 74.0599 Ops/s $\color{#35bf28}+4.57\%$
test_iql_speed[True-None] 2.7570ms 2.2530ms 443.8523 Ops/s 431.2539 Ops/s $\color{#35bf28}+2.92\%$
test_iql_speed[True-backward] 5.0278ms 4.7795ms 209.2256 Ops/s 197.8368 Ops/s $\textbf{\color{#35bf28}+5.76\%}$
test_iql_speed[reduce-overhead-None] 18.8473ms 11.1872ms 89.3881 Ops/s 86.9748 Ops/s $\color{#35bf28}+2.77\%$
test_iql_speed[reduce-overhead-backward] 2.0469ms 1.9030ms 525.4775 Ops/s 458.0618 Ops/s $\textbf{\color{#35bf28}+14.72\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9815ms 6.3401ms 157.7253 Ops/s 153.6226 Ops/s $\color{#35bf28}+2.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5116ms 0.2696ms 3.7090 KOps/s 2.9823 KOps/s $\textbf{\color{#35bf28}+24.36\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4942ms 0.2493ms 4.0105 KOps/s 3.1578 KOps/s $\textbf{\color{#35bf28}+27.00\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6626ms 6.0741ms 164.6332 Ops/s 162.4106 Ops/s $\color{#35bf28}+1.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7859ms 0.3143ms 3.1814 KOps/s 2.8136 KOps/s $\textbf{\color{#35bf28}+13.07\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4627ms 0.2465ms 4.0562 KOps/s 3.3347 KOps/s $\textbf{\color{#35bf28}+21.64\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6119ms 1.3712ms 729.2623 Ops/s 702.0881 Ops/s $\color{#35bf28}+3.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4160ms 1.1929ms 838.2847 Ops/s 750.9689 Ops/s $\textbf{\color{#35bf28}+11.63\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.6366ms 6.2827ms 159.1670 Ops/s 156.8467 Ops/s $\color{#35bf28}+1.48\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1567ms 0.4671ms 2.1409 KOps/s 2.3393 KOps/s $\textbf{\color{#d91a1a}-8.48\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7367ms 0.4336ms 2.3064 KOps/s 2.3227 KOps/s $\color{#d91a1a}-0.71\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4347ms 6.0763ms 164.5746 Ops/s 160.2605 Ops/s $\color{#35bf28}+2.69\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8379ms 0.2850ms 3.5091 KOps/s 3.0840 KOps/s $\textbf{\color{#35bf28}+13.78\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4946ms 0.2836ms 3.5263 KOps/s 3.1622 KOps/s $\textbf{\color{#35bf28}+11.51\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4832ms 6.0163ms 166.2146 Ops/s 163.3466 Ops/s $\color{#35bf28}+1.76\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6778ms 0.3311ms 3.0199 KOps/s 2.6198 KOps/s $\textbf{\color{#35bf28}+15.27\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5578ms 0.3034ms 3.2960 KOps/s 2.8636 KOps/s $\textbf{\color{#35bf28}+15.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5028ms 6.1985ms 161.3281 Ops/s 155.8192 Ops/s $\color{#35bf28}+3.54\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8768ms 0.4176ms 2.3947 KOps/s 2.1616 KOps/s $\textbf{\color{#35bf28}+10.78\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5686ms 0.3991ms 2.5055 KOps/s 2.3078 KOps/s $\textbf{\color{#35bf28}+8.57\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1426ms 5.5407ms 180.4819 Ops/s 177.1084 Ops/s $\color{#35bf28}+1.90\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.1384ms 2.0907ms 478.3043 Ops/s 429.2655 Ops/s $\textbf{\color{#35bf28}+11.42\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0651ms 1.2362ms 808.9293 Ops/s 852.5024 Ops/s $\textbf{\color{#d91a1a}-5.11\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1727ms 5.6686ms 176.4108 Ops/s 177.6470 Ops/s $\color{#d91a1a}-0.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.2500ms 2.1238ms 470.8460 Ops/s 457.7862 Ops/s $\color{#35bf28}+2.85\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.4999ms 1.1699ms 854.7457 Ops/s 764.8384 Ops/s $\textbf{\color{#35bf28}+11.76\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4983s 15.7017ms 63.6874 Ops/s 31.1157 Ops/s $\textbf{\color{#35bf28}+104.68\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.0738ms 1.8768ms 532.8136 Ops/s 442.2045 Ops/s $\textbf{\color{#35bf28}+20.49\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2243ms 1.1956ms 836.3939 Ops/s 703.3057 Ops/s $\textbf{\color{#35bf28}+18.92\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.2584ms 12.9275ms 77.3542 Ops/s 73.0578 Ops/s $\textbf{\color{#35bf28}+5.88\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.1795ms 16.6964ms 59.8931 Ops/s 59.8442 Ops/s $\color{#35bf28}+0.08\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 17.8126ms 17.5785ms 56.8878 Ops/s 52.2760 Ops/s $\textbf{\color{#35bf28}+8.82\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.3409ms 16.9657ms 58.9426 Ops/s 57.9152 Ops/s $\color{#35bf28}+1.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.1539ms 17.5530ms 56.9702 Ops/s 55.3565 Ops/s $\color{#35bf28}+2.92\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.3977ms 18.2462ms 54.8061 Ops/s 54.1141 Ops/s $\color{#35bf28}+1.28\%$
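
The Change column appears to be the relative difference between the "Ops" value for this PR and the "Ops on Repo HEAD" value. A minimal sketch of that arithmetic, assuming this is how the benchmark bot derives it (the function name `relative_change` is hypothetical and not part of the benchmark tooling):

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Assumption: the Change column is (Ops - Ops on Repo HEAD) / Ops on Repo HEAD.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from the test_cql_speed[reduce-overhead-None] row above.
change = relative_change(74.8102, 56.5445)
print(f"{change:+.2f}%")
```

For that row the result agrees with the reported +32.30% to two decimal places.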

@vmoens vmoens merged commit 332485c into gh/vmoens/88/base Feb 4, 2025
27 of 50 checks passed
vmoens added a commit that referenced this pull request Feb 4, 2025
ghstack-source-id: e548bbbb4578d44a8eee000ab0a40c89713afc27
Pull Request resolved: #2745
@vmoens vmoens deleted the gh/vmoens/88/head branch February 4, 2025 08:34