[BugFix] NonTensor should not convert anything to numpy #2771

Merged (7 commits) on Feb 10, 2025


[ghstack-poisoned]

pytorch-bot bot commented Feb 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2771

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 1ea6daf2e1253a5db5ef163b85ff8810b84fd19e
Pull Request resolved: #2771
facebook-github-bot added the "CLA Signed" label Feb 7, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 6006ff9b7edde96f785a599e30945c1c2d53fa97
Pull Request resolved: #2771
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 10a19e08499c21b6fbbe55133619a18624d01678
Pull Request resolved: #2771

github-actions bot commented Feb 7, 2025

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 4. Worsened: 4.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.5837s | 0.5019s | 1.9924 Ops/s | 2.2196 Ops/s | **-10.24%** |
| test_transformed | 1.0599s | 0.9761s | 1.0245 Ops/s | 1.1068 Ops/s | **-7.44%** |
| test_serial | 1.5913s | 1.5194s | 0.6582 Ops/s | 0.7337 Ops/s | **-10.29%** |
| test_parallel | 1.3432s | 1.2602s | 0.7935 Ops/s | 0.8194 Ops/s | -3.16% |
| test_step_mdp_speed[True-True-True-True-True] | 0.2176ms | 30.7612μs | 32.5085 KOps/s | 31.6143 KOps/s | +2.83% |
| test_step_mdp_speed[True-True-True-True-False] | 64.7400μs | 18.1829μs | 54.9967 KOps/s | 53.0104 KOps/s | +3.75% |
| test_step_mdp_speed[True-True-True-False-True] | 49.3220μs | 17.3461μs | 57.6500 KOps/s | 58.2822 KOps/s | -1.08% |
| test_step_mdp_speed[True-True-True-False-False] | 39.9350μs | 10.1742μs | 98.2875 KOps/s | 98.3436 KOps/s | -0.06% |
| test_step_mdp_speed[True-True-False-True-True] | 74.4190μs | 33.0643μs | 30.2441 KOps/s | 31.2307 KOps/s | -3.16% |
| test_step_mdp_speed[True-True-False-True-False] | 48.7200μs | 20.3965μs | 49.0280 KOps/s | 51.0091 KOps/s | -3.88% |
| test_step_mdp_speed[True-True-False-False-True] | 45.7960μs | 19.4896μs | 51.3094 KOps/s | 53.0576 KOps/s | -3.29% |
| test_step_mdp_speed[True-True-False-False-False] | 54.0510μs | 12.1810μs | 82.0953 KOps/s | 83.4244 KOps/s | -1.59% |
| test_step_mdp_speed[True-False-True-True-True] | 85.2990μs | 35.1586μs | 28.4426 KOps/s | 29.2476 KOps/s | -2.75% |
| test_step_mdp_speed[True-False-True-True-False] | 57.9680μs | 22.1330μs | 45.1814 KOps/s | 46.4603 KOps/s | -2.75% |
| test_step_mdp_speed[True-False-True-False-True] | 76.3020μs | 19.4881μs | 51.3135 KOps/s | 52.6811 KOps/s | -2.60% |
| test_step_mdp_speed[True-False-True-False-False] | 45.0340μs | 12.2692μs | 81.5046 KOps/s | 83.5003 KOps/s | -2.39% |
| test_step_mdp_speed[True-False-False-True-True] | 88.5650μs | 36.5990μs | 27.3232 KOps/s | 27.9379 KOps/s | -2.20% |
| test_step_mdp_speed[True-False-False-True-False] | 70.2890μs | 23.8844μs | 41.8683 KOps/s | 42.7594 KOps/s | -2.08% |
| test_step_mdp_speed[True-False-False-False-True] | 53.6100μs | 21.2627μs | 47.0306 KOps/s | 47.6911 KOps/s | -1.38% |
| test_step_mdp_speed[True-False-False-False-False] | 64.1590μs | 14.1620μs | 70.6117 KOps/s | 72.9509 KOps/s | -3.21% |
| test_step_mdp_speed[False-True-True-True-True] | 87.5730μs | 35.2369μs | 28.3793 KOps/s | 28.8567 KOps/s | -1.65% |
| test_step_mdp_speed[False-True-True-True-False] | 50.9850μs | 22.2221μs | 45.0002 KOps/s | 45.5131 KOps/s | -1.13% |
| test_step_mdp_speed[False-True-True-False-True] | 76.3020μs | 22.5896μs | 44.2681 KOps/s | 45.3094 KOps/s | -2.30% |
| test_step_mdp_speed[False-True-True-False-False] | 35.0650μs | 13.7308μs | 72.8288 KOps/s | 73.7609 KOps/s | -1.26% |
| test_step_mdp_speed[False-True-False-True-True] | 91.0000μs | 36.4474μs | 27.4368 KOps/s | 27.8950 KOps/s | -1.64% |
| test_step_mdp_speed[False-True-False-True-False] | 76.4020μs | 24.1295μs | 41.4430 KOps/s | 42.6569 KOps/s | -2.85% |
| test_step_mdp_speed[False-True-False-False-True] | 3.0127ms | 24.6394μs | 40.5854 KOps/s | 41.7059 KOps/s | -2.69% |
| test_step_mdp_speed[False-True-False-False-False] | 60.7030μs | 15.6621μs | 63.8486 KOps/s | 65.1523 KOps/s | -2.00% |
| test_step_mdp_speed[False-False-True-True-True] | 76.7920μs | 38.5387μs | 25.9479 KOps/s | 26.1705 KOps/s | -0.85% |
| test_step_mdp_speed[False-False-True-True-False] | 75.1690μs | 25.7913μs | 38.7728 KOps/s | 39.3940 KOps/s | -1.58% |
| test_step_mdp_speed[False-False-True-False-True] | 71.2620μs | 24.5728μs | 40.6954 KOps/s | 42.3394 KOps/s | -3.88% |
| test_step_mdp_speed[False-False-True-False-False] | 46.6470μs | 15.6023μs | 64.0932 KOps/s | 65.5762 KOps/s | -2.26% |
| test_step_mdp_speed[False-False-False-True-True] | 76.7430μs | 40.4003μs | 24.7523 KOps/s | 25.5356 KOps/s | -3.07% |
| test_step_mdp_speed[False-False-False-True-False] | 74.9600μs | 27.6993μs | 36.1020 KOps/s | 37.0815 KOps/s | -2.64% |
| test_step_mdp_speed[False-False-False-False-True] | 59.5310μs | 25.8922μs | 38.6217 KOps/s | 39.5695 KOps/s | -2.40% |
| test_step_mdp_speed[False-False-False-False-False] | 62.7870μs | 17.2431μs | 57.9941 KOps/s | 59.9580 KOps/s | -3.28% |
| test_values[generalized_advantage_estimate-True-True] | 10.3222ms | 9.8761ms | 101.2548 Ops/s | 104.7741 Ops/s | -3.36% |
| test_values[vec_generalized_advantage_estimate-True-True] | 26.4938ms | 23.9639ms | 41.7294 Ops/s | 41.3979 Ops/s | +0.80% |
| test_values[td0_return_estimate-False-False] | 0.2257ms | 0.1748ms | 5.7216 KOps/s | 5.5105 KOps/s | +3.83% |
| test_values[td1_return_estimate-False-False] | 27.1378ms | 24.6648ms | 40.5436 Ops/s | 40.7116 Ops/s | -0.41% |
| test_values[vec_td1_return_estimate-False-False] | 26.9320ms | 24.2659ms | 41.2101 Ops/s | 41.3035 Ops/s | -0.23% |
| test_values[td_lambda_return_estimate-True-False] | 37.1964ms | 35.1562ms | 28.4445 Ops/s | 28.7193 Ops/s | -0.96% |
| test_values[vec_td_lambda_return_estimate-True-False] | 25.8456ms | 24.2047ms | 41.3142 Ops/s | 40.9903 Ops/s | +0.79% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 11.7242ms | 8.6738ms | 115.2900 Ops/s | 120.6847 Ops/s | -4.47% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.2315ms | 1.9772ms | 505.7636 Ops/s | 517.9140 Ops/s | -2.35% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4403ms | 0.3650ms | 2.7397 KOps/s | 2.7427 KOps/s | -0.11% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 43.7684ms | 42.9358ms | 23.2906 Ops/s | 22.7657 Ops/s | +2.31% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.9942ms | 3.5128ms | 284.6771 Ops/s | 283.3596 Ops/s | +0.46% |
| test_dqn_speed[False-None] | 5.4350ms | 1.4147ms | 706.8459 Ops/s | 710.3088 Ops/s | -0.49% |
| test_dqn_speed[False-backward] | 1.9884ms | 1.9017ms | 525.8535 Ops/s | 525.7916 Ops/s | +0.01% |
| test_dqn_speed[True-None] | 0.6953ms | 0.4814ms | 2.0774 KOps/s | 2.0664 KOps/s | +0.53% |
| test_dqn_speed[True-backward] | 0.9877ms | 0.9025ms | 1.1081 KOps/s | 1.0948 KOps/s | +1.21% |
| test_dqn_speed[reduce-overhead-None] | 0.6363ms | 0.4797ms | 2.0847 KOps/s | 2.0515 KOps/s | +1.62% |
| test_dqn_speed[reduce-overhead-backward] | 0.9724ms | 0.9089ms | 1.1002 KOps/s | 1.1110 KOps/s | -0.97% |
| test_ddpg_speed[False-None] | 4.2110ms | 2.9075ms | 343.9322 Ops/s | 348.1706 Ops/s | -1.22% |
| test_ddpg_speed[False-backward] | 4.1778ms | 4.0310ms | 248.0786 Ops/s | 247.5594 Ops/s | +0.21% |
| test_ddpg_speed[True-None] | 1.4432ms | 1.2298ms | 813.1717 Ops/s | 806.8796 Ops/s | +0.78% |
| test_ddpg_speed[True-backward] | 2.1578ms | 2.1157ms | 472.6556 Ops/s | 434.1089 Ops/s | **+8.88%** |
| test_ddpg_speed[reduce-overhead-None] | 1.7515ms | 1.2377ms | 807.9259 Ops/s | 805.4062 Ops/s | +0.31% |
| test_ddpg_speed[reduce-overhead-backward] | 2.4017ms | 2.1320ms | 469.0329 Ops/s | 470.1091 Ops/s | -0.23% |
| test_sac_speed[False-None] | 9.4771ms | 8.0581ms | 124.0985 Ops/s | 125.0107 Ops/s | -0.73% |
| test_sac_speed[False-backward] | 12.0708ms | 10.8611ms | 92.0717 Ops/s | 93.0438 Ops/s | -1.04% |
| test_sac_speed[True-None] | 2.7283ms | 2.1066ms | 474.6879 Ops/s | 477.8744 Ops/s | -0.67% |
| test_sac_speed[True-backward] | 3.8155ms | 3.7621ms | 265.8068 Ops/s | 264.4066 Ops/s | +0.53% |
| test_sac_speed[reduce-overhead-None] | 2.3781ms | 2.0937ms | 477.6154 Ops/s | 474.3972 Ops/s | +0.68% |
| test_sac_speed[reduce-overhead-backward] | 4.2847ms | 3.7923ms | 263.6917 Ops/s | 266.7722 Ops/s | -1.15% |
| test_redq_speed[False-None] | 14.4753ms | 12.8721ms | 77.6872 Ops/s | 79.1062 Ops/s | -1.79% |
| test_redq_speed[False-backward] | 23.5680ms | 22.2016ms | 45.0418 Ops/s | 45.8815 Ops/s | -1.83% |
| test_redq_speed[True-None] | 5.5121ms | 4.8560ms | 205.9308 Ops/s | 207.6995 Ops/s | -0.85% |
| test_redq_speed[True-backward] | 12.8117ms | 12.1784ms | 82.1125 Ops/s | 82.3255 Ops/s | -0.26% |
| test_redq_speed[reduce-overhead-None] | 5.4918ms | 4.7755ms | 209.4002 Ops/s | 210.5939 Ops/s | -0.57% |
| test_redq_speed[reduce-overhead-backward] | 12.7645ms | 12.1085ms | 82.5863 Ops/s | 83.1245 Ops/s | -0.65% |
| test_redq_deprec_speed[False-None] | 13.7822ms | 12.6991ms | 78.7455 Ops/s | 79.5181 Ops/s | -0.97% |
| test_redq_deprec_speed[False-backward] | 19.9080ms | 18.5318ms | 53.9612 Ops/s | 55.5975 Ops/s | -2.94% |
| test_redq_deprec_speed[True-None] | 4.3899ms | 3.8415ms | 260.3161 Ops/s | 261.9025 Ops/s | -0.61% |
| test_redq_deprec_speed[True-backward] | 8.9690ms | 8.2119ms | 121.7747 Ops/s | 122.7846 Ops/s | -0.82% |
| test_redq_deprec_speed[reduce-overhead-None] | 4.7472ms | 3.8422ms | 260.2692 Ops/s | 262.2231 Ops/s | -0.75% |
| test_redq_deprec_speed[reduce-overhead-backward] | 8.7442ms | 8.2274ms | 121.5449 Ops/s | 121.2832 Ops/s | +0.22% |
| test_td3_speed[False-None] | 8.3512ms | 8.0602ms | 124.0658 Ops/s | 123.8500 Ops/s | +0.17% |
| test_td3_speed[False-backward] | 10.8592ms | 10.4779ms | 95.4385 Ops/s | 96.3328 Ops/s | -0.93% |
| test_td3_speed[True-None] | 2.0768ms | 1.7964ms | 556.6617 Ops/s | 556.1314 Ops/s | +0.10% |
| test_td3_speed[True-backward] | 3.4507ms | 3.3768ms | 296.1413 Ops/s | 289.6863 Ops/s | +2.23% |
| test_td3_speed[reduce-overhead-None] | 2.4223ms | 1.7974ms | 556.3469 Ops/s | 553.9591 Ops/s | +0.43% |
| test_td3_speed[reduce-overhead-backward] | 4.3011ms | 3.4100ms | 293.2546 Ops/s | 294.3745 Ops/s | -0.38% |
| test_cql_speed[False-None] | 40.2823ms | 36.4416ms | 27.4412 Ops/s | 27.2736 Ops/s | +0.61% |
| test_cql_speed[False-backward] | 51.8034ms | 47.0800ms | 21.2404 Ops/s | 21.6321 Ops/s | -1.81% |
| test_cql_speed[True-None] | 17.6490ms | 16.0711ms | 62.2233 Ops/s | 61.5922 Ops/s | +1.02% |
| test_cql_speed[True-backward] | 24.2013ms | 22.5923ms | 44.2628 Ops/s | 44.4056 Ops/s | -0.32% |
| test_cql_speed[reduce-overhead-None] | 17.9566ms | 16.2194ms | 61.6547 Ops/s | 62.4691 Ops/s | -1.30% |
| test_cql_speed[reduce-overhead-backward] | 23.9005ms | 22.6920ms | 44.0683 Ops/s | 44.0578 Ops/s | +0.02% |
| test_a2c_speed[False-None] | 8.1276ms | 7.2312ms | 138.2900 Ops/s | 139.2614 Ops/s | -0.70% |
| test_a2c_speed[False-backward] | 17.0346ms | 14.4479ms | 69.2142 Ops/s | 70.7425 Ops/s | -2.16% |
| test_a2c_speed[True-None] | 4.2799ms | 3.7236ms | 268.5574 Ops/s | 269.2092 Ops/s | -0.24% |
| test_a2c_speed[True-backward] | 11.2668ms | 10.0963ms | 99.0464 Ops/s | 98.0595 Ops/s | +1.01% |
| test_a2c_speed[reduce-overhead-None] | 4.2251ms | 3.7376ms | 267.5529 Ops/s | 268.9603 Ops/s | -0.52% |
| test_a2c_speed[reduce-overhead-backward] | 11.5981ms | 10.2727ms | 97.3450 Ops/s | 98.9362 Ops/s | -1.61% |
| test_ppo_speed[False-None] | 8.5209ms | 7.5448ms | 132.5409 Ops/s | 134.0039 Ops/s | -1.09% |
| test_ppo_speed[False-backward] | 16.2348ms | 14.7690ms | 67.7093 Ops/s | 67.9919 Ops/s | -0.42% |
| test_ppo_speed[True-None] | 4.6015ms | 4.1103ms | 243.2892 Ops/s | 243.2952 Ops/s | -0.00% |
| test_ppo_speed[True-backward] | 10.9513ms | 9.9524ms | 100.4784 Ops/s | 100.7241 Ops/s | -0.24% |
| test_ppo_speed[reduce-overhead-None] | 4.6914ms | 4.1238ms | 242.4965 Ops/s | 243.4553 Ops/s | -0.39% |
| test_ppo_speed[reduce-overhead-backward] | 10.7339ms | 10.1469ms | 98.5523 Ops/s | 100.4318 Ops/s | -1.87% |
| test_reinforce_speed[False-None] | 7.9195ms | 6.5922ms | 151.6943 Ops/s | 152.8587 Ops/s | -0.76% |
| test_reinforce_speed[False-backward] | 10.1136ms | 9.8639ms | 101.3797 Ops/s | 102.8890 Ops/s | -1.47% |
| test_reinforce_speed[True-None] | 7.1584ms | 3.1135ms | 321.1869 Ops/s | 326.6854 Ops/s | -1.68% |
| test_reinforce_speed[True-backward] | 9.8762ms | 8.9322ms | 111.9543 Ops/s | 112.1803 Ops/s | -0.20% |
| test_reinforce_speed[reduce-overhead-None] | 3.5829ms | 3.0692ms | 325.8142 Ops/s | 323.8816 Ops/s | +0.60% |
| test_reinforce_speed[reduce-overhead-backward] | 9.8202ms | 9.0276ms | 110.7711 Ops/s | 111.3825 Ops/s | -0.55% |
| test_iql_speed[False-None] | 35.0137ms | 32.8561ms | 30.4358 Ops/s | 30.7959 Ops/s | -1.17% |
| test_iql_speed[False-backward] | 48.2954ms | 45.5493ms | 21.9542 Ops/s | 22.1778 Ops/s | -1.01% |
| test_iql_speed[True-None] | 12.7213ms | 11.2139ms | 89.1751 Ops/s | 91.2367 Ops/s | -2.26% |
| test_iql_speed[True-backward] | 23.5957ms | 22.1685ms | 45.1091 Ops/s | 46.3824 Ops/s | -2.75% |
| test_iql_speed[reduce-overhead-None] | 12.6110ms | 11.3395ms | 88.1876 Ops/s | 90.5044 Ops/s | -2.56% |
| test_iql_speed[reduce-overhead-backward] | 23.7751ms | 21.9861ms | 45.4832 Ops/s | 45.6025 Ops/s | -0.26% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.2591ms | 4.8627ms | 205.6451 Ops/s | 206.1664 Ops/s | -0.25% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7908ms | 0.5207ms | 1.9206 KOps/s | 1.8130 KOps/s | **+5.94%** |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7914ms | 0.4926ms | 2.0298 KOps/s | 2.0495 KOps/s | -0.96% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.4861ms | 4.6336ms | 215.8129 Ops/s | 222.0379 Ops/s | -2.80% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.8896ms | 0.5036ms | 1.9858 KOps/s | 2.0128 KOps/s | -1.34% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.9161ms | 0.4826ms | 2.0720 KOps/s | 2.1022 KOps/s | -1.43% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.4374ms | 1.6865ms | 592.9510 Ops/s | 609.2083 Ops/s | -2.67% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.3985ms | 1.5975ms | 625.9807 Ops/s | 644.0202 Ops/s | -2.80% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.4354ms | 4.7713ms | 209.5860 Ops/s | 211.1968 Ops/s | -0.76% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.4027ms | 0.6510ms | 1.5361 KOps/s | 1.5587 KOps/s | -1.45% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9591ms | 0.6269ms | 1.5951 KOps/s | 1.6179 KOps/s | -1.41% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.6359ms | 4.8747ms | 205.1414 Ops/s | 215.9148 Ops/s | -4.99% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8596ms | 0.5178ms | 1.9314 KOps/s | 1.9661 KOps/s | -1.76% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6783ms | 0.4902ms | 2.0400 KOps/s | 2.0919 KOps/s | -2.48% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.8838ms | 4.5696ms | 218.8379 Ops/s | 220.2440 Ops/s | -0.64% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0715ms | 0.5128ms | 1.9500 KOps/s | 2.0246 KOps/s | -3.68% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7099ms | 0.4817ms | 2.0759 KOps/s | 2.0833 KOps/s | -0.36% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.0242ms | 4.7467ms | 210.6730 Ops/s | 209.7460 Ops/s | +0.44% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.6460ms | 0.6588ms | 1.5180 KOps/s | 1.5487 KOps/s | -1.99% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8345ms | 0.6210ms | 1.6104 KOps/s | 1.5940 KOps/s | +1.03% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 5.3967ms | 4.1603ms | 240.3671 Ops/s | 238.9076 Ops/s | +0.61% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 7.2190ms | 2.3624ms | 423.2969 Ops/s | 436.8483 Ops/s | -3.10% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 4.8603ms | 1.3000ms | 769.2197 Ops/s | 716.7355 Ops/s | **+7.32%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.4123s | 12.4512ms | 80.3133 Ops/s | 238.9534 Ops/s | **-66.39%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 7.1704ms | 2.2842ms | 437.7927 Ops/s | 436.2193 Ops/s | +0.36% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 3.8848ms | 1.2914ms | 774.3799 Ops/s | 689.9360 Ops/s | **+12.24%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 8.9451ms | 4.5108ms | 221.6893 Ops/s | 227.6774 Ops/s | -2.63% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.1026ms | 2.4848ms | 402.4511 Ops/s | 402.7761 Ops/s | -0.08% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.1466ms | 1.4970ms | 667.9895 Ops/s | 671.9289 Ops/s | -0.59% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 12.3296ms | 12.0509ms | 82.9812 Ops/s | 83.1331 Ops/s | -0.18% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 15.3907ms | 14.1751ms | 70.5465 Ops/s | 70.5549 Ops/s | -0.01% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 22.6298ms | 21.3945ms | 46.7410 Ops/s | 47.6997 Ops/s | -2.01% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 15.3471ms | 14.4318ms | 69.2913 Ops/s | 68.8570 Ops/s | +0.63% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 22.8787ms | 21.2347ms | 47.0928 Ops/s | 47.9233 Ops/s | -1.73% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 17.2182ms | 15.8509ms | 63.0879 Ops/s | 64.1866 Ops/s | -1.71% |
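
For readers checking the CPU numbers above: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces a few rows under that assumption. The function names and the 5% threshold are hypothetical, not taken from the benchmark workflow; the threshold is inferred from the rows the bot emphasized (+5.94% is highlighted, -4.99% is not), which also matches the "Improved: 4, Worsened: 4" summary.

```python
# Minimal sketch, assuming Change = (Ops - Ops on HEAD) / Ops on HEAD.
# `relative_change`, `classify`, and the 5% threshold are illustrative
# assumptions, not the benchmark bot's actual code.

def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row the way the summary line seems to count it."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) copied from the CPU results in this comment.
rows = {
    "test_simple": (1.9924, 2.2196),
    "test_parallel": (0.7935, 0.8194),
    "test_ddpg_speed[True-backward]": (472.6556, 434.1089),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table values are already rounded, recomputed percentages can differ from the reported ones in the last digit.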


github-actions bot commented Feb 7, 2025

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 21. Worsened: 9.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.8679s | 0.7799s | 1.2823 Ops/s | 1.2485 Ops/s | +2.71% |
| test_transformed | 1.4200s | 1.3352s | 0.7490 Ops/s | 0.7148 Ops/s | +4.79% |
| test_serial | 2.3415s | 2.2512s | 0.4442 Ops/s | 0.4308 Ops/s | +3.11% |
| test_parallel | 1.9233s | 1.8541s | 0.5393 Ops/s | 0.5432 Ops/s | -0.71% |
| test_step_mdp_speed[True-True-True-True-True] | 0.2365ms | 40.8501μs | 24.4798 KOps/s | 25.1927 KOps/s | -2.83% |
| test_step_mdp_speed[True-True-True-True-False] | 57.0820μs | 23.8079μs | 42.0029 KOps/s | 42.3180 KOps/s | -0.74% |
| test_step_mdp_speed[True-True-True-False-True] | 1.1630ms | 23.0093μs | 43.4606 KOps/s | 45.2910 KOps/s | -4.04% |
| test_step_mdp_speed[True-True-True-False-False] | 46.1420μs | 13.2585μs | 75.4232 KOps/s | 76.4745 KOps/s | -1.37% |
| test_step_mdp_speed[True-True-False-True-True] | 76.4540μs | 42.9813μs | 23.2659 KOps/s | 23.6537 KOps/s | -1.64% |
| test_step_mdp_speed[True-True-False-True-False] | 60.2820μs | 25.8357μs | 38.7061 KOps/s | 38.7020 KOps/s | +0.01% |
| test_step_mdp_speed[True-True-False-False-True] | 59.9430μs | 24.8393μs | 40.2588 KOps/s | 40.4325 KOps/s | -0.43% |
| test_step_mdp_speed[True-True-False-False-False] | 46.9120μs | 15.3948μs | 64.9570 KOps/s | 64.6155 KOps/s | +0.53% |
| test_step_mdp_speed[True-False-True-True-True] | 92.2440μs | 45.7108μs | 21.8767 KOps/s | 22.0885 KOps/s | -0.96% |
| test_step_mdp_speed[True-False-True-True-False] | 68.3030μs | 27.9851μs | 35.7333 KOps/s | 35.2477 KOps/s | +1.38% |
| test_step_mdp_speed[True-False-True-False-True] | 62.7930μs | 25.3206μs | 39.4935 KOps/s | 39.6049 KOps/s | -0.28% |
| test_step_mdp_speed[True-False-True-False-False] | 58.4430μs | 15.5311μs | 64.3871 KOps/s | 62.2714 KOps/s | +3.40% |
| test_step_mdp_speed[True-False-False-True-True] | 81.1630μs | 47.3522μs | 21.1183 KOps/s | 21.2137 KOps/s | -0.45% |
| test_step_mdp_speed[True-False-False-True-False] | 57.7820μs | 30.3479μs | 32.9512 KOps/s | 33.2919 KOps/s | -1.02% |
| test_step_mdp_speed[True-False-False-False-True] | 60.0820μs | 27.0633μs | 36.9504 KOps/s | 37.4249 KOps/s | -1.27% |
| test_step_mdp_speed[True-False-False-False-False] | 56.9830μs | 18.2909μs | 54.6719 KOps/s | 55.6652 KOps/s | -1.78% |
| test_step_mdp_speed[False-True-True-True-True] | 96.1940μs | 46.3820μs | 21.5601 KOps/s | 21.9448 KOps/s | -1.75% |
| test_step_mdp_speed[False-True-True-True-False] | 67.0330μs | 29.5236μs | 33.8712 KOps/s | 34.9287 KOps/s | -3.03% |
| test_step_mdp_speed[False-True-True-False-True] | 92.9040μs | 28.2439μs | 35.4059 KOps/s | 34.5230 KOps/s | +2.56% |
| test_step_mdp_speed[False-True-True-False-False] | 48.2220μs | 17.3518μs | 57.6311 KOps/s | 57.4199 KOps/s | +0.37% |
| test_step_mdp_speed[False-True-False-True-True] | 84.3440μs | 48.0282μs | 20.8211 KOps/s | 21.0419 KOps/s | -1.05% |
| test_step_mdp_speed[False-True-False-True-False] | 55.5820μs | 30.6852μs | 32.5890 KOps/s | 33.1731 KOps/s | -1.76% |
| test_step_mdp_speed[False-True-False-False-True] | 3.0964ms | 31.4938μs | 31.7523 KOps/s | 31.9108 KOps/s | -0.50% |
| test_step_mdp_speed[False-True-False-False-False] | 61.5230μs | 19.8216μs | 50.4500 KOps/s | 52.0160 KOps/s | -3.01% |
| test_step_mdp_speed[False-False-True-True-True] | 80.5940μs | 50.3261μs | 19.8704 KOps/s | 20.1241 KOps/s | -1.26% |
| test_step_mdp_speed[False-False-True-True-False] | 57.7530μs | 33.3390μs | 29.9949 KOps/s | 30.9974 KOps/s | -3.23% |
| test_step_mdp_speed[False-False-True-False-True] | 61.0530μs | 31.3506μs | 31.8973 KOps/s | 32.7720 KOps/s | -2.67% |
| test_step_mdp_speed[False-False-True-False-False] | 44.8620μs | 19.7327μs | 50.6772 KOps/s | 52.1507 KOps/s | -2.83% |
| test_step_mdp_speed[False-False-False-True-True] | 98.9740μs | 52.3379μs | 19.1066 KOps/s | 19.5648 KOps/s | -2.34% |
| test_step_mdp_speed[False-False-False-True-False] | 69.2630μs | 35.7834μs | 27.9459 KOps/s | 28.9785 KOps/s | -3.56% |
| test_step_mdp_speed[False-False-False-False-True] | 59.7530μs | 33.1926μs | 30.1272 KOps/s | 30.3958 KOps/s | -0.88% |
| test_step_mdp_speed[False-False-False-False-False] | 50.4720μs | 22.0580μs | 45.3350 KOps/s | 45.9800 KOps/s | -1.40% |
| test_values[generalized_advantage_estimate-True-True] | 26.6551ms | 25.3826ms | 39.3971 Ops/s | 40.4757 Ops/s | -2.66% |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1053s | 3.0027ms | 333.0334 Ops/s | 334.3696 Ops/s | -0.40% |
| test_values[td0_return_estimate-False-False] | 0.1055ms | 79.7277μs | 12.5427 KOps/s | 12.5646 KOps/s | -0.17% |
| test_values[td1_return_estimate-False-False] | 58.6016ms | 56.3683ms | 17.7405 Ops/s | 17.5401 Ops/s | +1.14% |
| test_values[vec_td1_return_estimate-False-False] | 1.3869ms | 1.0885ms | 918.7026 Ops/s | 922.1784 Ops/s | -0.38% |
| test_values[td_lambda_return_estimate-True-False] | 93.7491ms | 89.7737ms | 11.1391 Ops/s | 11.4473 Ops/s | -2.69% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3609ms | 1.0783ms | 927.3520 Ops/s | 931.5338 Ops/s | -0.45% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 26.7931ms | 26.3794ms | 37.9083 Ops/s | 38.8574 Ops/s | -2.44% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0138ms | 0.7499ms | 1.3335 KOps/s | 1.3376 KOps/s | -0.30% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7799ms | 0.6689ms | 1.4951 KOps/s | 1.4679 KOps/s | +1.85% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5774ms | 1.4846ms | 673.5790 Ops/s | 671.4161 Ops/s | +0.32% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7847ms | 0.7169ms | 1.3949 KOps/s | 1.4557 KOps/s | -4.17% |
| test_dqn_speed[False-None] | 6.7943ms | 1.5216ms | 657.2082 Ops/s | 650.3059 Ops/s | +1.06% |
| test_dqn_speed[False-backward] | 2.2386ms | 2.1291ms | 469.6818 Ops/s | 467.0315 Ops/s | +0.57% |
| test_dqn_speed[True-None] | 0.6598ms | 0.5546ms | 1.8029 KOps/s | 1.7101 KOps/s | **+5.43%** |
| test_dqn_speed[True-backward] | 1.2830ms | 1.2351ms | 809.6449 Ops/s | 880.9684 Ops/s | **-8.10%** |
| test_dqn_speed[reduce-overhead-None] | 0.6412ms | 0.5722ms | 1.7478 KOps/s | 1.7387 KOps/s | +0.52% |
| test_dqn_speed[reduce-overhead-backward] | 1.0998ms | 1.0610ms | 942.5121 Ops/s | 999.7311 Ops/s | **-5.72%** |
| test_ddpg_speed[False-None] | 3.1164ms | 2.8634ms | 349.2400 Ops/s | 343.9785 Ops/s | +1.53% |
| test_ddpg_speed[False-backward] | 4.7423ms | 4.2647ms | 234.4816 Ops/s | 237.3905 Ops/s | -1.23% |
| test_ddpg_speed[True-None] | 1.7634ms | 1.3458ms | 743.0309 Ops/s | 743.0453 Ops/s | -0.00% |
| test_ddpg_speed[True-backward] | 2.4971ms | 2.4299ms | 411.5367 Ops/s | 408.5428 Ops/s | +0.73% |
| test_ddpg_speed[reduce-overhead-None] | 1.7667ms | 1.3583ms | 736.2191 Ops/s | 733.5610 Ops/s | +0.36% |
| test_ddpg_speed[reduce-overhead-backward] | 1.9547ms | 1.8933ms | 528.1845 Ops/s | 521.8543 Ops/s | +1.21% |
| test_sac_speed[False-None] | 8.4760ms | 7.9891ms | 125.1712 Ops/s | 122.6237 Ops/s | +2.08% |
| test_sac_speed[False-backward] | 11.4558ms | 10.8556ms | 92.1182 Ops/s | 90.3828 Ops/s | +1.92% |
| test_sac_speed[True-None] | 2.1278ms | 1.8511ms | 540.2090 Ops/s | 539.0405 Ops/s | +0.22% |
| test_sac_speed[True-backward] | 4.1705ms | 3.7581ms | 266.0898 Ops/s | 265.0345 Ops/s | +0.40% |
| test_sac_speed[reduce-overhead-None] | 21.6611ms | 12.1092ms | 82.5822 Ops/s | 83.3154 Ops/s | -0.88% |
| test_sac_speed[reduce-overhead-backward] | 1.8255ms | 1.7778ms | 562.4892 Ops/s | 545.9912 Ops/s | +3.02% |
| test_redq_speed[False-None] | 7.8505ms | 7.4028ms | 135.0844 Ops/s | 132.5098 Ops/s | +1.94% |
| test_redq_speed[False-backward] | 12.2431ms | 11.5784ms | 86.3677 Ops/s | 84.9083 Ops/s | +1.72% |
| test_redq_speed[True-None] | 2.4331ms | 2.3294ms | 429.2962 Ops/s | 424.1590 Ops/s | +1.21% |
| test_redq_speed[True-backward] | 4.6660ms | 4.2783ms | 233.7367 Ops/s | 233.6994 Ops/s | +0.02% |
| test_redq_speed[reduce-overhead-None] | 2.5235ms | 2.3590ms | 423.9152 Ops/s | 422.5661 Ops/s | +0.32% |
| test_redq_speed[reduce-overhead-backward] | 4.6323ms | 4.2408ms | 235.8031 Ops/s | 233.6441 Ops/s | +0.92% |
| test_redq_deprec_speed[False-None] | 9.3474ms | 9.0019ms | 111.0873 Ops/s | 109.0090 Ops/s | +1.91% |
| test_redq_deprec_speed[False-backward] | 12.8032ms | 12.2886ms | 81.3764 Ops/s | 80.7546 Ops/s | +0.77% |
| test_redq_deprec_speed[True-None] | 2.7441ms | 2.6443ms | 378.1677 Ops/s | 375.1671 Ops/s | +0.80% |
| test_redq_deprec_speed[True-backward] | 4.9369ms | 4.5472ms | 219.9144 Ops/s | 216.6817 Ops/s | +1.49% |
| test_redq_deprec_speed[reduce-overhead-None] | 2.7518ms | 2.6500ms | 377.3556 Ops/s | 371.8994 Ops/s | +1.47% |
| test_redq_deprec_speed[reduce-overhead-backward] | 4.9493ms | 4.5309ms | 220.7083 Ops/s | 220.7365 Ops/s | -0.01% |
| test_td3_speed[False-None] | 7.9156ms | 7.8712ms | 127.0454 Ops/s | 124.8996 Ops/s | +1.72% |
| test_td3_speed[False-backward] | 10.8510ms | 10.3991ms | 96.1621 Ops/s | 94.7984 Ops/s | +1.44% |
| test_td3_speed[True-None] | 1.6898ms | 1.6523ms | 605.2144 Ops/s | 600.4176 Ops/s | +0.80% |
| test_td3_speed[True-backward] | 3.4601ms | 3.3614ms | 297.4947 Ops/s | 294.5412 Ops/s | +1.00% |
| test_td3_speed[reduce-overhead-None] | 51.4643ms | 26.4006ms | 37.8779 Ops/s | 38.3315 Ops/s | -1.18% |
| test_td3_speed[reduce-overhead-backward] | 1.4359ms | 1.3405ms | 745.9892 Ops/s | 660.2773 Ops/s | **+12.98%** |
| test_cql_speed[False-None] | 17.1336ms | 16.6430ms | 60.0854 Ops/s | 59.1737 Ops/s | +1.54% |
| test_cql_speed[False-backward] | 22.3567ms | 21.8240ms | 45.8212 Ops/s | 44.8601 Ops/s | +2.14% |
| test_cql_speed[True-None] | 3.3617ms | 3.2764ms | 305.2092 Ops/s | 302.7064 Ops/s | +0.83% |
| test_cql_speed[True-backward] | 5.8483ms | 5.7004ms | 175.4277 Ops/s | 169.7368 Ops/s | +3.35% |
| test_cql_speed[reduce-overhead-None] | 20.7876ms | 13.1273ms | 76.1769 Ops/s | 75.1650 Ops/s | +1.35% |
| test_cql_speed[reduce-overhead-backward] | 1.9398ms | 1.8351ms | 544.9284 Ops/s | 485.8092 Ops/s | **+12.17%** |
| test_a2c_speed[False-None] | 3.3302ms | 3.1606ms | 316.3997 Ops/s | 308.5011 Ops/s | +2.56% |
| test_a2c_speed[False-backward] | 6.5990ms | 6.0161ms | 166.2203 Ops/s | 156.3477 Ops/s | **+6.31%** |
| test_a2c_speed[True-None] | 1.4427ms | 1.3536ms | 738.7698 Ops/s | 730.3721 Ops/s | +1.15% |
| test_a2c_speed[True-backward] | 3.0085ms | 2.9020ms | 344.5914 Ops/s | 320.8057 Ops/s | **+7.41%** |
| test_a2c_speed[reduce-overhead-None] | 15.9943ms | 9.0679ms | 110.2789 Ops/s | 111.7971 Ops/s | -1.36% |
| test_a2c_speed[reduce-overhead-backward] | 1.6269ms | 1.4792ms | 676.0253 Ops/s | 614.6056 Ops/s | **+9.99%** |
| test_ppo_speed[False-None] | 3.8706ms | 3.6692ms | 272.5426 Ops/s | 267.7701 Ops/s | +1.78% |
| test_ppo_speed[False-backward] | 7.8112ms | 6.7802ms | 147.4875 Ops/s | 139.3933 Ops/s | **+5.81%** |
| test_ppo_speed[True-None] | 1.5150ms | 1.4243ms | 702.1173 Ops/s | 693.8626 Ops/s | +1.19% |
| test_ppo_speed[True-backward] | 3.1485ms | 3.0528ms | 327.5705 Ops/s | 305.5370 Ops/s | **+7.21%** |
| test_ppo_speed[reduce-overhead-None] | 1.0601ms | 0.9732ms | 1.0275 KOps/s | 1.0276 KOps/s | -0.01% |
| test_ppo_speed[reduce-overhead-backward] | 1.5424ms | 1.4172ms | 705.6405 Ops/s | 615.4290 Ops/s | **+14.66%** |
| test_reinforce_speed[False-None] | 2.3695ms | 2.2599ms | 442.5023 Ops/s | 431.9361 Ops/s | +2.45% |
| test_reinforce_speed[False-backward] | 3.6271ms | 3.2442ms | 308.2450 Ops/s | 287.2784 Ops/s | **+7.30%** |
| test_reinforce_speed[True-None] | 1.4001ms | 1.2978ms | 770.5187 Ops/s | 757.7843 Ops/s | +1.68% |
| test_reinforce_speed[True-backward] | 3.0482ms | 2.9414ms | 339.9702 Ops/s | 326.6360 Ops/s | +4.08% |
| test_reinforce_speed[reduce-overhead-None] | 17.9597ms | 10.0281ms | 99.7200 Ops/s | 99.0344 Ops/s | +0.69% |
| test_reinforce_speed[reduce-overhead-backward] | 1.5650ms | 1.5024ms | 665.5915 Ops/s | 598.5987 Ops/s | **+11.19%** |
| test_iql_speed[False-None] | 9.6053ms | 9.1159ms | 109.6979 Ops/s | 106.7277 Ops/s | +2.78% |
| test_iql_speed[False-backward] | 13.2064ms | 12.7355ms | 78.5207 Ops/s | 74.9384 Ops/s | +4.78% |
| test_iql_speed[True-None] | 2.3987ms | 2.2255ms | 449.3367 Ops/s | 438.3143 Ops/s | +2.51% |
| test_iql_speed[True-backward] | 5.1829ms | 4.7641ms | 209.9036 Ops/s | 198.1833 Ops/s | **+5.91%** |
| test_iql_speed[reduce-overhead-None] | 18.6271ms | 11.1192ms | 89.9343 Ops/s | 88.9664 Ops/s | +1.09% |
| test_iql_speed[reduce-overhead-backward] | 2.1426ms | 2.0703ms | 483.0305 Ops/s | 465.9847 Ops/s | +3.66% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.7914ms | 6.3539ms | 157.3833 Ops/s | 155.3039 Ops/s | +1.34% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6498ms | 0.3448ms | 2.8999 KOps/s | 3.8473 KOps/s | **-24.62%** |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6125ms | 0.3251ms | 3.0763 KOps/s | 4.1583 KOps/s | **-26.02%** |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.3804ms | 6.0669ms | 164.8293 Ops/s | 162.5656 Ops/s | +1.39% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.7569ms | 0.3406ms | 2.9359 KOps/s | 3.3313 KOps/s | **-11.87%** |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4775ms 0.2427ms 4.1207 KOps/s 3.6674 KOps/s $\textbf{\color{#35bf28}+12.36\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9029ms 1.4299ms 699.3682 Ops/s 686.4975 Ops/s $\color{#35bf28}+1.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3674ms 1.1539ms 866.6353 Ops/s 735.0331 Ops/s $\textbf{\color{#35bf28}+17.90\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5306ms 6.2928ms 158.9115 Ops/s 158.2944 Ops/s $\color{#35bf28}+0.39\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8083ms 0.4188ms 2.3877 KOps/s 2.4064 KOps/s $\color{#d91a1a}-0.78\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7836ms 0.4334ms 2.3074 KOps/s 2.5581 KOps/s $\textbf{\color{#d91a1a}-9.80\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 9.9792ms 6.2551ms 159.8684 Ops/s 162.1310 Ops/s $\color{#d91a1a}-1.40\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.6416ms 0.2970ms 3.3674 KOps/s 3.2699 KOps/s $\color{#35bf28}+2.98\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5473ms 0.2687ms 3.7213 KOps/s 3.3518 KOps/s $\textbf{\color{#35bf28}+11.02\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3997ms 6.0317ms 165.7909 Ops/s 162.6797 Ops/s $\color{#35bf28}+1.91\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0358ms 0.3095ms 3.2311 KOps/s 3.4779 KOps/s $\textbf{\color{#d91a1a}-7.10\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5781ms 0.3000ms 3.3329 KOps/s 3.9361 KOps/s $\textbf{\color{#d91a1a}-15.32\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4938ms 6.2687ms 159.5238 Ops/s 159.1608 Ops/s $\color{#35bf28}+0.23\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1773ms 0.4532ms 2.2063 KOps/s 2.2716 KOps/s $\color{#d91a1a}-2.87\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6942ms 0.4148ms 2.4110 KOps/s 2.5745 KOps/s $\textbf{\color{#d91a1a}-6.35\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1316ms 5.5499ms 180.1844 Ops/s 180.4807 Ops/s $\color{#d91a1a}-0.16\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9462ms 1.7822ms 561.0938 Ops/s 428.9513 Ops/s $\textbf{\color{#35bf28}+30.81\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.8472ms 1.2707ms 786.9918 Ops/s 807.1497 Ops/s $\color{#d91a1a}-2.50\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.6438ms 5.7096ms 175.1434 Ops/s 180.6210 Ops/s $\color{#d91a1a}-3.03\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.7471ms 2.0286ms 492.9399 Ops/s 411.1528 Ops/s $\textbf{\color{#35bf28}+19.89\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.3070ms 1.2422ms 804.9910 Ops/s 830.3922 Ops/s $\color{#d91a1a}-3.06\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5114s 15.9578ms 62.6654 Ops/s 31.5836 Ops/s $\textbf{\color{#35bf28}+98.41\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.4001ms 2.2375ms 446.9268 Ops/s 443.4437 Ops/s $\color{#35bf28}+0.79\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.8397ms 1.3630ms 733.6647 Ops/s 743.4538 Ops/s $\color{#d91a1a}-1.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.7884ms 13.2101ms 75.6998 Ops/s 71.8387 Ops/s $\textbf{\color{#35bf28}+5.37\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.1001ms 16.6478ms 60.0681 Ops/s 52.8216 Ops/s $\textbf{\color{#35bf28}+13.72\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2887ms 17.7723ms 56.2673 Ops/s 54.5322 Ops/s $\color{#35bf28}+3.18\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.6435ms 17.1736ms 58.2289 Ops/s 56.3121 Ops/s $\color{#35bf28}+3.40\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.9845ms 17.6422ms 56.6823 Ops/s 52.8362 Ops/s $\textbf{\color{#35bf28}+7.28\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.6061ms 18.3181ms 54.5909 Ops/s 52.9661 Ops/s $\color{#35bf28}+3.07\%$
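
The Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. The following is a minimal sketch that reproduces it for three rows of the table, assuming that formula; the helper name `percent_change` is hypothetical and is not part of the benchmark tooling.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent.

    Assumed formula: (ops - ops_head) / ops_head * 100.
    """
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) copied from the table; KOps/s values converted to Ops/s.
rows = {
    "test_td3_speed[reduce-overhead-backward]": (745.9892, 660.2773),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (62.6654, 31.5836),
    "test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]": (3076.3, 4158.3),
}

for name, (ops, ops_head) in rows.items():
    # Positive values mean the PR is faster than HEAD, negative values mean slower.
    print(f"{name}: {percent_change(ops, ops_head):+.2f}%")
```

Under this assumption the computed values agree with the reported changes for these rows (+12.98%, +98.41%, -26.02%) to within rounding of the displayed figures.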

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 86c130c602db56f83819fe80af19715c85f2aca4
Pull Request resolved: #2771
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 5dc074b0709187a1f86959b97d57af1f10844831
Pull Request resolved: #2771
[ghstack-poisoned]
@vmoens added the "bug" (Something isn't working) and "Suitable for minor" (Suitable to be integrated in minor release, no new feature) labels Feb 10, 2025
[ghstack-poisoned]
@vmoens merged commit 0f8a761 into gh/vmoens/88/base Feb 10, 2025
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 7644f6c695490f34d6455703418c59cfa718a9f0
Pull Request resolved: #2771
@vmoens deleted the gh/vmoens/88/head branch February 10, 2025 12:27
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 7644f6c695490f34d6455703418c59cfa718a9f0
Pull Request resolved: #2771

(cherry picked from commit 3da2750)
Labels
bug (Something isn't working), CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.), Suitable for minor (Suitable to be integrated in minor release, no new feature)
2 participants