[Feature] OrderedDict for TensorDictSequential #1142
Merged
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: b2a0a12301b706b5805ffd0851e3059d670d615f
Pull Request resolved: #1142
facebook-github-bot added the CLA Signed label Dec 17, 2024
(This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.2410μs | 21.0067μs | 47.6038 KOps/s | 49.1444 KOps/s | |
test_plain_set_stack_nested | 42.5480μs | 21.1305μs | 47.3249 KOps/s | 48.7377 KOps/s | |
test_plain_set_nested_inplace | 0.1034ms | 23.1301μs | 43.2337 KOps/s | 44.7778 KOps/s | |
test_plain_set_stack_nested_inplace | 60.5830μs | 22.7945μs | 43.8702 KOps/s | 44.8109 KOps/s | |
test_items | 29.9760μs | 4.1894μs | 238.6980 KOps/s | 238.6961 KOps/s | |
test_items_nested | 0.7149ms | 0.4001ms | 2.4991 KOps/s | 2.4522 KOps/s | |
test_items_nested_locked | 0.5072ms | 0.4002ms | 2.4989 KOps/s | 2.4423 KOps/s | |
test_items_nested_leaf | 0.1228ms | 76.7681μs | 13.0262 KOps/s | 13.1086 KOps/s | |
test_items_stack_nested | 0.6000ms | 0.4017ms | 2.4892 KOps/s | 2.4186 KOps/s | |
test_items_stack_nested_leaf | 0.1417ms | 77.2472μs | 12.9454 KOps/s | 12.4325 KOps/s | |
test_items_stack_nested_locked | 0.5844ms | 0.4006ms | 2.4960 KOps/s | 2.4485 KOps/s | |
test_keys | 25.1170μs | 3.4934μs | 286.2578 KOps/s | 282.3420 KOps/s | |
test_keys_nested | 0.3188ms | 0.1650ms | 6.0608 KOps/s | 5.9750 KOps/s | |
test_keys_nested_locked | 1.9396ms | 0.1701ms | 5.8773 KOps/s | 5.7547 KOps/s | |
test_keys_nested_leaf | 0.2514ms | 0.1425ms | 7.0200 KOps/s | 6.9101 KOps/s | |
test_keys_stack_nested | 0.2443ms | 0.1649ms | 6.0654 KOps/s | 6.0810 KOps/s | |
test_keys_stack_nested_leaf | 0.2664ms | 0.1444ms | 6.9239 KOps/s | 7.0458 KOps/s | |
test_keys_stack_nested_locked | 0.2665ms | 0.1718ms | 5.8211 KOps/s | 5.9153 KOps/s | |
test_values | 8.0168μs | 1.0518μs | 950.7135 KOps/s | 963.4188 KOps/s | |
test_values_nested | 0.1197ms | 62.5079μs | 15.9980 KOps/s | 15.8828 KOps/s | |
test_values_nested_locked | 0.1209ms | 62.7899μs | 15.9261 KOps/s | 15.7224 KOps/s | |
test_values_nested_leaf | 0.1282ms | 71.8425μs | 13.9193 KOps/s | 13.2697 KOps/s | |
test_values_stack_nested | 0.1236ms | 63.7959μs | 15.6750 KOps/s | 15.6796 KOps/s | |
test_values_stack_nested_leaf | 0.1404ms | 72.8941μs | 13.7185 KOps/s | 13.9063 KOps/s | |
test_values_stack_nested_locked | 0.1405ms | 63.3704μs | 15.7802 KOps/s | 15.7522 KOps/s | |
test_membership | 21.7610μs | 0.9144μs | 1.0936 MOps/s | 1.1362 MOps/s | |
test_membership_nested | 23.7140μs | 3.0026μs | 333.0492 KOps/s | 340.3915 KOps/s | |
test_membership_nested_leaf | 39.0030μs | 3.0073μs | 332.5253 KOps/s | 337.8967 KOps/s | |
test_membership_stacked_nested | 23.6840μs | 2.9492μs | 339.0782 KOps/s | 346.3278 KOps/s | |
test_membership_stacked_nested_leaf | 36.7680μs | 3.0228μs | 330.8195 KOps/s | 332.3226 KOps/s | |
test_membership_nested_last | 35.5660μs | 4.3665μs | 229.0162 KOps/s | 226.7968 KOps/s | |
test_membership_nested_leaf_last | 31.3780μs | 4.4041μs | 227.0608 KOps/s | 222.6049 KOps/s | |
test_membership_stacked_nested_last | 40.3050μs | 4.3550μs | 229.6204 KOps/s | 224.3194 KOps/s | |
test_membership_stacked_nested_leaf_last | 23.0330μs | 4.3297μs | 230.9624 KOps/s | 221.9158 KOps/s | |
test_nested_getleaf | 40.9760μs | 10.8956μs | 91.7802 KOps/s | 92.7169 KOps/s | |
test_nested_get | 39.4030μs | 10.4294μs | 95.8827 KOps/s | 97.7346 KOps/s | |
test_stacked_getleaf | 42.8800μs | 10.6914μs | 93.5331 KOps/s | 92.5988 KOps/s | |
test_stacked_get | 53.6090μs | 10.1405μs | 98.6148 KOps/s | 98.5663 KOps/s | |
test_nested_getitemleaf | 39.7040μs | 11.1003μs | 90.0874 KOps/s | 87.5151 KOps/s | |
test_nested_getitem | 57.7570μs | 10.4979μs | 95.2570 KOps/s | 92.3168 KOps/s | |
test_stacked_getitemleaf | 36.2270μs | 11.2841μs | 88.6204 KOps/s | 89.3659 KOps/s | |
test_stacked_getitem | 38.5420μs | 10.3971μs | 96.1805 KOps/s | 95.3516 KOps/s | |
test_lock_nested | 5.0118ms | 0.4642ms | 2.1543 KOps/s | 2.1693 KOps/s | |
test_lock_stack_nested | 0.6737ms | 0.4294ms | 2.3287 KOps/s | 2.3750 KOps/s | |
test_unlock_nested | 0.8089ms | 0.3742ms | 2.6727 KOps/s | 2.6553 KOps/s | |
test_unlock_stack_nested | 0.5193ms | 0.3464ms | 2.8869 KOps/s | 2.9173 KOps/s | |
test_flatten_speed | 0.1993ms | 99.5770μs | 10.0425 KOps/s | 10.0507 KOps/s | |
test_unflatten_speed | 0.9010ms | 0.5226ms | 1.9134 KOps/s | 1.8402 KOps/s | |
test_common_ops | 1.7294ms | 0.7969ms | 1.2549 KOps/s | 1.2759 KOps/s | |
test_creation | 46.3870μs | 2.5237μs | 396.2480 KOps/s | 401.4354 KOps/s | |
test_creation_empty | 40.5850μs | 12.4048μs | 80.6139 KOps/s | 96.8223 KOps/s | |
test_creation_nested_1 | 51.6850μs | 15.4625μs | 64.6728 KOps/s | 74.1362 KOps/s | |
test_creation_nested_2 | 53.6390μs | 20.1573μs | 49.6098 KOps/s | 54.7949 KOps/s | |
test_clone | 73.4570μs | 13.3395μs | 74.9655 KOps/s | 74.4358 KOps/s | |
test_getitem[int] | 1.1212ms | 12.8932μs | 77.5601 KOps/s | 77.9296 KOps/s | |
test_getitem[slice_int] | 0.1427ms | 23.9952μs | 41.6750 KOps/s | 40.0126 KOps/s | |
test_getitem[range] | 0.2179ms | 48.0945μs | 20.7924 KOps/s | 20.4813 KOps/s | |
test_getitem[tuple] | 0.1671ms | 19.8648μs | 50.3402 KOps/s | 49.9962 KOps/s | |
test_getitem[list] | 0.2996ms | 43.4108μs | 23.0357 KOps/s | 22.4479 KOps/s | |
test_setitem_dim[int] | 51.4150μs | 24.8105μs | 40.3054 KOps/s | 37.4248 KOps/s | |
test_setitem_dim[slice_int] | 0.1158ms | 50.9172μs | 19.6397 KOps/s | 18.3843 KOps/s | |
test_setitem_dim[range] | 0.1486ms | 72.6851μs | 13.7580 KOps/s | 13.4679 KOps/s | |
test_setitem_dim[tuple] | 0.1089ms | 39.9670μs | 25.0207 KOps/s | 23.4090 KOps/s | |
test_setitem | 0.1408ms | 21.1043μs | 47.3836 KOps/s | 49.5398 KOps/s | |
test_set | 0.1321ms | 20.5288μs | 48.7120 KOps/s | 51.0017 KOps/s | |
test_set_shared | 2.2572ms | 0.1690ms | 5.9170 KOps/s | 5.8731 KOps/s | |
test_update | 0.8333ms | 23.1988μs | 43.1056 KOps/s | 44.5168 KOps/s | |
test_update_nested | 0.1157ms | 33.0909μs | 30.2198 KOps/s | 29.8616 KOps/s | |
test_update__nested | 0.1804ms | 33.7805μs | 29.6028 KOps/s | 29.0142 KOps/s | |
test_set_nested | 98.8830μs | 22.4721μs | 44.4997 KOps/s | 45.2066 KOps/s | |
test_set_nested_new | 0.1031ms | 27.1130μs | 36.8827 KOps/s | 36.6050 KOps/s | |
test_select | 0.2103ms | 43.6253μs | 22.9225 KOps/s | 22.7095 KOps/s | |
test_select_nested | 0.1298ms | 62.2911μs | 16.0537 KOps/s | 15.6796 KOps/s | |
test_exclude_nested | 0.1621ms | 80.6539μs | 12.3987 KOps/s | 12.1846 KOps/s | |
test_empty[True] | 0.7020ms | 0.4221ms | 2.3691 KOps/s | 2.4223 KOps/s | |
test_empty[False] | 12.3980μs | 1.3877μs | 720.5915 KOps/s | 703.7374 KOps/s | |
test_unbind_speed | 0.3260ms | 0.2683ms | 3.7272 KOps/s | 3.6959 KOps/s | |
test_unbind_speed_stack0 | 0.5689ms | 0.2679ms | 3.7325 KOps/s | 3.7538 KOps/s | |
test_unbind_speed_stack1 | 0.1135s | 0.8138ms | 1.2288 KOps/s | 1.3868 KOps/s | |
test_split | 0.1120s | 1.7752ms | 563.3109 Ops/s | 549.2828 Ops/s | |
test_chunk | 0.1109s | 1.7836ms | 560.6532 Ops/s | 545.8042 Ops/s | |
test_consolidate_njt[False-None] | 8.6703ms | 8.1728ms | 122.3568 Ops/s | 120.9340 Ops/s | |
test_creation[device0] | 0.2759ms | 90.2850μs | 11.0760 KOps/s | 10.8648 KOps/s | |
test_creation_from_tensor | 5.0047ms | 95.7855μs | 10.4400 KOps/s | 9.8117 KOps/s | |
test_add_one[memmap_tensor0] | 0.1873ms | 4.7208μs | 211.8286 KOps/s | 210.0492 KOps/s | |
test_contiguous[memmap_tensor0] | 16.1200μs | 0.5140μs | 1.9454 MOps/s | 1.9263 MOps/s | |
test_stack[memmap_tensor0] | 36.9690μs | 3.3787μs | 295.9681 KOps/s | 298.6310 KOps/s | |
test_memmaptd_index | 0.8384ms | 0.2349ms | 4.2574 KOps/s | 4.2351 KOps/s | |
test_memmaptd_index_astensor | 0.5609ms | 0.3232ms | 3.0944 KOps/s | 3.0731 KOps/s | |
test_memmaptd_index_op | 1.1131ms | 0.5956ms | 1.6791 KOps/s | 1.7206 KOps/s | |
test_serialize_model | 0.1296s | 0.1177s | 8.4926 Ops/s | 8.5343 Ops/s | |
test_serialize_model_pickle | 0.4595s | 0.3928s | 2.5455 Ops/s | 2.4264 Ops/s | |
test_serialize_weights | 0.2223s | 0.1329s | 7.5218 Ops/s | 7.1938 Ops/s | |
test_serialize_weights_returnearly | 0.1732s | 0.1570s | 6.3702 Ops/s | 6.0740 Ops/s | |
test_serialize_weights_pickle | 0.5380s | 0.4490s | 2.2271 Ops/s | 2.5392 Ops/s | |
test_serialize_weights_filesystem | 0.1560s | 0.1451s | 6.8916 Ops/s | 6.9774 Ops/s | |
test_serialize_model_filesystem | 0.1634s | 0.1463s | 6.8370 Ops/s | 5.9159 Ops/s | |
test_reshape_pytree | 72.2540μs | 27.1476μs | 36.8356 KOps/s | 37.5113 KOps/s | |
test_reshape_td | 96.8100μs | 33.5982μs | 29.7635 KOps/s | 29.4207 KOps/s | |
test_view_pytree | 61.4240μs | 26.6600μs | 37.5094 KOps/s | 37.5052 KOps/s | |
test_view_td | 81.9220μs | 38.7997μs | 25.7734 KOps/s | 25.8760 KOps/s | |
test_unbind_pytree | 87.5430μs | 30.1994μs | 33.1132 KOps/s | 33.7303 KOps/s | |
test_unbind_td | 0.3150ms | 40.1409μs | 24.9123 KOps/s | 24.7797 KOps/s | |
test_split_pytree | 73.2650μs | 30.0180μs | 33.3134 KOps/s | 34.0501 KOps/s | |
test_split_td | 0.5027ms | 44.9232μs | 22.2602 KOps/s | 22.3176 KOps/s | |
test_add_pytree | 70.3700μs | 35.0717μs | 28.5130 KOps/s | 27.7969 KOps/s | |
test_add_td | 0.1396ms | 58.0627μs | 17.2228 KOps/s | 17.0775 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1868ms | 62.9824μs | 15.8774 KOps/s | 15.8014 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4321ms | 0.1702ms | 5.8758 KOps/s | 5.7956 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1208ms | 46.1640μs | 21.6619 KOps/s | 21.7343 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2535ms | 0.1176ms | 8.5002 KOps/s | 8.4349 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 58.9190μs | 25.9031μs | 38.6054 KOps/s | 36.7929 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1385ms | 59.1374μs | 16.9098 KOps/s | 16.5420 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2024ms | 78.0110μs | 12.8187 KOps/s | 12.6088 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1414ms | 68.3555μs | 14.6294 KOps/s | 14.7403 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1991ms | 0.1047ms | 9.5497 KOps/s | 9.4528 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4345ms | 0.2151ms | 4.6486 KOps/s | 4.6545 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1170ms | 45.5150μs | 21.9708 KOps/s | 21.2121 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5307ms | 65.1008μs | 15.3608 KOps/s | 15.1377 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2492ms | 0.1040ms | 9.6126 KOps/s | 9.8653 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3945ms | 0.1998ms | 5.0039 KOps/s | 4.9375 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4635ms | 0.2328ms | 4.2959 KOps/s | 4.2654 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2174ms | 0.1063ms | 9.4034 KOps/s | 9.4601 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1429ms | 59.4755μs | 16.8136 KOps/s | 16.9507 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.5805ms | 46.9837μs | 21.2840 KOps/s | 21.2132 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 1.3405ms | 0.1610ms | 6.2129 KOps/s | 6.1977 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3164ms | 0.1066ms | 9.3782 KOps/s | 9.8276 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 57.6570μs | 21.4617μs | 46.5947 KOps/s | 48.1433 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1571ms | 64.8621μs | 15.4173 KOps/s | 14.7267 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.3432ms | 80.9343μs | 12.3557 KOps/s | 12.2549 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1484ms | 68.7766μs | 14.5398 KOps/s | 14.4197 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4064ms | 0.2090ms | 4.7839 KOps/s | 4.9334 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.4033ms | 1.3331ms | 750.1237 Ops/s | 746.2972 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4039ms | 0.2050ms | 4.8773 KOps/s | 4.9393 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3134ms | 0.7778ms | 1.2857 KOps/s | 1.2791 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8036ms | 0.4616ms | 2.1663 KOps/s | 2.2289 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.2872ms | 2.7315ms | 366.1026 Ops/s | 383.1942 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 86.2900μs | 36.4504μs | 27.4346 KOps/s | 27.6810 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5837ms | 32.9069μs | 30.3887 KOps/s | 30.2880 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 68.0170μs | 28.6508μs | 34.9031 KOps/s | 34.0658 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 75.1790μs | 23.2436μs | 43.0226 KOps/s | 42.8638 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1369ms | 29.8120μs | 33.5435 KOps/s | 33.2136 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 84.4970μs | 23.0245μs | 43.4320 KOps/s | 43.4920 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1313ms | 52.2017μs | 19.1564 KOps/s | 19.3734 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3772ms | 20.3295μs | 49.1895 KOps/s | 51.0980 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 88.0740μs | 44.7969μs | 22.3230 KOps/s | 22.5106 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 74.1380μs | 19.0807μs | 52.4091 KOps/s | 52.7160 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1216ms | 46.0944μs | 21.6946 KOps/s | 22.4190 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 69.4990μs | 18.7593μs | 53.3068 KOps/s | 54.0621 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1238ms | 53.2438μs | 18.7815 KOps/s | 19.1292 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9250ms | 20.1608μs | 49.6013 KOps/s | 51.2901 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1318ms | 45.6131μs | 21.9235 KOps/s | 22.0609 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 59.0300μs | 18.9536μs | 52.7604 KOps/s | 54.2101 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1175ms | 45.8711μs | 21.8002 KOps/s | 22.0242 KOps/s | |
test_compile_indexing[int-pytree-eager] | 86.0600μs | 19.0107μs | 52.6020 KOps/s | 53.8223 KOps/s | |
test_mod_add[eager] | 0.1088ms | 34.9222μs | 28.6351 KOps/s | 29.5543 KOps/s | |
test_mod_add[compile] | 0.1190ms | 48.2930μs | 20.7069 KOps/s | 20.9982 KOps/s | |
test_mod_add[compile-overhead] | 0.1242ms | 48.8044μs | 20.4899 KOps/s | 20.9286 KOps/s | |
test_mod_wrap[eager] | 0.4121ms | 0.2291ms | 4.3642 KOps/s | 4.3511 KOps/s | |
test_mod_wrap[compile] | 0.3131ms | 0.2071ms | 4.8289 KOps/s | 4.8833 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4043ms | 0.2051ms | 4.8758 KOps/s | 4.9104 KOps/s | |
test_mod_wrap_and_backward[eager] | 14.6355ms | 11.6491ms | 85.8435 Ops/s | 85.8547 Ops/s | |
test_mod_wrap_and_backward[compile] | 20.5389ms | 13.0657ms | 76.5361 Ops/s | 77.8908 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 14.1885ms | 12.2081ms | 81.9128 Ops/s | 76.0860 Ops/s | |
test_seq_add[eager] | 0.2061ms | 0.1201ms | 8.3265 KOps/s | 8.7010 KOps/s | |
test_seq_add[compile] | 0.1589ms | 63.4588μs | 15.7582 KOps/s | 15.9591 KOps/s | |
test_seq_add[compile-overhead] | 0.1611ms | 61.7424μs | 16.1963 KOps/s | 16.3633 KOps/s | |
test_seq_wrap[eager] | 1.2802ms | 0.4543ms | 2.2011 KOps/s | 2.2730 KOps/s | |
test_seq_wrap[compile] | 0.4060ms | 0.2330ms | 4.2915 KOps/s | 4.4298 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4312ms | 0.2293ms | 4.3612 KOps/s | 4.3843 KOps/s | |
test_func_call_runtime[False-eager] | 1.0002ms | 0.5560ms | 1.7986 KOps/s | 1.7903 KOps/s | |
test_func_call_runtime[False-compile] | 0.8768ms | 0.4337ms | 2.3056 KOps/s | 2.3275 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7567ms | 0.4258ms | 2.3486 KOps/s | 2.3530 KOps/s | |
test_func_call_runtime[True-eager] | 0.9094ms | 0.7746ms | 1.2909 KOps/s | 1.2944 KOps/s | |
test_func_call_runtime[True-compile] | 0.9532ms | 0.4681ms | 2.1365 KOps/s | 2.1350 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5740ms | 0.4623ms | 2.1629 KOps/s | 2.1499 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.6632ms | 0.5500ms | 1.8183 KOps/s | 1.8037 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5338ms | 0.4245ms | 2.3558 KOps/s | 2.3568 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5516ms | 0.4256ms | 2.3495 KOps/s | 2.3523 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1475ms | 0.9125ms | 1.0959 KOps/s | 1.0957 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6560ms | 0.4944ms | 2.0226 KOps/s | 2.0271 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6180ms | 0.4893ms | 2.0438 KOps/s | 2.0474 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6217ms | 1.9284ms | 518.5553 Ops/s | 522.1539 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8966ms | 0.5258ms | 1.9017 KOps/s | 1.9516 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9696ms | 0.5259ms | 1.9016 KOps/s | 1.8871 KOps/s | |
test_distributed | 0.3138ms | 0.1318ms | 7.5846 KOps/s | 7.6055 KOps/s | |
test_tdmodule | 0.1026ms | 27.9581μs | 35.7678 KOps/s | 37.5079 KOps/s | |
test_tdmodule_dispatch | 0.1216ms | 49.0247μs | 20.3979 KOps/s | 20.2328 KOps/s | |
test_tdseq | 56.5250μs | 30.5623μs | 32.7200 KOps/s | 38.7305 KOps/s | |
test_tdseq_dispatch | 83.6450μs | 56.4193μs | 17.7244 KOps/s | 19.6336 KOps/s | |
test_instantiation_functorch | 2.3798ms | 1.5296ms | 653.7473 Ops/s | 649.0638 Ops/s | |
test_exec_functorch | 0.4595ms | 0.1810ms | 5.5242 KOps/s | 5.5687 KOps/s | |
test_exec_functional_call | 0.3005ms | 0.1739ms | 5.7503 KOps/s | 5.7122 KOps/s | |
test_exec_td_decorator | 0.5320ms | 0.2359ms | 4.2398 KOps/s | 4.2192 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9470ms | 0.6530ms | 1.5314 KOps/s | 1.5130 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9859ms | 0.6596ms | 1.5160 KOps/s | 1.5155 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9856ms | 0.5360ms | 1.8657 KOps/s | 1.8762 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8146ms | 0.5304ms | 1.8853 KOps/s | 1.8943 KOps/s | |
test_to_module_speed[True] | 2.1320ms | 1.3481ms | 741.7802 Ops/s | 733.4602 Ops/s | |
test_to_module_speed[False] | 1.7030ms | 1.3142ms | 760.9466 Ops/s | 759.9506 Ops/s | |
test_tc_init | 0.1065ms | 49.0194μs | 20.4001 KOps/s | 20.7558 KOps/s | |
test_tc_init_nested | 0.1749ms | 94.7998μs | 10.5485 KOps/s | 10.6274 KOps/s | |
test_tc_first_layer_tensor | 33.1620μs | 1.5307μs | 653.3135 KOps/s | 566.6876 KOps/s | |
test_tc_first_layer_nontensor | 52.9490μs | 4.7160μs | 212.0463 KOps/s | 209.3287 KOps/s | |
test_tc_second_layer_tensor | 31.8390μs | 2.8360μs | 352.6042 KOps/s | 301.2706 KOps/s | |
test_tc_second_layer_nontensor | 54.7510μs | 6.0222μs | 166.0515 KOps/s | 162.0209 KOps/s | |
test_unbind | 0.2360s | 15.9162ms | 62.8290 Ops/s | 67.9996 Ops/s | |
test_full_like | 9.4367ms | 8.1525ms | 122.6625 Ops/s | 120.3113 Ops/s | |
test_zeros_like | 4.0443ms | 3.1327ms | 319.2102 Ops/s | 317.0574 Ops/s | |
test_ones_like | 4.1311ms | 3.6484ms | 274.0918 Ops/s | 274.9187 Ops/s | |
test_clone | 6.8566ms | 5.7426ms | 174.1358 Ops/s | 179.0243 Ops/s | |
test_squeeze | 62.7970μs | 12.0796μs | 82.7839 KOps/s | 79.2372 KOps/s | |
test_unsqueeze | 0.1543ms | 92.2018μs | 10.8458 KOps/s | 10.6660 KOps/s | |
test_split | 0.5247ms | 0.1984ms | 5.0393 KOps/s | 4.9853 KOps/s | |
test_permute | 0.2854ms | 0.2068ms | 4.8355 KOps/s | 4.7067 KOps/s | |
test_stack | 32.8539ms | 25.9337ms | 38.5598 Ops/s | 36.9071 Ops/s | |
test_cat | 32.4229ms | 26.2306ms | 38.1234 Ops/s | 37.9093 Ops/s |
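
The Change column is empty in the table above. As a rough reading aid, the sketch below shows how the Ops figure appears to follow from the Mean column (Ops ≈ 1 / Mean, which matches the rows shown) and how a relative change against Repo HEAD could be computed. The helper names `ops_per_second` and `relative_change` are hypothetical and are not part of the benchmark tooling. The formula for the change is an assumption, not necessarily what the CI report uses.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Hypothetical change metric: fractional difference of PR vs. Repo HEAD."""
    return (ops_pr - ops_head) / ops_head


# Values copied from the test_plain_set_nested row of the table above.
mean_s = 21.0067e-6   # Mean: 21.0067 microseconds
ops_pr = 47.6038e3    # Ops: 47.6038 KOps/s (this PR)
ops_head = 49.1444e3  # Ops on Repo HEAD: 49.1444 KOps/s

implied_ops = ops_per_second(mean_s)
change = relative_change(ops_pr, ops_head)

print(f"implied ops: {implied_ops:.1f} ops/s")
print(f"relative change vs HEAD: {change:+.2%}")
```

A difference of a few percent in either direction is common run-to-run noise for microbenchmarks of this size, so single rows are best read alongside their neighbours rather than in isolation.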

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 36.6500μs | 11.4404μs | 87.4099 KOps/s | 78.9826 KOps/s | |
test_plain_set_stack_nested | 32.7300μs | 11.6939μs | 85.5147 KOps/s | 77.7886 KOps/s | |
test_plain_set_nested_inplace | 37.9900μs | 12.5089μs | 79.9432 KOps/s | 72.3482 KOps/s | |
test_plain_set_stack_nested_inplace | 39.7910μs | 12.6546μs | 79.0224 KOps/s | 71.6456 KOps/s | |
test_items | 24.3510μs | 2.9374μs | 340.4388 KOps/s | 346.3393 KOps/s | |
test_items_nested | 0.4193ms | 0.3608ms | 2.7719 KOps/s | 2.7988 KOps/s | |
test_items_nested_locked | 0.4193ms | 0.3634ms | 2.7519 KOps/s | 2.7802 KOps/s | |
test_items_nested_leaf | 86.6110μs | 59.0272μs | 16.9413 KOps/s | 17.2488 KOps/s | |
test_items_stack_nested | 0.4259ms | 0.3641ms | 2.7469 KOps/s | 2.7453 KOps/s | |
test_items_stack_nested_leaf | 0.1107ms | 60.3563μs | 16.5683 KOps/s | 16.9713 KOps/s | |
test_items_stack_nested_locked | 0.3998ms | 0.3680ms | 2.7176 KOps/s | 2.7642 KOps/s | |
test_keys | 30.9610μs | 3.4907μs | 286.4785 KOps/s | 289.8298 KOps/s | |
test_keys_nested | 0.1279ms | 82.3020μs | 12.1504 KOps/s | 12.2379 KOps/s | |
test_keys_nested_locked | 0.6954ms | 88.1450μs | 11.3449 KOps/s | 11.3893 KOps/s | |
test_keys_nested_leaf | 0.1162ms | 73.5933μs | 13.5882 KOps/s | 13.7661 KOps/s | |
test_keys_stack_nested | 0.1220ms | 82.4269μs | 12.1320 KOps/s | 12.1476 KOps/s | |
test_keys_stack_nested_leaf | 0.1206ms | 73.6395μs | 13.5797 KOps/s | 13.5921 KOps/s | |
test_keys_stack_nested_locked | 0.1471ms | 88.4517μs | 11.3056 KOps/s | 11.3171 KOps/s | |
test_values | 5.8333μs | 0.8494μs | 1.1773 MOps/s | 1.1771 MOps/s | |
test_values_nested | 92.2310μs | 35.1405μs | 28.4572 KOps/s | 29.0452 KOps/s | |
test_values_nested_locked | 64.5800μs | 36.3778μs | 27.4893 KOps/s | 27.8336 KOps/s | |
test_values_nested_leaf | 61.4210μs | 39.7204μs | 25.1760 KOps/s | 25.1896 KOps/s | |
test_values_stack_nested | 63.1800μs | 35.0913μs | 28.4971 KOps/s | 28.5492 KOps/s | |
test_values_stack_nested_leaf | 0.1124ms | 39.9315μs | 25.0429 KOps/s | 24.9816 KOps/s | |
test_values_stack_nested_locked | 64.0910μs | 36.7566μs | 27.2060 KOps/s | 27.3210 KOps/s | |
test_membership | 1.8420μs | 0.5035μs | 1.9860 MOps/s | 1.9558 MOps/s | |
test_membership_nested | 30.3600μs | 2.0988μs | 476.4521 KOps/s | 494.3690 KOps/s | |
test_membership_nested_leaf | 20.2805μs | 2.0541μs | 486.8200 KOps/s | 484.1413 KOps/s | |
test_membership_stacked_nested | 30.4800μs | 2.0989μs | 476.4345 KOps/s | 476.0627 KOps/s | |
test_membership_stacked_nested_leaf | 34.7500μs | 2.0689μs | 483.3583 KOps/s | 482.6695 KOps/s | |
test_membership_nested_last | 39.9400μs | 3.1606μs | 316.3961 KOps/s | 320.5465 KOps/s | |
test_membership_nested_leaf_last | 37.7500μs | 3.1619μs | 316.2680 KOps/s | 322.5746 KOps/s | |
test_membership_stacked_nested_last | 62.9910μs | 8.1878μs | 122.1322 KOps/s | 323.9833 KOps/s | |
test_membership_stacked_nested_leaf_last | 52.3100μs | 8.1870μs | 122.1446 KOps/s | 323.1845 KOps/s | |
test_nested_getleaf | 39.4010μs | 6.1191μs | 163.4240 KOps/s | 160.0595 KOps/s | |
test_nested_get | 34.4900μs | 5.8834μs | 169.9684 KOps/s | 172.1625 KOps/s | |
test_stacked_getleaf | 36.2800μs | 6.1410μs | 162.8386 KOps/s | 162.7804 KOps/s | |
test_stacked_get | 28.7210μs | 5.8367μs | 171.3292 KOps/s | 169.2304 KOps/s | |
test_nested_getitemleaf | 27.7610μs | 6.3269μs | 158.0561 KOps/s | 161.1561 KOps/s | |
test_nested_getitem | 29.7500μs | 5.9595μs | 167.7998 KOps/s | 169.7736 KOps/s | |
test_stacked_getitemleaf | 41.1100μs | 6.2214μs | 160.7358 KOps/s | 159.5481 KOps/s | |
test_stacked_getitem | 26.3410μs | 5.9010μs | 169.4628 KOps/s | 168.3725 KOps/s | |
test_lock_nested | 9.0866ms | 0.3937ms | 2.5400 KOps/s | 2.5768 KOps/s | |
test_lock_stack_nested | 0.4238ms | 0.3480ms | 2.8739 KOps/s | 2.8447 KOps/s | |
test_unlock_nested | 0.6563ms | 0.3225ms | 3.1010 KOps/s | 3.0828 KOps/s | |
test_unlock_stack_nested | 0.3510ms | 0.2847ms | 3.5121 KOps/s | 3.4159 KOps/s | |
test_flatten_speed | 0.1427ms | 75.7360μs | 13.2038 KOps/s | 13.0860 KOps/s | |
test_unflatten_speed | 0.3857ms | 0.3211ms | 3.1146 KOps/s | 3.1470 KOps/s | |
test_common_ops | 1.7208ms | 0.6004ms | 1.6655 KOps/s | 1.5534 KOps/s | |
test_creation | 37.4610μs | 1.7390μs | 575.0508 KOps/s | 568.1263 KOps/s | |
test_creation_empty | 1.3271ms | 6.9320μs | 144.2588 KOps/s | 110.6424 KOps/s | |
test_creation_nested_1 | 40.5600μs | 8.6175μs | 116.0436 KOps/s | 93.0814 KOps/s | |
test_creation_nested_2 | 45.8500μs | 11.3869μs | 87.8204 KOps/s | 74.6889 KOps/s | |
test_clone | 79.2910μs | 11.6558μs | 85.7942 KOps/s | 86.6154 KOps/s | |
test_getitem[int] | 1.0596ms | 11.3207μs | 88.3339 KOps/s | 89.0858 KOps/s | |
test_getitem[slice_int] | 0.1153ms | 22.2727μs | 44.8981 KOps/s | 45.7757 KOps/s | |
test_getitem[range] | 0.1312ms | 39.7102μs | 25.1824 KOps/s | 25.3006 KOps/s | |
test_getitem[tuple] | 0.1103ms | 19.1629μs | 52.1842 KOps/s | 53.5852 KOps/s | |
test_getitem[list] | 0.2310ms | 34.7596μs | 28.7690 KOps/s | 28.5719 KOps/s | |
test_setitem_dim[int] | 28.4600μs | 19.7241μs | 50.6993 KOps/s | 49.1318 KOps/s | |
test_setitem_dim[slice_int] | 61.9510μs | 39.5091μs | 25.3106 KOps/s | 24.8727 KOps/s | |
test_setitem_dim[range] | 87.8610μs | 53.6630μs | 18.6348 KOps/s | 18.1803 KOps/s | |
test_setitem_dim[tuple] | 65.2710μs | 34.5249μs | 28.9646 KOps/s | 29.4994 KOps/s | |
test_setitem | 75.6710μs | 15.3010μs | 65.3551 KOps/s | 61.1852 KOps/s | |
test_set | 89.9010μs | 15.1823μs | 65.8662 KOps/s | 62.6706 KOps/s | |
test_set_shared | 1.8910ms | 0.1516ms | 6.5983 KOps/s | 6.5886 KOps/s | |
test_update | 0.2453ms | 16.9731μs | 58.9166 KOps/s | 52.2546 KOps/s | |
test_update_nested | 91.4710μs | 23.1580μs | 43.1816 KOps/s | 40.0535 KOps/s | |
test_update__nested | 1.1432ms | 27.0644μs | 36.9490 KOps/s | 37.7265 KOps/s | |
test_set_nested | 89.0010μs | 16.2728μs | 61.4521 KOps/s | 57.3014 KOps/s | |
test_set_nested_new | 94.0210μs | 18.4492μs | 54.2030 KOps/s | 50.9368 KOps/s | |
test_select | 88.4810μs | 30.2565μs | 33.0507 KOps/s | 31.3806 KOps/s | |
test_select_nested | 71.5510μs | 44.1152μs | 22.6679 KOps/s | 22.5355 KOps/s | |
test_exclude_nested | 88.5500μs | 63.3344μs | 15.7892 KOps/s | 15.8475 KOps/s | |
test_empty[True] | 0.3462ms | 0.2882ms | 3.4701 KOps/s | 3.4733 KOps/s | |
test_empty[False] | 3.2410μs | 0.8344μs | 1.1985 MOps/s | 1.2060 MOps/s | |
test_to | 88.2410μs | 58.4890μs | 17.0972 KOps/s | 16.7079 KOps/s | |
test_to_nonblocking | 85.2010μs | 49.5970μs | 20.1625 KOps/s | 20.4135 KOps/s | |
test_unbind_speed | 0.2870ms | 0.2449ms | 4.0834 KOps/s | 4.1010 KOps/s | |
test_unbind_speed_stack0 | 0.3038ms | 0.2422ms | 4.1294 KOps/s | 4.0710 KOps/s | |
test_unbind_speed_stack1 | 92.7437ms | 0.6725ms | 1.4869 KOps/s | 1.4650 KOps/s | |
test_split | 93.2377ms | 1.6450ms | 607.8902 Ops/s | 613.1543 Ops/s | |
test_chunk | 95.4767ms | 1.6513ms | 605.5670 Ops/s | 611.2966 Ops/s | |
test_consolidate[False-None] | 95.6935ms | 2.9776ms | 335.8402 Ops/s | 335.5113 Ops/s | |
test_consolidate[default-None] | 2.1654ms | 1.7592ms | 568.4290 Ops/s | 576.0310 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9233ms | 1.8020ms | 554.9488 Ops/s | 566.9519 Ops/s | |
test_consolidate_njt[False-None] | 6.8773ms | 6.7334ms | 148.5129 Ops/s | 150.9138 Ops/s | |
test_to[False-False-None] | 1.9234ms | 1.7750ms | 563.3651 Ops/s | 568.9246 Ops/s | |
test_to[True-False-None] | 1.6591ms | 1.4143ms | 707.0568 Ops/s | 735.6691 Ops/s | |
test_to[within-False-None] | 4.5085ms | 4.2643ms | 234.5055 Ops/s | 237.1416 Ops/s | |
test_to[True-default-None] | 5.9233ms | 5.4564ms | 183.2723 Ops/s | 185.3888 Ops/s | |
test_to_njt[False-False-None] | 7.4990ms | 7.0761ms | 141.3206 Ops/s | 144.0432 Ops/s | |
test_to_njt[True-False-None] | 5.9018ms | 5.6029ms | 178.4794 Ops/s | 180.9249 Ops/s | |
test_to_njt[within-False-None] | 12.8590ms | 12.4435ms | 80.3634 Ops/s | 82.4939 Ops/s | |
test_creation[device0] | 0.6274ms | 80.5117μs | 12.4206 KOps/s | 12.4550 KOps/s | |
test_creation_from_tensor | 0.7256ms | 83.9303μs | 11.9146 KOps/s | 11.7539 KOps/s | |
test_add_one[memmap_tensor0] | 0.2362ms | 7.3928μs | 135.2664 KOps/s | 138.8046 KOps/s | |
test_contiguous[memmap_tensor0] | 22.0817μs | 0.4201μs | 2.3806 MOps/s | 2.3796 MOps/s | |
test_stack[memmap_tensor0] | 33.9110μs | 4.7827μs | 209.0852 KOps/s | 223.7509 KOps/s | |
test_memmaptd_index | 1.7405ms | 0.2697ms | 3.7074 KOps/s | 3.8574 KOps/s | |
test_memmaptd_index_astensor | 0.9506ms | 0.3328ms | 3.0050 KOps/s | 3.1160 KOps/s | |
test_memmaptd_index_op | 1.0796ms | 0.6094ms | 1.6409 KOps/s | 1.5913 KOps/s | |
test_serialize_model | 0.1313s | 0.1304s | 7.6694 Ops/s | 7.6068 Ops/s | |
test_serialize_model_pickle | 1.3504s | 1.2124s | 0.8248 Ops/s | 0.8187 Ops/s | |
test_serialize_weights | 0.1322s | 0.1302s | 7.6831 Ops/s | 7.6809 Ops/s | |
test_serialize_weights_returnearly | 0.3073s | 54.0323ms | 18.5074 Ops/s | 22.7403 Ops/s | |
test_serialize_weights_pickle | 1.3533s | 1.2108s | 0.8259 Ops/s | 0.8232 Ops/s | |
test_reshape_pytree | 55.2510μs | 23.0505μs | 43.3830 KOps/s | 43.1665 KOps/s | |
test_reshape_td | 57.1210μs | 27.4685μs | 36.4054 KOps/s | 34.1811 KOps/s | |
test_view_pytree | 0.4414ms | 22.6904μs | 44.0715 KOps/s | 44.0984 KOps/s | |
test_view_td | 80.9410μs | 30.2264μs | 33.0836 KOps/s | 30.1822 KOps/s | |
test_unbind_pytree | 61.3810μs | 29.0855μs | 34.3814 KOps/s | 34.7172 KOps/s | |
test_unbind_td | 0.7717ms | 37.6709μs | 26.5457 KOps/s | 26.7091 KOps/s | |
test_split_pytree | 63.4600μs | 30.6313μs | 32.6463 KOps/s | 32.3334 KOps/s | |
test_split_td | 0.9525ms | 40.6013μs | 24.6298 KOps/s | 24.9644 KOps/s | |
test_add_pytree | 73.7310μs | 36.8867μs | 27.1100 KOps/s | 27.8667 KOps/s | |
test_add_td | 0.4700ms | 49.4831μs | 20.2089 KOps/s | 19.6854 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1821ms | 0.1216ms | 8.2259 KOps/s | 8.0482 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2256ms | 0.1303ms | 7.6774 KOps/s | 7.6204 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1324ms | 96.7393μs | 10.3371 KOps/s | 10.3396 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.5554ms | 0.1536ms | 6.5090 KOps/s | 6.5557 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 48.6800μs | 23.1231μs | 43.2469 KOps/s | 43.7455 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.4462ms | 29.5046μs | 33.8930 KOps/s | 33.7529 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4573ms | 64.7393μs | 15.4466 KOps/s | 15.2059 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4310ms | 49.1680μs | 20.3384 KOps/s | 20.3473 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1960ms | 0.1435ms | 6.9689 KOps/s | 6.9541 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6447ms | 0.2165ms | 4.6185 KOps/s | 4.6145 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.5985ms | 99.1273μs | 10.0880 KOps/s | 10.0752 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4470ms | 54.5602μs | 18.3284 KOps/s | 18.0567 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1764ms | 0.1374ms | 7.2775 KOps/s | 7.3353 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6238ms | 0.4979ms | 2.0083 KOps/s | 2.0287 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6887ms | 0.2593ms | 3.8559 KOps/s | 3.8697 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1840ms | 0.1446ms | 6.9140 KOps/s | 6.9673 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.5027ms | 65.0252μs | 15.3787 KOps/s | 15.4380 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1458ms | 0.1003ms | 9.9736 KOps/s | 10.0199 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.8156ms | 0.4142ms | 2.4143 KOps/s | 2.4455 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1768ms | 0.1389ms | 7.1991 KOps/s | 7.3799 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 61.2100μs | 19.6025μs | 51.0138 KOps/s | 50.8451 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4221ms | 31.7821μs | 31.4642 KOps/s | 32.1228 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.4746ms | 69.8544μs | 14.3155 KOps/s | 14.1834 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4731ms | 51.1286μs | 19.5585 KOps/s | 19.3110 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6609ms | 0.3983ms | 2.5109 KOps/s | 2.1964 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.1113ms | 2.7179ms | 367.9368 Ops/s | 373.8932 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6287ms | 0.3906ms | 2.5605 KOps/s | 2.1618 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8601ms | 2.7700ms | 361.0144 Ops/s | 360.2360 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5858ms | 0.1213ms | 8.2427 KOps/s | 8.2702 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5754ms | 83.3100μs | 12.0034 KOps/s | 11.7141 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.4799ms | 0.1140ms | 8.7750 KOps/s | 9.2974 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1853ms | 70.5735μs | 14.1696 KOps/s | 13.6296 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2200ms | 0.1167ms | 8.5686 KOps/s | 9.2118 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1291ms | 74.3433μs | 13.4511 KOps/s | 14.1770 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1491ms | 0.1026ms | 9.7438 KOps/s | 9.6060 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1421ms | 18.1762μs | 55.0170 KOps/s | 55.7583 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1834ms | 98.9289μs | 10.1083 KOps/s | 9.8647 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 44.2600μs | 16.3775μs | 61.0592 KOps/s | 60.9724 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1616ms | 0.1028ms | 9.7268 KOps/s | 9.8214 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 54.3610μs | 16.3570μs | 61.1359 KOps/s | 61.6564 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1506ms | 0.1068ms | 9.3649 KOps/s | 9.3568 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5918ms | 17.8918μs | 55.8914 KOps/s | 57.3958 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1420ms | 98.2904μs | 10.1739 KOps/s | 9.7930 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 44.3400μs | 16.6807μs | 59.9496 KOps/s | 62.2486 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2107ms | 99.5206μs | 10.0482 KOps/s | 9.8450 KOps/s | |
test_compile_indexing[int-pytree-eager] | 57.7510μs | 16.3054μs | 61.3293 KOps/s | 61.3078 KOps/s | |
test_mod_add[eager] | 79.1510μs | 38.4803μs | 25.9873 KOps/s | 25.5507 KOps/s | |
test_mod_add[compile] | 0.3524ms | 80.9162μs | 12.3585 KOps/s | 12.2329 KOps/s | |
test_mod_add[compile-overhead] | 0.3301ms | 0.1689ms | 5.9216 KOps/s | 5.6964 KOps/s | |
test_mod_wrap[eager] | 0.3365ms | 0.2525ms | 3.9602 KOps/s | 3.8726 KOps/s | |
test_mod_wrap[compile] | 0.3789ms | 0.2982ms | 3.3534 KOps/s | 3.2666 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1289ms | 3.7116ms | 269.4254 Ops/s | 269.8335 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5266ms | 1.4108ms | 708.8359 Ops/s | 681.3686 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3987ms | 1.3077ms | 764.7233 Ops/s | 714.4238 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3882ms | 0.9346ms | 1.0700 KOps/s | 948.7086 Ops/s | |
test_seq_add[eager] | 0.2094ms | 0.1150ms | 8.6964 KOps/s | 8.1673 KOps/s | |
test_seq_add[compile] | 0.1405ms | 90.0658μs | 11.1030 KOps/s | 11.1704 KOps/s | |
test_seq_add[compile-overhead] | 0.1881ms | 0.1305ms | 7.6653 KOps/s | 7.5727 KOps/s | |
test_seq_wrap[eager] | 0.4932ms | 0.4201ms | 2.3803 KOps/s | 2.2842 KOps/s | |
test_seq_wrap[compile] | 0.3787ms | 0.3063ms | 3.2650 KOps/s | 3.1283 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2992ms | 0.2270ms | 4.4056 KOps/s | 4.3973 KOps/s | |
test_func_call_runtime[False-eager] | 0.9735ms | 0.7792ms | 1.2834 KOps/s | 1.3171 KOps/s | |
test_func_call_runtime[False-compile] | 1.0336ms | 0.7622ms | 1.3120 KOps/s | 1.3157 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4137ms | 0.3666ms | 2.7281 KOps/s | 2.7029 KOps/s | |
test_func_call_runtime[True-eager] | 1.0080ms | 0.9218ms | 1.0848 KOps/s | 1.0838 KOps/s | |
test_func_call_runtime[True-compile] | 0.8719ms | 0.7979ms | 1.2533 KOps/s | 1.2811 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4437ms | 0.3883ms | 2.5752 KOps/s | 2.5659 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8060ms | 0.7475ms | 1.3377 KOps/s | 1.3272 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8339ms | 0.7655ms | 1.3063 KOps/s | 1.3081 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4189ms | 0.3700ms | 2.7027 KOps/s | 2.6860 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1269ms | 1.0261ms | 974.5389 Ops/s | 972.0207 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8762ms | 0.8119ms | 1.2317 KOps/s | 1.2304 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4649ms | 0.4132ms | 2.4202 KOps/s | 2.3947 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5678ms | 2.1054ms | 474.9709 Ops/s | 471.8077 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9774ms | 0.8258ms | 1.2109 KOps/s | 1.2117 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5022ms | 0.4193ms | 2.3848 KOps/s | 2.3862 KOps/s | |
test_distributed | 5.3899ms | 0.1816ms | 5.5069 KOps/s | 8.6124 KOps/s | |
test_tdmodule | 55.0500μs | 18.7521μs | 53.3273 KOps/s | 50.2035 KOps/s | |
test_tdmodule_dispatch | 82.1810μs | 33.9557μs | 29.4502 KOps/s | 27.5687 KOps/s | |
test_tdseq | 39.3500μs | 19.9130μs | 50.2185 KOps/s | 49.7618 KOps/s | |
test_tdseq_dispatch | 58.2300μs | 37.0736μs | 26.9734 KOps/s | 26.0288 KOps/s | |
test_instantiation_functorch | 2.4498ms | 1.6139ms | 619.6347 Ops/s | 628.3074 Ops/s | |
test_exec_functorch | 0.2332ms | 0.1507ms | 6.6339 KOps/s | 6.7353 KOps/s | |
test_exec_functional_call | 0.5649ms | 0.1439ms | 6.9494 KOps/s | 6.9883 KOps/s | |
test_exec_td_decorator | 0.5886ms | 0.1924ms | 5.1968 KOps/s | 5.2271 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1244ms | 0.6922ms | 1.4447 KOps/s | 1.4400 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1124ms | 0.6891ms | 1.4512 KOps/s | 1.4356 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.0441ms | 0.6058ms | 1.6507 KOps/s | 1.6540 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0179ms | 0.6053ms | 1.6520 KOps/s | 1.6531 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.9994ms | 19.4714ms | 51.3575 Ops/s | 51.5289 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.8563ms | 19.4660ms | 51.3717 Ops/s | 51.5031 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.8990ms | 19.3698ms | 51.6268 Ops/s | 52.0121 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.7196ms | 19.3270ms | 51.7411 Ops/s | 51.7585 Ops/s | |
test_to_module_speed[True] | 1.0605ms | 0.9802ms | 1.0202 KOps/s | 1.0244 KOps/s | |
test_to_module_speed[False] | 1.5015ms | 0.9463ms | 1.0567 KOps/s | 1.0386 KOps/s | |
test_tc_init | 68.8510μs | 37.5688μs | 26.6179 KOps/s | 26.0931 KOps/s | |
test_tc_init_nested | 0.4632ms | 75.9536μs | 13.1659 KOps/s | 13.2634 KOps/s | |
test_tc_first_layer_tensor | 23.9900μs | 0.7958μs | 1.2566 MOps/s | 1.4241 MOps/s | |
test_tc_first_layer_nontensor | 26.6500μs | 2.3304μs | 429.1041 KOps/s | 423.6641 KOps/s | |
test_tc_second_layer_tensor | 0.3938ms | 1.5267μs | 654.9986 KOps/s | 704.1222 KOps/s | |
test_tc_second_layer_nontensor | 25.5300μs | 3.0745μs | 325.2565 KOps/s | 329.0563 KOps/s | |
test_unbind | 0.2291s | 10.2543ms | 97.5201 Ops/s | 142.8736 Ops/s | |
test_full_like | 9.3780ms | 9.1231ms | 109.6113 Ops/s | 107.8687 Ops/s | |
test_zeros_like | 4.9502ms | 4.3172ms | 231.6324 Ops/s | 235.7975 Ops/s | |
test_ones_like | 4.9357ms | 4.3248ms | 231.2221 Ops/s | 235.9793 Ops/s | |
test_clone | 6.7546ms | 6.3860ms | 156.5915 Ops/s | 109.5051 Ops/s | |
test_squeeze | 62.2000μs | 9.8514μs | 101.5080 KOps/s | 105.5551 KOps/s | |
test_unsqueeze | 0.5257ms | 76.7039μs | 13.0372 KOps/s | 13.6576 KOps/s | |
test_split | 0.2735ms | 0.1640ms | 6.0976 KOps/s | 5.9747 KOps/s | |
test_permute | 0.6248ms | 0.1887ms | 5.3000 KOps/s | 5.3915 KOps/s | |
test_stack | 51.1787ms | 50.5457ms | 19.7841 Ops/s | 19.7778 Ops/s | |
test_cat | 53.0448ms | 50.8939ms | 19.6487 Ops/s | 19.7066 Ops/s |
vmoens added a commit that referenced this pull request Dec 18, 2024
ghstack-source-id: a8aed1eaefe066dafaa974f5b96190860de2f8f1 Pull Request resolved: #1142
Stack from ghstack (oldest at bottom):