[Performance] Faster clone #1040
Open. vmoens wants to merge 5 commits into gh/vmoens/29/base from gh/vmoens/29/head
This was referenced Oct 14, 2024
vmoens added a commit that referenced this pull request Oct 14, 2024
ghstack-source-id: 6eecbac5e946a4d93d3d6e148e8c18aaa2501b00 Pull Request resolved: #1040
facebook-github-bot added the CLA Signed label Oct 14, 2024 (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request Oct 14, 2024
ghstack-source-id: 3fec3b6ac36b07dee77ecf1189f79b6d620532e1 Pull Request resolved: #1040
vmoens added a commit that referenced this pull request Oct 14, 2024
ghstack-source-id: 71833ef65890ab7c068dca2e1ed2fa5363c488ad Pull Request resolved: #1040
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 69.2890μs | 25.7869μs | 38.7793 KOps/s | 38.1941 KOps/s | |
test_plain_set_stack_nested | 0.1289ms | 26.0733μs | 38.3534 KOps/s | 38.4622 KOps/s | |
test_plain_set_nested_inplace | 68.5880μs | 28.1292μs | 35.5502 KOps/s | 34.8843 KOps/s | |
test_plain_set_stack_nested_inplace | 71.5940μs | 28.1707μs | 35.4979 KOps/s | 35.4986 KOps/s | |
test_items | 28.9840μs | 4.1711μs | 239.7425 KOps/s | 238.0274 KOps/s | |
test_items_nested | 0.4657ms | 0.3849ms | 2.5978 KOps/s | 2.6160 KOps/s | |
test_items_nested_locked | 0.4689ms | 0.3848ms | 2.5984 KOps/s | 2.6169 KOps/s | |
test_items_nested_leaf | 0.1420ms | 80.5680μs | 12.4119 KOps/s | 12.1863 KOps/s | |
test_items_stack_nested | 0.7067ms | 0.3953ms | 2.5300 KOps/s | 2.5880 KOps/s | |
test_items_stack_nested_leaf | 0.2818ms | 85.4766μs | 11.6991 KOps/s | 11.8848 KOps/s | |
test_items_stack_nested_locked | 0.6304ms | 0.3856ms | 2.5933 KOps/s | 2.5832 KOps/s | |
test_keys | 19.8570μs | 3.5769μs | 279.5696 KOps/s | 291.5474 KOps/s | |
test_keys_nested | 0.2197ms | 0.1321ms | 7.5706 KOps/s | 7.4204 KOps/s | |
test_keys_nested_locked | 0.7327ms | 0.1373ms | 7.2809 KOps/s | 7.1266 KOps/s | |
test_keys_nested_leaf | 0.2174ms | 0.1160ms | 8.6193 KOps/s | 8.5042 KOps/s | |
test_keys_stack_nested | 0.2245ms | 0.1310ms | 7.6328 KOps/s | 7.6033 KOps/s | |
test_keys_stack_nested_leaf | 0.3499ms | 0.1145ms | 8.7316 KOps/s | 8.6760 KOps/s | |
test_keys_stack_nested_locked | 0.2644ms | 0.1362ms | 7.3407 KOps/s | 7.2505 KOps/s | |
test_values | 10.7280μs | 1.0517μs | 950.8277 KOps/s | 954.4168 KOps/s | |
test_values_nested | 0.1690ms | 93.7003μs | 10.6723 KOps/s | 10.5723 KOps/s | |
test_values_nested_locked | 0.1532ms | 93.3841μs | 10.7085 KOps/s | 10.4487 KOps/s | |
test_values_nested_leaf | 0.1484ms | 77.2129μs | 12.9512 KOps/s | 12.5887 KOps/s | |
test_values_stack_nested | 0.3070ms | 94.2557μs | 10.6094 KOps/s | 10.3395 KOps/s | |
test_values_stack_nested_leaf | 0.1719ms | 77.8732μs | 12.8414 KOps/s | 12.6447 KOps/s | |
test_values_stack_nested_locked | 0.1680ms | 94.2609μs | 10.6089 KOps/s | 10.4893 KOps/s | |
test_membership | 6.2717μs | 0.7165μs | 1.3958 MOps/s | 1.4127 MOps/s | |
test_membership_nested | 41.2570μs | 2.7172μs | 368.0244 KOps/s | 361.0812 KOps/s | |
test_membership_nested_leaf | 98.2630μs | 2.7446μs | 364.3501 KOps/s | 358.2951 KOps/s | |
test_membership_stacked_nested | 30.9580μs | 2.6842μs | 372.5510 KOps/s | 366.1976 KOps/s | |
test_membership_stacked_nested_leaf | 43.7920μs | 2.7060μs | 369.5501 KOps/s | 357.7445 KOps/s | |
test_membership_nested_last | 35.0050μs | 4.1628μs | 240.2215 KOps/s | 239.2007 KOps/s | |
test_membership_nested_leaf_last | 48.1900μs | 4.1683μs | 239.9056 KOps/s | 238.1114 KOps/s | |
test_membership_stacked_nested_last | 31.4690μs | 5.0167μs | 199.3334 KOps/s | 76.2548 KOps/s | |
test_membership_stacked_nested_leaf_last | 45.8460μs | 5.0605μs | 197.6078 KOps/s | 75.8753 KOps/s | |
test_nested_getleaf | 61.1240μs | 10.9011μs | 91.7342 KOps/s | 93.7342 KOps/s | |
test_nested_get | 54.8830μs | 10.5153μs | 95.0999 KOps/s | 98.1153 KOps/s | |
test_stacked_getleaf | 38.1410μs | 10.8283μs | 92.3506 KOps/s | 93.8886 KOps/s | |
test_stacked_get | 35.7870μs | 10.4399μs | 95.7860 KOps/s | 101.4237 KOps/s | |
test_nested_getitemleaf | 42.0390μs | 11.4129μs | 87.6202 KOps/s | 88.8781 KOps/s | |
test_nested_getitem | 42.7500μs | 10.6744μs | 93.6817 KOps/s | 95.3144 KOps/s | |
test_stacked_getitemleaf | 37.3290μs | 11.2793μs | 88.6584 KOps/s | 89.9029 KOps/s | |
test_stacked_getitem | 52.3780μs | 10.6005μs | 94.3355 KOps/s | 96.3238 KOps/s | |
test_lock_nested | 88.8736ms | 0.5960ms | 1.6778 KOps/s | 1.9946 KOps/s | |
test_lock_stack_nested | 0.6997ms | 0.4659ms | 2.1463 KOps/s | 2.2014 KOps/s | |
test_unlock_nested | 0.1094s | 0.5357ms | 1.8666 KOps/s | 2.3629 KOps/s | |
test_unlock_stack_nested | 0.6426ms | 0.3800ms | 2.6316 KOps/s | 2.6914 KOps/s | |
test_flatten_speed | 0.2159ms | 0.1008ms | 9.9226 KOps/s | 9.8008 KOps/s | |
test_unflatten_speed | 0.8974ms | 0.5260ms | 1.9011 KOps/s | 1.8961 KOps/s | |
test_common_ops | 4.5001ms | 1.1707ms | 854.2225 Ops/s | 822.3672 Ops/s | |
test_creation | 35.7860μs | 2.2469μs | 445.0538 KOps/s | 467.4938 KOps/s | |
test_creation_empty | 83.8170μs | 20.7442μs | 48.2062 KOps/s | 48.4069 KOps/s | |
test_creation_nested_1 | 67.4370μs | 23.6103μs | 42.3544 KOps/s | 40.9332 KOps/s | |
test_creation_nested_2 | 81.9930μs | 27.7048μs | 36.0948 KOps/s | 34.5873 KOps/s | |
test_clone | 59.4710μs | 16.9931μs | 58.8474 KOps/s | 56.4298 KOps/s | |
test_getitem[int] | 0.9929ms | 16.6607μs | 60.0215 KOps/s | 59.2432 KOps/s | |
test_getitem[slice_int] | 0.1655ms | 30.3908μs | 32.9047 KOps/s | 31.2526 KOps/s | |
test_getitem[range] | 0.6282ms | 57.6739μs | 17.3389 KOps/s | 17.2275 KOps/s | |
test_getitem[tuple] | 0.1495ms | 24.8673μs | 40.2134 KOps/s | 39.8748 KOps/s | |
test_getitem[list] | 0.6607ms | 52.6619μs | 18.9891 KOps/s | 18.9487 KOps/s | |
test_setitem_dim[int] | 72.0350μs | 32.6555μs | 30.6227 KOps/s | 30.6579 KOps/s | |
test_setitem_dim[slice_int] | 0.1131ms | 60.0821μs | 16.6439 KOps/s | 16.4877 KOps/s | |
test_setitem_dim[range] | 0.1579ms | 83.3302μs | 12.0005 KOps/s | 11.7073 KOps/s | |
test_setitem_dim[tuple] | 0.1020ms | 48.4645μs | 20.6337 KOps/s | 20.4205 KOps/s | |
test_setitem | 0.2896ms | 30.6741μs | 32.6008 KOps/s | 31.0880 KOps/s | |
test_set | 90.1580μs | 30.0167μs | 33.3147 KOps/s | 31.1699 KOps/s | |
test_set_shared | 3.2885ms | 0.2183ms | 4.5807 KOps/s | 4.5827 KOps/s | |
test_update | 0.1517ms | 39.5963μs | 25.2549 KOps/s | 24.0129 KOps/s | |
test_update_nested | 0.1871ms | 50.5647μs | 19.7766 KOps/s | 18.9301 KOps/s | |
test_update__nested | 0.7191ms | 45.5917μs | 21.9338 KOps/s | 22.3579 KOps/s | |
test_set_nested | 0.3237ms | 33.3098μs | 30.0212 KOps/s | 29.4487 KOps/s | |
test_set_nested_new | 0.3274ms | 38.5260μs | 25.9565 KOps/s | 25.1328 KOps/s | |
test_select | 0.1219ms | 57.0016μs | 17.5434 KOps/s | 17.0384 KOps/s | |
test_select_nested | 0.1147ms | 59.8925μs | 16.6966 KOps/s | 16.4894 KOps/s | |
test_exclude_nested | 0.6498ms | 74.8501μs | 13.3600 KOps/s | 13.0447 KOps/s | |
test_empty[True] | 0.5408ms | 0.3504ms | 2.8542 KOps/s | 2.8434 KOps/s | |
test_empty[False] | 18.4375μs | 1.2484μs | 801.0289 KOps/s | 827.1624 KOps/s | |
test_unbind_speed | 0.3839ms | 0.2970ms | 3.3675 KOps/s | 3.3754 KOps/s | |
test_unbind_speed_stack0 | 0.4519ms | 0.2898ms | 3.4504 KOps/s | 3.4838 KOps/s | |
test_unbind_speed_stack1 | 0.1125s | 0.9303ms | 1.0749 KOps/s | 1.3644 KOps/s | |
test_split | 3.2813ms | 2.0000ms | 500.0072 Ops/s | 448.0534 Ops/s | |
test_chunk | 0.1025s | 2.2056ms | 453.3918 Ops/s | 445.4885 Ops/s | |
test_creation[device0] | 0.2040ms | 0.1151ms | 8.6885 KOps/s | 8.5011 KOps/s | |
test_creation_from_tensor | 3.2607ms | 0.1164ms | 8.5903 KOps/s | 8.4540 KOps/s | |
test_add_one[memmap_tensor0] | 0.4741ms | 7.0735μs | 141.3727 KOps/s | 129.4769 KOps/s | |
test_contiguous[memmap_tensor0] | 24.0540μs | 1.8974μs | 527.0373 KOps/s | 518.9374 KOps/s | |
test_stack[memmap_tensor0] | 83.9170μs | 5.6418μs | 177.2483 KOps/s | 174.6839 KOps/s | |
test_memmaptd_index | 1.2059ms | 0.4135ms | 2.4182 KOps/s | 2.4245 KOps/s | |
test_memmaptd_index_astensor | 0.9889ms | 0.5085ms | 1.9664 KOps/s | 1.9360 KOps/s | |
test_memmaptd_index_op | 1.7647ms | 1.0622ms | 941.4631 Ops/s | 920.8502 Ops/s | |
test_serialize_model | 0.1351s | 0.1248s | 8.0135 Ops/s | 8.5637 Ops/s | |
test_serialize_model_pickle | 0.4324s | 0.3929s | 2.5450 Ops/s | 2.5104 Ops/s | |
test_serialize_weights | 0.1318s | 0.1205s | 8.2978 Ops/s | 8.0192 Ops/s | |
test_serialize_weights_returnearly | 0.2567s | 0.1750s | 5.7146 Ops/s | 6.1465 Ops/s | |
test_serialize_weights_pickle | 0.4554s | 0.4055s | 2.4661 Ops/s | 2.4289 Ops/s | |
test_serialize_weights_filesystem | 0.1538s | 0.1416s | 7.0634 Ops/s | 7.1450 Ops/s | |
test_serialize_model_filesystem | 0.1581s | 0.1510s | 6.6225 Ops/s | 6.5821 Ops/s | |
test_reshape_pytree | 83.8070μs | 38.0874μs | 26.2554 KOps/s | 25.7606 KOps/s | |
test_reshape_td | 91.0600μs | 45.1273μs | 22.1595 KOps/s | 21.0053 KOps/s | |
test_view_pytree | 0.1006ms | 38.3607μs | 26.0683 KOps/s | 25.9838 KOps/s | |
test_view_td | 0.1060ms | 51.8040μs | 19.3035 KOps/s | 19.0150 KOps/s | |
test_unbind_pytree | 75.7920μs | 35.4909μs | 28.1762 KOps/s | 28.0567 KOps/s | |
test_unbind_td | 0.3668ms | 45.1200μs | 22.1631 KOps/s | 22.4875 KOps/s | |
test_split_pytree | 85.5400μs | 37.2904μs | 26.8166 KOps/s | 26.6019 KOps/s | |
test_split_td | 0.5266ms | 57.5794μs | 17.3673 KOps/s | 17.3779 KOps/s | |
test_add_pytree | 96.0190μs | 43.6047μs | 22.9333 KOps/s | 22.2177 KOps/s | |
test_add_td | 0.1695ms | 85.8646μs | 11.6462 KOps/s | 10.8455 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1290ms | 58.6062μs | 17.0630 KOps/s | 17.1953 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3960ms | 0.1980ms | 5.0516 KOps/s | 5.0462 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1286ms | 57.1068μs | 17.5110 KOps/s | 17.6727 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3144ms | 0.1379ms | 7.2503 KOps/s | 7.1254 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 71.3940μs | 23.2920μs | 42.9332 KOps/s | 44.2836 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1478ms | 74.6866μs | 13.3893 KOps/s | 13.0550 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1349ms | 75.2887μs | 13.2822 KOps/s | 13.0446 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1358ms | 68.3377μs | 14.6332 KOps/s | 14.0820 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3044ms | 0.1818ms | 5.4993 KOps/s | 5.4429 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 1.4672ms | 0.2422ms | 4.1287 KOps/s | 4.0823 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1117ms | 49.3149μs | 20.2779 KOps/s | 20.8496 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4650ms | 79.6544μs | 12.5542 KOps/s | 11.8601 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2799ms | 0.1744ms | 5.7324 KOps/s | 5.6395 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5176ms | 0.2821ms | 3.5449 KOps/s | 3.4597 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4822ms | 0.2753ms | 3.6324 KOps/s | 3.6002 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5089ms | 0.1872ms | 5.3427 KOps/s | 5.4754 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2020ms | 75.3398μs | 13.2732 KOps/s | 13.1895 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1388ms | 48.5738μs | 20.5872 KOps/s | 20.6688 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3808ms | 0.2274ms | 4.3983 KOps/s | 4.3105 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3794ms | 0.1715ms | 5.8296 KOps/s | 5.6198 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2446ms | 0.1092ms | 9.1608 KOps/s | 9.0070 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1738ms | 78.5355μs | 12.7331 KOps/s | 12.4012 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1378ms | 76.0635μs | 13.1469 KOps/s | 12.8520 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1681ms | 66.8078μs | 14.9683 KOps/s | 14.1643 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2714ms | 0.1927ms | 5.1905 KOps/s | 5.2148 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8873ms | 1.6751ms | 596.9635 Ops/s | 571.9744 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2892ms | 0.1916ms | 5.2182 KOps/s | 5.1871 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3372ms | 1.0770ms | 928.4779 Ops/s | 904.9282 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5232ms | 0.4143ms | 2.4136 KOps/s | 2.2727 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.1599ms | 4.1068ms | 243.4977 Ops/s | 234.1584 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 78.6870μs | 33.5657μs | 29.7923 KOps/s | 28.8566 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.7245ms | 48.2340μs | 20.7323 KOps/s | 20.2371 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 84.8280μs | 29.2525μs | 34.1851 KOps/s | 32.9953 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 83.6360μs | 27.9640μs | 35.7603 KOps/s | 33.6872 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 87.9840μs | 29.3508μs | 34.0706 KOps/s | 32.9735 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1415ms | 28.0931μs | 35.5959 KOps/s | 34.1323 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1400ms | 71.7819μs | 13.9311 KOps/s | 13.5626 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5580ms | 27.1962μs | 36.7699 KOps/s | 36.1797 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1233ms | 67.9277μs | 14.7215 KOps/s | 14.6971 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 66.0930μs | 23.1179μs | 43.2566 KOps/s | 43.1661 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1355ms | 68.2881μs | 14.6438 KOps/s | 14.5204 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1816ms | 23.1755μs | 43.1491 KOps/s | 42.9198 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1372ms | 71.9380μs | 13.9009 KOps/s | 13.7729 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.7859ms | 27.0446μs | 36.9759 KOps/s | 36.7677 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1920ms | 67.9597μs | 14.7146 KOps/s | 14.6084 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 92.5050μs | 22.5085μs | 44.4276 KOps/s | 43.6118 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1443ms | 67.3176μs | 14.8550 KOps/s | 14.6330 KOps/s | |
test_compile_indexing[int-pytree-eager] | 56.3350μs | 22.6959μs | 44.0609 KOps/s | 43.1405 KOps/s | |
test_mod_add[eager] | 70.6820μs | 26.7297μs | 37.4116 KOps/s | 35.9577 KOps/s | |
test_mod_add[compile] | 85.1690μs | 37.6413μs | 26.5665 KOps/s | 26.0622 KOps/s | |
test_mod_add[compile-overhead] | 96.7810μs | 37.6755μs | 26.5425 KOps/s | 26.6547 KOps/s | |
test_mod_wrap[eager] | 0.3179ms | 0.2081ms | 4.8062 KOps/s | 4.7678 KOps/s | |
test_mod_wrap[compile] | 0.3563ms | 0.2306ms | 4.3368 KOps/s | 4.3110 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3827ms | 0.2296ms | 4.3552 KOps/s | 4.2691 KOps/s | |
test_mod_wrap_and_backward[eager] | 18.3251ms | 11.6997ms | 85.4722 Ops/s | 82.4218 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.7414ms | 11.6520ms | 85.8219 Ops/s | 84.7389 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.2989ms | 11.8545ms | 84.3560 Ops/s | 75.0334 Ops/s | |
test_seq_add[eager] | 0.2197ms | 94.0548μs | 10.6321 KOps/s | 10.1531 KOps/s | |
test_seq_add[compile] | 0.1195ms | 63.8330μs | 15.6659 KOps/s | 15.6793 KOps/s | |
test_seq_add[compile-overhead] | 0.1209ms | 63.5418μs | 15.7377 KOps/s | 15.7536 KOps/s | |
test_seq_wrap[eager] | 0.7301ms | 0.3915ms | 2.5541 KOps/s | 2.4466 KOps/s | |
test_seq_wrap[compile] | 0.4909ms | 0.2710ms | 3.6894 KOps/s | 3.6267 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3662ms | 0.2676ms | 3.7369 KOps/s | 3.5771 KOps/s | |
test_func_call_runtime[False-eager] | 0.6723ms | 0.5175ms | 1.9322 KOps/s | 1.8526 KOps/s | |
test_func_call_runtime[False-compile] | 0.6339ms | 0.4960ms | 2.0159 KOps/s | 1.9879 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 1.0460ms | 0.4986ms | 2.0057 KOps/s | 1.9705 KOps/s | |
test_func_call_runtime[True-eager] | 0.9430ms | 0.7296ms | 1.3705 KOps/s | 1.3189 KOps/s | |
test_func_call_runtime[True-compile] | 0.6001ms | 0.5039ms | 1.9846 KOps/s | 1.9034 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7738ms | 0.5063ms | 1.9752 KOps/s | 1.9179 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7254ms | 0.5149ms | 1.9422 KOps/s | 1.9017 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9302ms | 0.4958ms | 2.0169 KOps/s | 1.9779 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9188ms | 0.4955ms | 2.0182 KOps/s | 1.9753 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2164ms | 0.8741ms | 1.1441 KOps/s | 1.0963 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1599ms | 0.7291ms | 1.3715 KOps/s | 1.3197 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1813ms | 0.7312ms | 1.3677 KOps/s | 1.3255 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.3936ms | 1.8723ms | 534.0915 Ops/s | 512.1015 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 3.4336ms | 1.9577ms | 510.7938 Ops/s | 499.6178 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 3.3404ms | 1.9524ms | 512.1930 Ops/s | 492.1656 Ops/s | |
test_distributed | 0.3029ms | 0.1266ms | 7.9017 KOps/s | 7.5913 KOps/s | |
test_tdmodule | 98.5340μs | 19.2525μs | 51.9414 KOps/s | 48.6060 KOps/s | |
test_tdmodule_dispatch | 65.8230μs | 38.4040μs | 26.0389 KOps/s | 25.1138 KOps/s | |
test_tdseq | 49.0020μs | 22.3377μs | 44.7673 KOps/s | 43.3337 KOps/s | |
test_tdseq_dispatch | 74.6900μs | 44.3183μs | 22.5640 KOps/s | 21.9582 KOps/s | |
test_instantiation_functorch | 2.1361ms | 1.5350ms | 651.4693 Ops/s | 582.6844 Ops/s | |
test_exec_functorch | 0.3295ms | 0.1824ms | 5.4823 KOps/s | 5.3086 KOps/s | |
test_exec_functional_call | 0.3120ms | 0.1716ms | 5.8288 KOps/s | 5.8008 KOps/s | |
test_exec_td_decorator | 0.4782ms | 0.2328ms | 4.2962 KOps/s | 4.2328 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0297ms | 0.6635ms | 1.5070 KOps/s | 1.4996 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8457ms | 0.6333ms | 1.5791 KOps/s | 1.5058 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8092ms | 0.5194ms | 1.9253 KOps/s | 1.8408 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7022ms | 0.5158ms | 1.9389 KOps/s | 1.7222 KOps/s | |
test_to_module_speed[True] | 2.2988ms | 1.4231ms | 702.6701 Ops/s | 685.8409 Ops/s | |
test_to_module_speed[False] | 1.7128ms | 1.3816ms | 723.8200 Ops/s | 714.2990 Ops/s | |
test_tc_init | 0.1249ms | 47.4474μs | 21.0760 KOps/s | 20.4833 KOps/s | |
test_tc_init_nested | 0.2054ms | 94.8297μs | 10.5452 KOps/s | 10.2663 KOps/s | |
test_tc_first_layer_tensor | 24.0050μs | 1.6030μs | 623.8207 KOps/s | 647.5391 KOps/s | |
test_tc_first_layer_nontensor | 24.2950μs | 4.7150μs | 212.0886 KOps/s | 201.3532 KOps/s | |
test_tc_second_layer_tensor | 28.6940μs | 2.9248μs | 341.9068 KOps/s | 340.9035 KOps/s | |
test_tc_second_layer_nontensor | 44.5550μs | 5.9719μs | 167.4504 KOps/s | 162.6014 KOps/s | |
test_unbind | 0.5180s | 13.6787ms | 73.1064 Ops/s | 132.9157 Ops/s | |
test_full_like | 13.8388ms | 8.6218ms | 115.9851 Ops/s | 74.1177 Ops/s | |
test_zeros_like | 4.8817ms | 3.0767ms | 325.0257 Ops/s | 118.9574 Ops/s | |
test_ones_like | 12.6680ms | 6.3519ms | 157.4337 Ops/s | 111.7366 Ops/s | |
test_clone | 16.2253ms | 9.0088ms | 111.0024 Ops/s | 90.1455 Ops/s | |
test_squeeze | 74.0480μs | 12.3087μs | 81.2432 KOps/s | 81.8361 KOps/s | |
test_unsqueeze | 0.1681ms | 91.5289μs | 10.9255 KOps/s | 11.1825 KOps/s | |
test_split | 0.5531ms | 0.1927ms | 5.1889 KOps/s | 5.2204 KOps/s | |
test_permute | 0.4814ms | 0.2144ms | 4.6631 KOps/s | 4.6142 KOps/s | |
test_stack | 35.1199ms | 27.2849ms | 36.6503 Ops/s | 36.4029 Ops/s | |
test_cat | 31.0045ms | 26.8414ms | 37.2559 Ops/s | 36.7647 Ops/s |
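
The Change column is empty in the table above. As a rough sketch, and assuming that column is meant to hold the relative difference between "Ops" and "Ops on Repo HEAD" (the page itself does not say so), the value could be recomputed from the two throughput cells. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
# Assumption: "Change" is the relative difference between the PR's
# throughput and the throughput measured on the repository HEAD.
# The helpers below are illustrative and not part of any benchmark tool.

# Scale factors for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '38.7793 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Return (ops - ops_head) / ops_head after normalising the units."""
    new, old = parse_ops(ops), parse_ops(ops_head)
    return (new - old) / old


# Values copied from the test_membership_stacked_nested_last row above.
change = relative_change("199.3334 KOps/s", "76.2548 KOps/s")
print(f"{change:+.1%}")
```

For that row the sketch reports a gain of roughly 161%, and for the second `test_clone` row (111.0024 Ops/s against 90.1455 Ops/s) roughly 23%. Single-run differences of a few percent are probably within measurement noise, so only the larger gaps are likely to be meaningful.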

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1481ms | 18.0352μs | 55.4471 KOps/s | 56.0120 KOps/s | |
test_plain_set_stack_nested | 38.5210μs | 17.9399μs | 55.7418 KOps/s | 56.2071 KOps/s | |
test_plain_set_nested_inplace | 47.0510μs | 19.2175μs | 52.0360 KOps/s | 52.2696 KOps/s | |
test_plain_set_stack_nested_inplace | 61.6610μs | 19.0721μs | 52.4326 KOps/s | 52.5799 KOps/s | |
test_items | 22.5700μs | 2.8824μs | 346.9361 KOps/s | 345.2369 KOps/s | |
test_items_nested | 0.4969ms | 0.3374ms | 2.9643 KOps/s | 2.9884 KOps/s | |
test_items_nested_locked | 0.3675ms | 0.3396ms | 2.9445 KOps/s | 2.9785 KOps/s | |
test_items_nested_leaf | 85.9520μs | 62.9998μs | 15.8731 KOps/s | 16.0185 KOps/s | |
test_items_stack_nested | 0.3721ms | 0.3411ms | 2.9317 KOps/s | 2.9422 KOps/s | |
test_items_stack_nested_leaf | 86.6210μs | 63.5325μs | 15.7400 KOps/s | 15.7040 KOps/s | |
test_items_stack_nested_locked | 0.4368ms | 0.3415ms | 2.9284 KOps/s | 2.9613 KOps/s | |
test_keys | 30.8000μs | 3.4327μs | 291.3153 KOps/s | 289.5678 KOps/s | |
test_keys_nested | 0.1229ms | 71.2072μs | 14.0435 KOps/s | 14.1398 KOps/s | |
test_keys_nested_locked | 0.7876ms | 76.5246μs | 13.0677 KOps/s | 13.0194 KOps/s | |
test_keys_nested_leaf | 97.2320μs | 61.6266μs | 16.2268 KOps/s | 16.3216 KOps/s | |
test_keys_stack_nested | 0.1111ms | 71.1836μs | 14.0482 KOps/s | 14.1016 KOps/s | |
test_keys_stack_nested_leaf | 95.1820μs | 62.9837μs | 15.8771 KOps/s | 15.8853 KOps/s | |
test_keys_stack_nested_locked | 0.1054ms | 77.4681μs | 12.9085 KOps/s | 12.8721 KOps/s | |
test_values | 4.8050μs | 0.8369μs | 1.1949 MOps/s | 1.1830 MOps/s | |
test_values_nested | 74.6510μs | 48.4863μs | 20.6244 KOps/s | 20.5079 KOps/s | |
test_values_nested_locked | 74.0710μs | 50.0881μs | 19.9648 KOps/s | 19.8581 KOps/s | |
test_values_nested_leaf | 67.9920μs | 42.8792μs | 23.3214 KOps/s | 23.4114 KOps/s | |
test_values_stack_nested | 85.1720μs | 49.6950μs | 20.1228 KOps/s | 20.1938 KOps/s | |
test_values_stack_nested_leaf | 71.4820μs | 43.7713μs | 22.8460 KOps/s | 23.0268 KOps/s | |
test_values_stack_nested_locked | 80.2810μs | 51.8904μs | 19.2714 KOps/s | 19.5605 KOps/s | |
test_membership | 1.5356μs | 0.5066μs | 1.9740 MOps/s | 1.9716 MOps/s | |
test_membership_nested | 13.3500μs | 1.8705μs | 534.6106 KOps/s | 548.0333 KOps/s | |
test_membership_nested_leaf | 14.6405μs | 1.8397μs | 543.5701 KOps/s | 557.6154 KOps/s | |
test_membership_stacked_nested | 33.2710μs | 1.8903μs | 529.0027 KOps/s | 534.2995 KOps/s | |
test_membership_stacked_nested_leaf | 25.0100μs | 1.9309μs | 517.9049 KOps/s | 534.7840 KOps/s | |
test_membership_nested_last | 38.7310μs | 2.9437μs | 339.7118 KOps/s | 343.5265 KOps/s | |
test_membership_nested_leaf_last | 36.2610μs | 2.9618μs | 337.6271 KOps/s | 344.6008 KOps/s | |
test_membership_stacked_nested_last | 26.4400μs | 2.9099μs | 343.6511 KOps/s | 346.2904 KOps/s | |
test_membership_stacked_nested_leaf_last | 33.2110μs | 2.9296μs | 341.3441 KOps/s | 344.7707 KOps/s | |
test_nested_getleaf | 32.7800μs | 6.1453μs | 162.7272 KOps/s | 165.5011 KOps/s | |
test_nested_get | 34.0700μs | 5.7354μs | 174.3544 KOps/s | 174.1047 KOps/s | |
test_stacked_getleaf | 32.9110μs | 6.0763μs | 164.5725 KOps/s | 165.3511 KOps/s | |
test_stacked_get | 35.0510μs | 5.6410μs | 177.2726 KOps/s | 176.6379 KOps/s | |
test_nested_getitemleaf | 40.2210μs | 6.1116μs | 163.6242 KOps/s | 164.4866 KOps/s | |
test_nested_getitem | 32.1100μs | 5.7612μs | 173.5752 KOps/s | 174.3752 KOps/s | |
test_stacked_getitemleaf | 31.1600μs | 6.0795μs | 164.4863 KOps/s | 164.1013 KOps/s | |
test_stacked_getitem | 35.9810μs | 5.7058μs | 175.2593 KOps/s | 177.5636 KOps/s | |
test_lock_nested | 7.6557ms | 0.4264ms | 2.3453 KOps/s | 2.3772 KOps/s | |
test_lock_stack_nested | 0.4219ms | 0.3887ms | 2.5726 KOps/s | 2.5879 KOps/s | |
test_unlock_nested | 0.7798ms | 0.3600ms | 2.7778 KOps/s | 2.8043 KOps/s | |
test_unlock_stack_nested | 0.3621ms | 0.3261ms | 3.0668 KOps/s | 3.0869 KOps/s | |
test_flatten_speed | 0.1564ms | 77.7880μs | 12.8555 KOps/s | 13.0576 KOps/s | |
test_unflatten_speed | 0.4111ms | 0.3251ms | 3.0762 KOps/s | 3.1157 KOps/s | |
test_common_ops | 1.6757ms | 1.2923ms | 773.8041 Ops/s | 770.1550 Ops/s | |
test_creation | 29.0010μs | 1.4484μs | 690.4227 KOps/s | 685.5652 KOps/s | |
test_creation_empty | 63.8210μs | 17.8292μs | 56.0879 KOps/s | 55.7126 KOps/s | |
test_creation_nested_1 | 55.2910μs | 19.4460μs | 51.4245 KOps/s | 50.9733 KOps/s | |
test_creation_nested_2 | 50.2110μs | 22.3847μs | 44.6733 KOps/s | 43.9264 KOps/s | |
test_clone | 66.6510μs | 28.7896μs | 34.7348 KOps/s | 34.8615 KOps/s | |
test_getitem[int] | 1.3639ms | 15.6425μs | 63.9286 KOps/s | 66.3906 KOps/s | |
test_getitem[slice_int] | 0.1204ms | 26.6492μs | 37.5245 KOps/s | 38.5328 KOps/s | |
test_getitem[range] | 0.1583ms | 0.1082ms | 9.2463 KOps/s | 9.3098 KOps/s | |
test_getitem[tuple] | 0.1260ms | 23.4704μs | 42.6068 KOps/s | 45.0243 KOps/s | |
test_getitem[list] | 0.1940ms | 98.6454μs | 10.1373 KOps/s | 10.2572 KOps/s | |
test_setitem_dim[int] | 66.5020μs | 43.7993μs | 22.8314 KOps/s | 22.5867 KOps/s | |
test_setitem_dim[slice_int] | 94.2420μs | 66.2949μs | 15.0841 KOps/s | 15.0286 KOps/s | |
test_setitem_dim[range] | 0.1873ms | 0.1265ms | 7.9050 KOps/s | 7.9499 KOps/s | |
test_setitem_dim[tuple] | 90.5620μs | 59.8409μs | 16.7110 KOps/s | 16.8128 KOps/s | |
test_setitem | 74.2120μs | 43.2232μs | 23.1357 KOps/s | 23.2663 KOps/s | |
test_set | 81.1010μs | 41.3635μs | 24.1759 KOps/s | 23.8252 KOps/s | |
test_set_shared | 0.3656ms | 53.2160μs | 18.7913 KOps/s | 18.4497 KOps/s | |
test_update | 94.6120μs | 52.3516μs | 19.1016 KOps/s | 19.2711 KOps/s | |
test_update_nested | 94.5610μs | 59.4605μs | 16.8179 KOps/s | 16.8241 KOps/s | |
test_update__nested | 0.1957ms | 63.5399μs | 15.7381 KOps/s | 14.8851 KOps/s | |
test_set_nested | 94.4020μs | 45.9516μs | 21.7621 KOps/s | 22.2602 KOps/s | |
test_set_nested_new | 92.0520μs | 48.8504μs | 20.4706 KOps/s | 20.9657 KOps/s | |
test_select | 97.9920μs | 61.4597μs | 16.2708 KOps/s | 16.5933 KOps/s | |
test_select_nested | 69.7210μs | 41.5868μs | 24.0461 KOps/s | 23.9394 KOps/s | |
test_exclude_nested | 85.4520μs | 57.4668μs | 17.4014 KOps/s | 17.2400 KOps/s | |
test_empty[True] | 0.2880ms | 0.2612ms | 3.8283 KOps/s | 3.8948 KOps/s | |
test_empty[False] | 6.6261μs | 0.7383μs | 1.3545 MOps/s | 1.3479 MOps/s | |
test_to | 63.8810μs | 26.3096μs | 38.0090 KOps/s | 38.7987 KOps/s | |
test_to_nonblocking | 53.4410μs | 24.9690μs | 40.0496 KOps/s | 41.0036 KOps/s | |
test_unbind_speed | 1.5734ms | 0.2717ms | 3.6811 KOps/s | 3.6459 KOps/s | |
test_unbind_speed_stack0 | 0.3191ms | 0.2743ms | 3.6450 KOps/s | 3.6568 KOps/s | |
test_unbind_speed_stack1 | 93.2680ms | 0.7126ms | 1.4032 KOps/s | 1.3981 KOps/s | |
test_split | 95.0083ms | 2.0875ms | 479.0488 Ops/s | 469.5180 Ops/s | |
test_chunk | 95.2293ms | 2.1098ms | 473.9778 Ops/s | 468.0460 Ops/s | |
test_creation[device0] | 0.3066ms | 0.1264ms | 7.9087 KOps/s | 7.9615 KOps/s | |
test_creation_from_tensor | 0.3511ms | 0.1276ms | 7.8385 KOps/s | 7.8619 KOps/s | |
test_add_one[memmap_tensor0] | 0.1474ms | 8.8321μs | 113.2240 KOps/s | 112.8768 KOps/s | |
test_contiguous[memmap_tensor0] | 29.6900μs | 2.0792μs | 480.9534 KOps/s | 446.1647 KOps/s | |
test_stack[memmap_tensor0] | 30.9610μs | 6.3821μs | 156.6885 KOps/s | 154.5334 KOps/s | |
test_memmaptd_index | 1.1309ms | 0.4248ms | 2.3543 KOps/s | 2.4536 KOps/s | |
test_memmaptd_index_astensor | 0.7840ms | 0.4979ms | 2.0084 KOps/s | 2.0661 KOps/s | |
test_memmaptd_index_op | 1.4813ms | 1.0732ms | 931.8274 Ops/s | 945.6589 Ops/s | |
test_serialize_model | 0.1305s | 0.1297s | 7.7129 Ops/s | 7.6844 Ops/s | |
test_serialize_model_pickle | 1.3774s | 1.2186s | 0.8206 Ops/s | 0.8230 Ops/s | |
test_serialize_weights | 0.2232s | 0.1425s | 7.0177 Ops/s | 6.9837 Ops/s | |
test_serialize_weights_returnearly | 0.2209s | 55.9398ms | 17.8764 Ops/s | 17.7111 Ops/s | |
test_serialize_weights_pickle | 1.3898s | 1.2213s | 0.8188 Ops/s | 0.8213 Ops/s | |
test_reshape_pytree | 71.0610μs | 34.9648μs | 28.6002 KOps/s | 29.0496 KOps/s | |
test_reshape_td | 71.3610μs | 40.5373μs | 24.6686 KOps/s | 24.5762 KOps/s | |
test_view_pytree | 63.5410μs | 34.7931μs | 28.7414 KOps/s | 29.7911 KOps/s | |
test_view_td | 84.0510μs | 46.6653μs | 21.4292 KOps/s | 22.5641 KOps/s | |
test_unbind_pytree | 59.5210μs | 33.6475μs | 29.7199 KOps/s | 29.0042 KOps/s | |
test_unbind_td | 0.6643ms | 42.6029μs | 23.4726 KOps/s | 22.9168 KOps/s | |
test_split_pytree | 75.4220μs | 45.2372μs | 22.1057 KOps/s | 21.5993 KOps/s | |
test_split_td | 0.1748ms | 53.7848μs | 18.5926 KOps/s | 17.6705 KOps/s | |
test_add_pytree | 87.0920μs | 54.9620μs | 18.1944 KOps/s | 16.6733 KOps/s | |
test_add_td | 0.2310ms | 95.8459μs | 10.4334 KOps/s | 9.8267 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.3203ms | 0.1576ms | 6.3437 KOps/s | 6.1657 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2903ms | 0.1603ms | 6.2384 KOps/s | 6.3181 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2143ms | 0.1560ms | 6.4091 KOps/s | 6.3472 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2804ms | 0.1831ms | 5.4605 KOps/s | 5.4447 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 56.1010μs | 21.1438μs | 47.2951 KOps/s | 45.7675 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1056ms | 47.0824μs | 21.2394 KOps/s | 21.2115 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3112ms | 64.3962μs | 15.5289 KOps/s | 15.4803 KOps/s | |
test_compile_copy_nested[pytree-eager] | 92.3210μs | 49.8423μs | 20.0633 KOps/s | 20.1295 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3645ms | 0.3107ms | 3.2187 KOps/s | 3.1819 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3132ms | 0.2294ms | 4.3584 KOps/s | 4.3423 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1942ms | 0.1256ms | 7.9615 KOps/s | 7.8559 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1262ms | 64.1341μs | 15.5923 KOps/s | 15.5553 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3670ms | 0.3205ms | 3.1201 KOps/s | 3.1406 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7032ms | 0.6115ms | 1.6354 KOps/s | 1.6192 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4067ms | 0.2796ms | 3.5765 KOps/s | 3.5700 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4662ms | 0.3149ms | 3.1760 KOps/s | 3.1863 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1650ms | 81.2969μs | 12.3006 KOps/s | 12.5918 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1945ms | 0.1312ms | 7.6191 KOps/s | 7.5804 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6565ms | 0.5365ms | 1.8639 KOps/s | 1.8975 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3847ms | 0.3189ms | 3.1353 KOps/s | 3.1103 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 55.6210μs | 19.1763μs | 52.1478 KOps/s | 52.9529 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 86.9120μs | 38.5554μs | 25.9367 KOps/s | 26.0905 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1147ms | 70.0492μs | 14.2757 KOps/s | 14.3784 KOps/s | |
test_compile_copy_flat[pytree-eager] | 90.5920μs | 51.3442μs | 19.4764 KOps/s | 19.4955 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3974ms | 0.8367ms | 1.1952 KOps/s | 1.1339 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.6554ms | 3.2978ms | 303.2296 Ops/s | 312.6941 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.4501ms | 0.8409ms | 1.1892 KOps/s | 1.1199 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.4766ms | 3.2177ms | 310.7825 Ops/s | 315.8217 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2900ms | 0.1201ms | 8.3288 KOps/s | 8.0653 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2254ms | 60.6529μs | 16.4873 KOps/s | 15.3506 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1707ms | 0.1196ms | 8.3620 KOps/s | 8.3832 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2107ms | 44.4288μs | 22.5079 KOps/s | 22.0900 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2484ms | 0.1207ms | 8.2873 KOps/s | 8.3561 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1800ms | 45.1731μs | 22.1370 KOps/s | 22.0677 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2086ms | 0.1493ms | 6.6990 KOps/s | 6.6385 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1568ms | 24.2689μs | 41.2050 KOps/s | 40.9120 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2004ms | 0.1401ms | 7.1398 KOps/s | 7.2378 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 60.8410μs | 21.3058μs | 46.9356 KOps/s | 51.0325 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1992ms | 0.1454ms | 6.8767 KOps/s | 7.2018 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 60.3110μs | 20.8463μs | 47.9702 KOps/s | 51.7000 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2397ms | 0.1449ms | 6.9022 KOps/s | 6.9607 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4889ms | 23.5764μs | 42.4152 KOps/s | 41.2442 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2065ms | 0.1394ms | 7.1719 KOps/s | 7.1801 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 65.8410μs | 19.7345μs | 50.6726 KOps/s | 49.7924 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1934ms | 0.1394ms | 7.1736 KOps/s | 7.1890 KOps/s | |
test_compile_indexing[int-pytree-eager] | 98.4820μs | 19.9576μs | 50.1062 KOps/s | 50.6246 KOps/s | |
test_mod_add[eager] | 74.5510μs | 33.1648μs | 30.1525 KOps/s | 29.6786 KOps/s | |
test_mod_add[compile] | 0.1943ms | 82.3029μs | 12.1502 KOps/s | 12.6656 KOps/s | |
test_mod_add[compile-overhead] | 0.2974ms | 0.1493ms | 6.6962 KOps/s | 6.0980 KOps/s | |
test_mod_wrap[eager] | 0.3222ms | 0.2458ms | 4.0689 KOps/s | 3.9105 KOps/s | |
test_mod_wrap[compile] | 1.3769ms | 0.2910ms | 3.4366 KOps/s | 3.3769 KOps/s | |
test_mod_wrap[compile-overhead] | 7.6847ms | 4.1189ms | 242.7832 Ops/s | 247.2398 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4687ms | 1.3339ms | 749.6822 Ops/s | 692.7018 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5726ms | 1.3083ms | 764.3560 Ops/s | 701.2890 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3298ms | 0.8879ms | 1.1262 KOps/s | 992.3240 Ops/s | |
test_seq_add[eager] | 0.1466ms | 97.1126μs | 10.2973 KOps/s | 9.8690 KOps/s | |
test_seq_add[compile] | 0.1326ms | 90.6671μs | 11.0294 KOps/s | 11.1978 KOps/s | |
test_seq_add[compile-overhead] | 0.1655ms | 0.1223ms | 8.1771 KOps/s | 8.1551 KOps/s | |
test_seq_wrap[eager] | 0.5148ms | 0.3949ms | 2.5322 KOps/s | 2.5373 KOps/s | |
test_seq_wrap[compile] | 0.3548ms | 0.3065ms | 3.2631 KOps/s | 3.1169 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2693ms | 0.2221ms | 4.5026 KOps/s | 4.6016 KOps/s | |
test_func_call_runtime[False-eager] | 0.7841ms | 0.7194ms | 1.3900 KOps/s | 1.2983 KOps/s | |
test_func_call_runtime[False-compile] | 0.9784ms | 0.7862ms | 1.2719 KOps/s | 1.2786 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4893ms | 0.3556ms | 2.8121 KOps/s | 2.8095 KOps/s | |
test_func_call_runtime[True-eager] | 0.9905ms | 0.8845ms | 1.1306 KOps/s | 1.1066 KOps/s | |
test_func_call_runtime[True-compile] | 0.8582ms | 0.7973ms | 1.2543 KOps/s | 1.2349 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4438ms | 0.3770ms | 2.6524 KOps/s | 2.6812 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9115ms | 0.7172ms | 1.3943 KOps/s | 1.3577 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8523ms | 0.7712ms | 1.2967 KOps/s | 1.2774 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4057ms | 0.3574ms | 2.7981 KOps/s | 2.8008 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0688ms | 0.9862ms | 1.0140 KOps/s | 988.8272 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9036ms | 0.8253ms | 1.2116 KOps/s | 1.2024 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4475ms | 0.4015ms | 2.4905 KOps/s | 2.4796 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5650ms | 2.1025ms | 475.6268 Ops/s | 469.5910 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9261ms | 0.8369ms | 1.1949 KOps/s | 1.1813 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4584ms | 0.4075ms | 2.4538 KOps/s | 2.4692 KOps/s | |
test_distributed | 2.3598ms | 0.1720ms | 5.8129 KOps/s | 8.7610 KOps/s | |
test_tdmodule | 36.1510μs | 15.8888μs | 62.9375 KOps/s | 60.7466 KOps/s | |
test_tdmodule_dispatch | 60.3910μs | 31.6683μs | 31.5773 KOps/s | 31.9786 KOps/s | |
test_tdseq | 49.9900μs | 16.8165μs | 59.4653 KOps/s | 59.1002 KOps/s | |
test_tdseq_dispatch | 57.1210μs | 33.9812μs | 29.4280 KOps/s | 29.1847 KOps/s | |
test_instantiation_functorch | 1.9780ms | 1.8246ms | 548.0543 Ops/s | 549.0712 Ops/s | |
test_exec_functorch | 0.2646ms | 0.2034ms | 4.9169 KOps/s | 4.7975 KOps/s | |
test_exec_functional_call | 0.2757ms | 0.2082ms | 4.8039 KOps/s | 4.8082 KOps/s | |
test_exec_td_decorator | 0.4564ms | 0.2630ms | 3.8017 KOps/s | 3.8531 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8333ms | 0.6892ms | 1.4510 KOps/s | 1.4452 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8160ms | 0.6864ms | 1.4569 KOps/s | 1.4476 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7385ms | 0.6070ms | 1.6474 KOps/s | 1.6597 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7040ms | 0.6078ms | 1.6454 KOps/s | 1.6408 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.6510ms | 19.8679ms | 50.3324 Ops/s | 50.7021 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.5037ms | 19.7164ms | 50.7191 Ops/s | 50.2595 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.5028ms | 19.8369ms | 50.4111 Ops/s | 51.0525 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.9611ms | 19.7388ms | 50.6616 Ops/s | 50.8175 Ops/s | |
test_to_module_speed[True] | 1.4442ms | 0.9878ms | 1.0123 KOps/s | 1.0095 KOps/s | |
test_to_module_speed[False] | 1.4271ms | 0.9685ms | 1.0326 KOps/s | 1.0460 KOps/s | |
test_tc_init | 76.1620μs | 37.6332μs | 26.5723 KOps/s | 26.1944 KOps/s | |
test_tc_init_nested | 0.2014ms | 74.0647μs | 13.5017 KOps/s | 12.9097 KOps/s | |
test_tc_first_layer_tensor | 4.9300μs | 0.6617μs | 1.5112 MOps/s | 1.5082 MOps/s | |
test_tc_first_layer_nontensor | 42.6310μs | 2.1702μs | 460.7816 KOps/s | 459.7175 KOps/s | |
test_tc_second_layer_tensor | 9.6477μs | 1.3201μs | 757.4973 KOps/s | 742.2117 KOps/s | |
test_tc_second_layer_nontensor | 24.1110μs | 2.8740μs | 347.9504 KOps/s | 348.5403 KOps/s | |
test_unbind | 0.1941s | 9.5224ms | 105.0161 Ops/s | 92.7657 Ops/s | |
test_full_like | 0.6638ms | 0.5744ms | 1.7411 KOps/s | 1.7450 KOps/s | |
test_zeros_like | 0.2628ms | 0.1980ms | 5.0517 KOps/s | 5.0463 KOps/s | |
test_ones_like | 0.2332ms | 0.1978ms | 5.0553 KOps/s | 5.0501 KOps/s | |
test_clone | 0.4430ms | 0.4147ms | 2.4113 KOps/s | 2.4101 KOps/s | |
test_squeeze | 44.7600μs | 9.5882μs | 104.2947 KOps/s | 106.6937 KOps/s | |
test_unsqueeze | 0.2169ms | 73.8036μs | 13.5495 KOps/s | 13.8298 KOps/s | |
test_split | 0.4195ms | 0.1513ms | 6.6097 KOps/s | 6.5958 KOps/s | |
test_permute | 0.2893ms | 0.1799ms | 5.5586 KOps/s | 5.6511 KOps/s | |
test_stack | 1.2568ms | 0.8294ms | 1.2057 KOps/s | 1.1538 KOps/s | |
test_cat | 1.2624ms | 1.2313ms | 812.1598 Ops/s | 812.0162 Ops/s |
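The Change column is empty in this capture. A rough way to fill it in by hand is to compare the Ops column against Ops on Repo HEAD. The sketch below is only an illustration: `parse_ops` and `relative_change` are hypothetical helper names, not part of the benchmark tooling, and the formula (relative difference in throughput) is an assumption about what the Change column is meant to show.

```python
# Hypothetical helpers for reading the benchmark table above;
# not part of the project's benchmark tooling.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '2.4113 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput difference of the PR versus repo HEAD (assumed definition)."""
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return (pr - head) / head


# A few rows copied from the table: (Ops, Ops on Repo HEAD).
rows = {
    "test_clone": ("2.4113 KOps/s", "2.4101 KOps/s"),
    "test_unbind": ("105.0161 Ops/s", "92.7657 Ops/s"),
    "test_distributed": ("5.8129 KOps/s", "8.7610 KOps/s"),
    "test_mod_wrap_and_backward[compile-overhead]": ("1.1262 KOps/s", "992.3240 Ops/s"),
}

for name, (pr, head) in rows.items():
    print(f"{name}: {relative_change(pr, head):+.2%}")
```

Mixed units (KOps/s against Ops/s, as in the last row) are normalised before comparing. Differences this small on a single run may well be within measurement noise.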
vmoens added a commit that referenced this pull request on Oct 16, 2024: ghstack-source-id: e710b72f185d8c18284fdd6cd4283c78d12a28f3 Pull Request resolved: #1040
vmoens added a commit that referenced this pull request on Oct 16, 2024: ghstack-source-id: 5df15306395f6986e77caed2cfa87b3516a1b134 Pull Request resolved: #1040
Stack from ghstack (oldest at bottom):