[Feature] Split-trajectories and represent as nested tensor #2043

Open
wants to merge 8 commits into
base: main
from

Conversation

vmoens
Contributor

@vmoens vmoens commented Mar 27, 2024

TODO:

  • Doc


pytorch-bot bot commented Mar 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2043

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Unrelated Failures

As of commit 0966659 with merge base c98754f:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Mar 27, 2024

github-actions bot commented Mar 27, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: 2. Worsened: 5.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 52.9930ms 52.6417ms 18.9963 Ops/s 18.2601 Ops/s $\color{#35bf28}+4.03\%$
test_sync 34.5320ms 29.0502ms 34.4232 Ops/s 34.6320 Ops/s $\color{#d91a1a}-0.60\%$
test_async 55.2590ms 27.2692ms 36.6715 Ops/s 36.5638 Ops/s $\color{#35bf28}+0.29\%$
test_simple 0.3891s 0.3377s 2.9609 Ops/s 3.0368 Ops/s $\color{#d91a1a}-2.50\%$
test_transformed 0.5392s 0.4872s 2.0523 Ops/s 2.1105 Ops/s $\color{#d91a1a}-2.76\%$
test_serial 1.2321s 1.1895s 0.8407 Ops/s 0.8318 Ops/s $\color{#35bf28}+1.07\%$
test_parallel 1.0395s 1.0035s 0.9965 Ops/s 1.0074 Ops/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[True-True-True-True-True] 0.1320ms 21.0798μs 47.4388 KOps/s 46.7815 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[True-True-True-True-False] 36.7690μs 13.0719μs 76.4998 KOps/s 77.4959 KOps/s $\color{#d91a1a}-1.29\%$
test_step_mdp_speed[True-True-True-False-True] 39.2230μs 12.3224μs 81.1529 KOps/s 80.5098 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[True-True-True-False-False] 30.8980μs 7.5050μs 133.2446 KOps/s 133.3265 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[True-True-False-True-True] 74.6370μs 22.4338μs 44.5755 KOps/s 44.3624 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[True-True-False-True-False] 42.6900μs 14.1498μs 70.6724 KOps/s 70.7801 KOps/s $\color{#d91a1a}-0.15\%$
test_step_mdp_speed[True-True-False-False-True] 50.2160μs 13.5415μs 73.8471 KOps/s 72.8673 KOps/s $\color{#35bf28}+1.34\%$
test_step_mdp_speed[True-True-False-False-False] 29.3840μs 8.7933μs 113.7231 KOps/s 113.5825 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-False-True-True-True] 60.3020μs 23.8958μs 41.8484 KOps/s 41.7713 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[True-False-True-True-False] 45.3350μs 15.4921μs 64.5490 KOps/s 64.9929 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[True-False-True-False-True] 74.6970μs 13.4483μs 74.3589 KOps/s 73.6671 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[True-False-True-False-False] 55.4230μs 8.7235μs 114.6330 KOps/s 114.4842 KOps/s $\color{#35bf28}+0.13\%$
test_step_mdp_speed[True-False-False-True-True] 54.4110μs 24.7344μs 40.4295 KOps/s 39.8062 KOps/s $\color{#35bf28}+1.57\%$
test_step_mdp_speed[True-False-False-True-False] 45.9160μs 16.7168μs 59.8199 KOps/s 60.3278 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[True-False-False-False-True] 50.1630μs 14.5648μs 68.6586 KOps/s 67.6078 KOps/s $\color{#35bf28}+1.55\%$
test_step_mdp_speed[True-False-False-False-False] 39.9140μs 9.9854μs 100.1459 KOps/s 101.3343 KOps/s $\color{#d91a1a}-1.17\%$
test_step_mdp_speed[False-True-True-True-True] 78.2760μs 23.7216μs 42.1557 KOps/s 41.6499 KOps/s $\color{#35bf28}+1.21\%$
test_step_mdp_speed[False-True-True-True-False] 41.9480μs 15.5949μs 64.1237 KOps/s 64.7355 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-True-True-False-True] 41.6480μs 15.8970μs 62.9049 KOps/s 62.4508 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[False-True-True-False-False] 34.0240μs 9.9343μs 100.6614 KOps/s 100.3801 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-True-False-True-True] 33.8830μs 25.1344μs 39.7861 KOps/s 39.6693 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[False-True-False-True-False] 46.9070μs 16.8700μs 59.2768 KOps/s 60.4679 KOps/s $\color{#d91a1a}-1.97\%$
test_step_mdp_speed[False-True-False-False-True] 47.6280μs 17.0324μs 58.7116 KOps/s 58.4219 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[False-True-False-False-False] 28.6030μs 11.2495μs 88.8929 KOps/s 90.8441 KOps/s $\color{#d91a1a}-2.15\%$
test_step_mdp_speed[False-False-True-True-True] 98.7240μs 26.5043μs 37.7297 KOps/s 38.1089 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[False-False-True-True-False] 59.7490μs 17.8349μs 56.0700 KOps/s 56.2813 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-False-True-False-True] 40.8560μs 17.0499μs 58.6513 KOps/s 58.9189 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[False-False-True-False-False] 36.8780μs 11.1925μs 89.3452 KOps/s 90.5701 KOps/s $\color{#d91a1a}-1.35\%$
test_step_mdp_speed[False-False-False-True-True] 67.9960μs 27.5934μs 36.2406 KOps/s 36.7411 KOps/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[False-False-False-True-False] 67.6660μs 19.0659μs 52.4497 KOps/s 53.1750 KOps/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[False-False-False-False-True] 44.9630μs 18.0772μs 55.3183 KOps/s 55.8289 KOps/s $\color{#d91a1a}-0.91\%$
test_step_mdp_speed[False-False-False-False-False] 38.2410μs 12.3021μs 81.2868 KOps/s 82.3361 KOps/s $\color{#d91a1a}-1.27\%$
test_values[generalized_advantage_estimate-True-True] 12.7265ms 9.4160ms 106.2018 Ops/s 110.7687 Ops/s $\color{#d91a1a}-4.12\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.0465ms 35.1177ms 28.4757 Ops/s 29.9452 Ops/s $\color{#d91a1a}-4.91\%$
test_values[td0_return_estimate-False-False] 0.2179ms 0.1639ms 6.1012 KOps/s 5.9995 KOps/s $\color{#35bf28}+1.69\%$
test_values[td1_return_estimate-False-False] 27.8242ms 23.3971ms 42.7404 Ops/s 44.5264 Ops/s $\color{#d91a1a}-4.01\%$
test_values[vec_td1_return_estimate-False-False] 36.5065ms 35.1366ms 28.4603 Ops/s 29.7813 Ops/s $\color{#d91a1a}-4.44\%$
test_values[td_lambda_return_estimate-True-False] 36.1705ms 33.3359ms 29.9977 Ops/s 30.9472 Ops/s $\color{#d91a1a}-3.07\%$
test_values[vec_td_lambda_return_estimate-True-False] 49.7126ms 35.6755ms 28.0305 Ops/s 29.9493 Ops/s $\textbf{\color{#d91a1a}-6.41\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.2565ms 8.1490ms 122.7151 Ops/s 126.0073 Ops/s $\color{#d91a1a}-2.61\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3369ms 2.0203ms 494.9864 Ops/s 509.8785 Ops/s $\color{#d91a1a}-2.92\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4094ms 0.3419ms 2.9247 KOps/s 2.8584 KOps/s $\color{#35bf28}+2.32\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.2548ms 47.1647ms 21.2023 Ops/s 21.1865 Ops/s $\color{#35bf28}+0.07\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.7020ms 3.0186ms 331.2755 Ops/s 331.2658 Ops/s $+0.00\%$
test_dqn_speed 3.4508ms 1.3553ms 737.8199 Ops/s 741.6853 Ops/s $\color{#d91a1a}-0.52\%$
test_ddpg_speed 3.3228ms 2.6787ms 373.3102 Ops/s 372.9411 Ops/s $\color{#35bf28}+0.10\%$
test_sac_speed 8.9728ms 8.1767ms 122.2983 Ops/s 116.6974 Ops/s $\color{#35bf28}+4.80\%$
test_redq_speed 13.9866ms 12.9890ms 76.9883 Ops/s 76.9309 Ops/s $\color{#35bf28}+0.07\%$
test_redq_deprec_speed 14.1330ms 13.0010ms 76.9169 Ops/s 78.0807 Ops/s $\color{#d91a1a}-1.49\%$
test_td3_speed 8.7374ms 8.0383ms 124.4047 Ops/s 122.9945 Ops/s $\color{#35bf28}+1.15\%$
test_cql_speed 0.1066s 39.1987ms 25.5111 Ops/s 27.9745 Ops/s $\textbf{\color{#d91a1a}-8.81\%}$
test_a2c_speed 8.2063ms 7.3441ms 136.1636 Ops/s 136.8095 Ops/s $\color{#d91a1a}-0.47\%$
test_ppo_speed 8.1021ms 7.5928ms 131.7043 Ops/s 131.1416 Ops/s $\color{#35bf28}+0.43\%$
test_reinforce_speed 7.4544ms 6.5379ms 152.9538 Ops/s 153.8957 Ops/s $\color{#d91a1a}-0.61\%$
test_iql_speed 34.2986ms 32.5217ms 30.7487 Ops/s 31.0389 Ops/s $\color{#d91a1a}-0.94\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.2507ms 2.1576ms 463.4738 Ops/s 460.4549 Ops/s $\color{#35bf28}+0.66\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9605ms 0.5004ms 1.9984 KOps/s 2.0146 KOps/s $\color{#d91a1a}-0.80\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7287ms 0.4762ms 2.1000 KOps/s 1.9229 KOps/s $\textbf{\color{#35bf28}+9.21\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.5804ms 2.1679ms 461.2656 Ops/s 453.0847 Ops/s $\color{#35bf28}+1.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0274ms 0.4920ms 2.0327 KOps/s 2.0526 KOps/s $\color{#d91a1a}-0.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6826ms 0.4656ms 2.1480 KOps/s 2.1713 KOps/s $\color{#d91a1a}-1.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7610ms 1.2196ms 819.9480 Ops/s 821.1661 Ops/s $\color{#d91a1a}-0.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7142ms 1.1578ms 863.7089 Ops/s 867.6634 Ops/s $\color{#d91a1a}-0.46\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.7561ms 2.2280ms 448.8355 Ops/s 439.4268 Ops/s $\color{#35bf28}+2.14\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 95.4931ms 0.6866ms 1.4564 KOps/s 1.6366 KOps/s $\textbf{\color{#d91a1a}-11.01\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8661ms 0.5848ms 1.7099 KOps/s 1.7111 KOps/s $\color{#d91a1a}-0.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.3591ms 2.1444ms 466.3257 Ops/s 463.0346 Ops/s $\color{#35bf28}+0.71\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1033ms 0.5031ms 1.9879 KOps/s 2.0017 KOps/s $\color{#d91a1a}-0.69\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6155ms 0.4732ms 2.1132 KOps/s 2.0906 KOps/s $\color{#35bf28}+1.08\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.7529ms 2.1458ms 466.0245 Ops/s 460.6349 Ops/s $\color{#35bf28}+1.17\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8096ms 0.4944ms 2.0228 KOps/s 2.0240 KOps/s $\color{#d91a1a}-0.06\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.0391ms 0.4688ms 2.1332 KOps/s 2.1481 KOps/s $\color{#d91a1a}-0.69\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.6838ms 2.2951ms 435.7169 Ops/s 443.6135 Ops/s $\color{#d91a1a}-1.78\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7733ms 0.6150ms 1.6260 KOps/s 1.6362 KOps/s $\color{#d91a1a}-0.62\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9399ms 0.6004ms 1.6656 KOps/s 1.6976 KOps/s $\color{#d91a1a}-1.88\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1003s 7.2513ms 137.9056 Ops/s 138.1715 Ops/s $\color{#d91a1a}-0.19\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 14.1991ms 12.0100ms 83.2640 Ops/s 81.1540 Ops/s $\color{#35bf28}+2.60\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.1573ms 1.0034ms 996.5809 Ops/s 957.6716 Ops/s $\color{#35bf28}+4.06\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 85.0354ms 5.3217ms 187.9101 Ops/s 143.3018 Ops/s $\textbf{\color{#35bf28}+31.13\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 90.8129ms 13.6416ms 73.3050 Ops/s 83.2653 Ops/s $\textbf{\color{#d91a1a}-11.96\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.5191ms 1.0300ms 970.9018 Ops/s 981.9281 Ops/s $\color{#d91a1a}-1.12\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 85.4538ms 5.6412ms 177.2658 Ops/s 179.7719 Ops/s $\color{#d91a1a}-1.39\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 15.0478ms 12.4419ms 80.3734 Ops/s 81.0914 Ops/s $\color{#d91a1a}-0.89\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.9022ms 1.4448ms 692.1497 Ops/s 758.9462 Ops/s $\textbf{\color{#d91a1a}-8.80\%}$
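
For readers checking the table, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the CPU table. The function name `relative_change` and the 5% significance threshold are assumptions made for illustration (the threshold is inferred from the fact that only the bold entries seem to be counted as improved or worsened); neither is taken from the benchmark tooling.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table; both values in a
# pair use the same unit, so the unit cancels out.
rows = {
    "test_single": (18.9963, 18.2601),
    "test_sync": (34.4232, 34.6320),
    "test_cql_speed": (25.5111, 27.9745),
}

# Assumed threshold: the report seems to count only changes above ~5%.
SIGNIFICANT_PCT = 5.0

for name, (ops, head) in rows.items():
    change = relative_change(ops, head)
    flag = " (significant)" if abs(change) > SIGNIFICANT_PCT else ""
    print(f"{name}: {change:+.2f}%{flag}")
```

Because the table shows rounded throughputs, a recomputed value can differ from the reported one in the last digit.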


github-actions bot commented Mar 27, 2024

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: 8. Worsened: 6.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1006s 99.2932ms 10.0712 Ops/s 9.4392 Ops/s $\textbf{\color{#35bf28}+6.70\%}$
test_sync 87.1935ms 86.5597ms 11.5527 Ops/s 11.6870 Ops/s $\color{#d91a1a}-1.15\%$
test_async 0.1620s 70.5839ms 14.1675 Ops/s 14.2864 Ops/s $\color{#d91a1a}-0.83\%$
test_single_pixels 0.1102s 0.1099s 9.1001 Ops/s 9.1130 Ops/s $\color{#d91a1a}-0.14\%$
test_sync_pixels 66.9834ms 66.0042ms 15.1505 Ops/s 15.2316 Ops/s $\color{#d91a1a}-0.53\%$
test_async_pixels 0.1210s 55.2348ms 18.1045 Ops/s 17.6808 Ops/s $\color{#35bf28}+2.40\%$
test_simple 0.6645s 0.6638s 1.5064 Ops/s 1.4747 Ops/s $\color{#35bf28}+2.15\%$
test_transformed 0.8894s 0.8891s 1.1247 Ops/s 1.1168 Ops/s $\color{#35bf28}+0.71\%$
test_serial 2.1530s 2.0931s 0.4778 Ops/s 0.4792 Ops/s $\color{#d91a1a}-0.31\%$
test_parallel 1.8509s 1.7929s 0.5578 Ops/s 0.5580 Ops/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[True-True-True-True-True] 0.1110ms 32.8919μs 30.4026 KOps/s 30.7632 KOps/s $\color{#d91a1a}-1.17\%$
test_step_mdp_speed[True-True-True-True-False] 38.6310μs 19.5940μs 51.0362 KOps/s 50.7279 KOps/s $\color{#35bf28}+0.61\%$
test_step_mdp_speed[True-True-True-False-True] 42.9400μs 18.6687μs 53.5656 KOps/s 53.6787 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[True-True-True-False-False] 29.5700μs 11.2495μs 88.8928 KOps/s 89.0674 KOps/s $\color{#d91a1a}-0.20\%$
test_step_mdp_speed[True-True-False-True-True] 54.7810μs 34.6971μs 28.8209 KOps/s 29.0551 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[True-True-False-True-False] 47.7210μs 21.4606μs 46.5971 KOps/s 46.5794 KOps/s $\color{#35bf28}+0.04\%$
test_step_mdp_speed[True-True-False-False-True] 38.3210μs 20.1487μs 49.6310 KOps/s 48.1738 KOps/s $\color{#35bf28}+3.02\%$
test_step_mdp_speed[True-True-False-False-False] 29.9500μs 12.9712μs 77.0938 KOps/s 76.3629 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[True-False-True-True-True] 73.3010μs 36.8582μs 27.1310 KOps/s 27.3360 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[True-False-True-True-False] 38.6410μs 23.5047μs 42.5446 KOps/s 42.4833 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[True-False-True-False-True] 37.2600μs 20.1491μs 49.6301 KOps/s 48.4580 KOps/s $\color{#35bf28}+2.42\%$
test_step_mdp_speed[True-False-True-False-False] 39.2800μs 13.0394μs 76.6907 KOps/s 75.8609 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[True-False-False-True-True] 53.1710μs 38.0640μs 26.2715 KOps/s 26.2120 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[True-False-False-True-False] 43.2710μs 25.1530μs 39.7567 KOps/s 39.7335 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-False-False-False-True] 38.4400μs 21.9186μs 45.6233 KOps/s 44.5740 KOps/s $\color{#35bf28}+2.35\%$
test_step_mdp_speed[True-False-False-False-False] 90.7310μs 14.9144μs 67.0493 KOps/s 66.7068 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-True-True-True-True] 65.1910μs 36.6760μs 27.2658 KOps/s 27.0167 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[False-True-True-True-False] 46.9600μs 23.4276μs 42.6847 KOps/s 42.8106 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[False-True-True-False-True] 45.0900μs 24.3380μs 41.0881 KOps/s 41.4003 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[False-True-True-False-False] 35.0700μs 14.9036μs 67.0981 KOps/s 66.7183 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[False-True-False-True-True] 75.9410μs 38.6973μs 25.8416 KOps/s 25.7399 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-True-False-True-False] 45.4200μs 25.3428μs 39.4589 KOps/s 39.0590 KOps/s $\color{#35bf28}+1.02\%$
test_step_mdp_speed[False-True-False-False-True] 54.7710μs 26.1695μs 38.2124 KOps/s 38.1348 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[False-True-False-False-False] 63.9010μs 16.6869μs 59.9274 KOps/s 59.5149 KOps/s $\color{#35bf28}+0.69\%$
test_step_mdp_speed[False-False-True-True-True] 57.3510μs 40.4039μs 24.7501 KOps/s 25.1208 KOps/s $\color{#d91a1a}-1.48\%$
test_step_mdp_speed[False-False-True-True-False] 46.3510μs 27.4275μs 36.4597 KOps/s 36.5284 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[False-False-True-False-True] 44.7610μs 25.9088μs 38.5969 KOps/s 38.5772 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-False-True-False-False] 39.9400μs 16.7579μs 59.6734 KOps/s 59.5156 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[False-False-False-True-True] 62.5010μs 41.6534μs 24.0077 KOps/s 23.9746 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[False-False-False-True-False] 44.4410μs 28.9145μs 34.5847 KOps/s 34.4163 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[False-False-False-False-True] 47.0910μs 27.3858μs 36.5152 KOps/s 36.3700 KOps/s $\color{#35bf28}+0.40\%$
test_step_mdp_speed[False-False-False-False-False] 36.2310μs 18.4270μs 54.2681 KOps/s 53.6516 KOps/s $\color{#35bf28}+1.15\%$
test_values[generalized_advantage_estimate-True-True] 24.9748ms 24.2783ms 41.1890 Ops/s 41.9143 Ops/s $\color{#d91a1a}-1.73\%$
test_values[vec_generalized_advantage_estimate-True-True] 81.8019ms 3.1964ms 312.8518 Ops/s 312.5076 Ops/s $\color{#35bf28}+0.11\%$
test_values[td0_return_estimate-False-False] 95.4720μs 64.1381μs 15.5914 KOps/s 15.8736 KOps/s $\color{#d91a1a}-1.78\%$
test_values[td1_return_estimate-False-False] 52.6833ms 52.1084ms 19.1908 Ops/s 19.4760 Ops/s $\color{#d91a1a}-1.46\%$
test_values[vec_td1_return_estimate-False-False] 2.0421ms 1.7510ms 571.0926 Ops/s 571.0161 Ops/s $\color{#35bf28}+0.01\%$
test_values[td_lambda_return_estimate-True-False] 87.5530ms 84.3040ms 11.8618 Ops/s 12.2416 Ops/s $\color{#d91a1a}-3.10\%$
test_values[vec_td_lambda_return_estimate-True-False] 2.1143ms 1.7562ms 569.4216 Ops/s 574.1879 Ops/s $\color{#d91a1a}-0.83\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.2894ms 22.7069ms 44.0395 Ops/s 44.5088 Ops/s $\color{#d91a1a}-1.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.8651ms 0.6888ms 1.4517 KOps/s 1.4423 KOps/s $\color{#35bf28}+0.66\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7429ms 0.6383ms 1.5667 KOps/s 1.5737 KOps/s $\color{#d91a1a}-0.45\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5377ms 1.4464ms 691.3728 Ops/s 695.3644 Ops/s $\color{#d91a1a}-0.57\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.9503ms 0.6612ms 1.5125 KOps/s 1.5240 KOps/s $\color{#d91a1a}-0.75\%$
test_dqn_speed 1.8543ms 1.4494ms 689.9381 Ops/s 697.1634 Ops/s $\color{#d91a1a}-1.04\%$
test_ddpg_speed 3.0080ms 2.6976ms 370.6996 Ops/s 369.4498 Ops/s $\color{#35bf28}+0.34\%$
test_sac_speed 8.8396ms 8.1303ms 122.9967 Ops/s 124.5276 Ops/s $\color{#d91a1a}-1.23\%$
test_redq_speed 11.3188ms 10.3445ms 96.6693 Ops/s 98.3058 Ops/s $\color{#d91a1a}-1.66\%$
test_redq_deprec_speed 12.7122ms 11.5757ms 86.3879 Ops/s 89.7039 Ops/s $\color{#d91a1a}-3.70\%$
test_td3_speed 8.2615ms 8.0604ms 124.0629 Ops/s 124.4582 Ops/s $\color{#d91a1a}-0.32\%$
test_cql_speed 29.2839ms 25.0235ms 39.9624 Ops/s 39.9159 Ops/s $\color{#35bf28}+0.12\%$
test_a2c_speed 5.8457ms 5.2575ms 190.2060 Ops/s 180.0008 Ops/s $\textbf{\color{#35bf28}+5.67\%}$
test_ppo_speed 6.2617ms 5.5796ms 179.2252 Ops/s 169.0288 Ops/s $\textbf{\color{#35bf28}+6.03\%}$
test_reinforce_speed 4.4686ms 4.2558ms 234.9746 Ops/s 218.6771 Ops/s $\textbf{\color{#35bf28}+7.45\%}$
test_iql_speed 20.3793ms 18.9678ms 52.7208 Ops/s 50.5631 Ops/s $\color{#35bf28}+4.27\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.0743ms 2.9141ms 343.1590 Ops/s 349.7132 Ops/s $\color{#d91a1a}-1.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6831ms 0.5478ms 1.8255 KOps/s 1.8515 KOps/s $\color{#d91a1a}-1.41\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.5135ms 0.5229ms 1.9124 KOps/s 1.9273 KOps/s $\color{#d91a1a}-0.77\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.1193ms 2.9242ms 341.9765 Ops/s 350.5398 Ops/s $\color{#d91a1a}-2.44\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6839ms 0.5380ms 1.8588 KOps/s 1.8469 KOps/s $\color{#35bf28}+0.64\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.4401ms 0.5215ms 1.9176 KOps/s 1.9488 KOps/s $\color{#d91a1a}-1.60\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5939ms 1.4649ms 682.6588 Ops/s 693.0847 Ops/s $\color{#d91a1a}-1.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5829ms 1.3937ms 717.5207 Ops/s 730.3826 Ops/s $\color{#d91a1a}-1.76\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.1631ms 3.0354ms 329.4500 Ops/s 337.7659 Ops/s $\color{#d91a1a}-2.46\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3602ms 0.6759ms 1.4794 KOps/s 1.5142 KOps/s $\color{#d91a1a}-2.29\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9227ms 0.6507ms 1.5368 KOps/s 1.3607 KOps/s $\textbf{\color{#35bf28}+12.94\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.9927ms 2.8915ms 345.8359 Ops/s 346.7904 Ops/s $\color{#d91a1a}-0.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.1124s 0.6659ms 1.5018 KOps/s 1.8455 KOps/s $\textbf{\color{#d91a1a}-18.62\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6566ms 0.5220ms 1.9158 KOps/s 1.9051 KOps/s $\color{#35bf28}+0.57\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.1099ms 2.9267ms 341.6787 Ops/s 346.2270 Ops/s $\color{#d91a1a}-1.31\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6254ms 0.5370ms 1.8623 KOps/s 1.4745 KOps/s $\textbf{\color{#35bf28}+26.30\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.7106ms 0.5259ms 1.9014 KOps/s 1.9540 KOps/s $\color{#d91a1a}-2.69\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.1334ms 3.0298ms 330.0520 Ops/s 333.5852 Ops/s $\color{#d91a1a}-1.06\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.1070s 0.8060ms 1.2407 KOps/s 1.4941 KOps/s $\textbf{\color{#d91a1a}-16.96\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8226ms 0.6518ms 1.5342 KOps/s 1.5493 KOps/s $\color{#d91a1a}-0.97\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1022s 6.8035ms 146.9824 Ops/s 112.7702 Ops/s $\textbf{\color{#35bf28}+30.34\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 17.4478ms 14.7121ms 67.9714 Ops/s 68.5237 Ops/s $\color{#d91a1a}-0.81\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.3201ms 1.2001ms 833.2683 Ops/s 923.1280 Ops/s $\textbf{\color{#d91a1a}-9.73\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1044s 8.8059ms 113.5598 Ops/s 147.6741 Ops/s $\textbf{\color{#d91a1a}-23.10\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 16.8006ms 14.5994ms 68.4960 Ops/s 68.5710 Ops/s $\color{#d91a1a}-0.11\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.4425ms 1.3049ms 766.3264 Ops/s 929.5259 Ops/s $\textbf{\color{#d91a1a}-17.56\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1049s 7.2461ms 138.0050 Ops/s 109.0252 Ops/s $\textbf{\color{#35bf28}+26.58\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 17.1322ms 14.9611ms 66.8398 Ops/s 67.0851 Ops/s $\color{#d91a1a}-0.37\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.4233ms 1.6503ms 605.9687 Ops/s 693.4629 Ops/s $\textbf{\color{#d91a1a}-12.62\%}$

@vmoens vmoens added the enhancement label Mar 27, 2024
@vmoens vmoens marked this pull request as draft March 27, 2024 15:36
Comment on lines +62 to +68
as_nested (bool, optional): whether to return the results as nested
tensors. Defaults to ``False``.

.. note:: Using ``split_trajectories(tensordict, as_nested=True).to_padded_tensor(mask=mask_key)``
should produce exactly the same result as ``as_nested=False``. Because this feature is
experimental and relies on nested tensors, whose API may change in the future, it is
opt-in. The runtime should be faster with ``as_nested=True``.
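
To make the equivalence in the note concrete, here is a minimal pure-Python sketch of the two representations: a ragged (nested) split by trajectory, and the padded form with a validity mask. It is not the torchrl implementation and does not call `split_trajectories`. The helper names `split_ragged` and `pad_with_mask` are hypothetical, and the sketch assumes that steps from the same trajectory are stored contiguously.

```python
from itertools import groupby


def split_ragged(values, traj_ids):
    """Group consecutive steps that share a trajectory id into variable-length lists."""
    pairs = zip(traj_ids, values)
    return [[v for _, v in group] for _, group in groupby(pairs, key=lambda p: p[0])]


def pad_with_mask(trajs, pad_value=0.0):
    """Pad each trajectory to the longest length and return a mask of valid steps."""
    max_len = max(len(t) for t in trajs)
    padded = [t + [pad_value] * (max_len - len(t)) for t in trajs]
    mask = [[True] * len(t) + [False] * (max_len - len(t)) for t in trajs]
    return padded, mask


# Six steps belonging to three trajectories of lengths 3, 2 and 1.
rewards = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
traj_ids = [0, 0, 0, 1, 1, 2]

ragged = split_ragged(rewards, traj_ids)   # analogous to the nested representation
padded, mask = pad_with_mask(ragged)       # analogous to the padded representation

print(ragged)
print(padded)
print(mask)
```

The ragged form stores only real steps, while the padded form stores filler values plus a mask. That difference is the intuition behind the note's expectation that the nested path should run faster.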
Contributor Author

@vmoens vmoens Mar 28, 2024


@vmoens vmoens marked this pull request as ready for review March 28, 2024 13:26
Labels
CLA Signed, enhancement
2 participants