[WIP, CI] Pre-release submitit scripts #1782

vmoens · 2024-01-09T16:06:37Z

Description

In this PR, I propose a script to run all our benchmarks before the release.

cc @matteobettini @albertbou92 @BY571 @giadefa
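A minimal sketch of what such a launcher could look like, assuming the benchmarks are standalone Python scripts and that `submitit` is installed on the cluster. The script paths, the log folder and the timeout are placeholders, not the values used in this PR.

```python
import sys
from pathlib import Path


def build_commands(scripts):
    """Return one command line per benchmark script (paths are placeholders)."""
    return [[sys.executable, str(Path(script))] for script in scripts]


def submit_all(scripts, log_dir="submitit_logs", timeout_min=60):
    """Submit every script as its own job and return the job handles."""
    # Imported lazily so build_commands stays usable without submitit installed.
    import submitit

    executor = submitit.AutoExecutor(folder=log_dir)
    executor.update_parameters(timeout_min=timeout_min)
    jobs = []
    for command in build_commands(scripts):
        # CommandFunction wraps a command line so it can be submitted like a callable.
        jobs.append(executor.submit(submitit.helpers.CommandFunction(command)))
    return jobs
```

Calling `submit_all([...])` returns job objects whose `result()` blocks until the corresponding script has finished, which is enough to check that every script runs to completion.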

TODO:

- Make sure wandb logging is uniform across scripts
- Add a wandb flag with the release name and git commit
- Find a way to report these results and compare them across releases (public wandb channel?)
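One possible way to attach the release name and git commit to every run is sketched below. The project name, the run-name layout and the helper names are assumptions made for illustration; only the keyword names (`project`, `name`, `tags`, `config`) come from `wandb.init`.

```python
import subprocess


def current_commit():
    """Return the short hash of the checked-out commit, or "unknown" outside a repo."""
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            text=True,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        return "unknown"
    return out.strip()


def wandb_init_kwargs(release, script_name, commit=None):
    """Build keyword arguments for wandb.init that identify release and commit."""
    commit = commit or current_commit()
    return {
        "project": "pre-release-benchmarks",  # hypothetical project name
        "name": f"{script_name}-{release}-{commit}",
        "tags": [release, commit],
        "config": {"release": release, "git_commit": commit},
    }
```

A script would then call `wandb.init(**wandb_init_kwargs("vX.Y.Z", "ppo"))`, so that runs can be filtered by release tag or by commit in the wandb UI.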

pytorch-bot · 2024-01-09T16:06:41Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1782

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (4 Unrelated Failures)

As of commit 402c339 with merge base 6c68f7e:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

Examples Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
RuntimeError: Command docker exec -t 36edb9b5d063bedecd77dd458c5052e8f9543bdc6e2f2c50e6ab92d5118d378b /exec failed with exit code 1
Unit-tests on Linux CPU / tests (3.11) / linux-job (gh)
test/test_libs.py::TestGym::test_vecenvs_env[CartPole-v1]
Unit-tests on Windows / unittests-gpu / windows-job (gh)
##[error]The operation was canceled.

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Habitat Tests on Linux / tests (3.9, 11.6) / linux-job (gh)
test/test_libs.py::TestHabitat::test_habitat_render[False-HabitatPick-v0]

This comment was automatically generated by Dr. CI and updates every 15 minutes.

matteobettini · 2024-01-09T16:13:35Z

This is amazing! A public wandb project with the possibility to filter by release would be super cool.

It would be cool if these scripts autogenerated an output file and automatically compared it with the one generated from the previous release.

Maybe just on values like time taken and final reward.
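A rough sketch of that comparison, assuming each release writes a JSON file that maps a script name to its `time_taken` and `final_reward`. The file layout and the key names are hypothetical.

```python
import json
from pathlib import Path


def save_results(path, results):
    """Write {script: {"time_taken": ..., "final_reward": ...}} to a JSON file."""
    Path(path).write_text(json.dumps(results, indent=2, sort_keys=True))


def load_results(path):
    """Read a results file written by save_results."""
    return json.loads(Path(path).read_text())


def compare_results(previous, current, keys=("time_taken", "final_reward")):
    """Return {script: {key: (old, new, relative_change)}} for scripts in both runs."""
    report = {}
    for script in sorted(set(previous) & set(current)):
        row = {}
        for key in keys:
            old, new = previous[script][key], current[script][key]
            # Relative change with respect to the previous release.
            change = (new - old) / abs(old) if old else float("nan")
            row[key] = (old, new, change)
        report[script] = row
    return report
```

Scripts that exist in only one of the two files are skipped here; a real report would probably list them separately.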

github-actions · 2024-01-09T16:14:36Z

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}1$. Worsened: $\large\color{#d91a1a}7$.


Name	Max	Mean	Ops	Ops on Repo `HEAD`	Change
test_single	66.8216ms	64.9256ms	15.4023 Ops/s	15.4669 Ops/s	$\color{#d91a1a}-0.42\%$
test_sync	39.2593ms	34.7817ms	28.7507 Ops/s	28.1704 Ops/s	$\color{#35bf28}+2.06\%$
test_async	0.1052s	34.3404ms	29.1202 Ops/s	29.6435 Ops/s	$\color{#d91a1a}-1.77\%$
test_simple	0.5109s	0.4536s	2.2045 Ops/s	2.2320 Ops/s	$\color{#d91a1a}-1.23\%$
test_transformed	0.6802s	0.6237s	1.6033 Ops/s	1.6666 Ops/s	$\color{#d91a1a}-3.80\%$
test_serial	1.4811s	1.4370s	0.6959 Ops/s	0.7393 Ops/s	$\textbf{\color{#d91a1a}-5.88\%}$
test_parallel	1.4246s	1.3623s	0.7340 Ops/s	0.7238 Ops/s	$\color{#35bf28}+1.42\%$
test_step_mdp_speed[True-True-True-True-True]	0.1175ms	22.3496μs	44.7435 KOps/s	47.0203 KOps/s	$\color{#d91a1a}-4.84\%$
test_step_mdp_speed[True-True-True-True-False]	43.0400μs	13.5767μs	73.6556 KOps/s	76.4892 KOps/s	$\color{#d91a1a}-3.70\%$
test_step_mdp_speed[True-True-True-False-True]	60.0920μs	13.1486μs	76.0537 KOps/s	78.4765 KOps/s	$\color{#d91a1a}-3.09\%$
test_step_mdp_speed[True-True-True-False-False]	28.6940μs	7.9373μs	125.9881 KOps/s	129.0166 KOps/s	$\color{#d91a1a}-2.35\%$
test_step_mdp_speed[True-True-False-True-True]	48.9310μs	23.6160μs	42.3442 KOps/s	44.0039 KOps/s	$\color{#d91a1a}-3.77\%$
test_step_mdp_speed[True-True-False-True-False]	40.2850μs	14.7116μs	67.9736 KOps/s	68.8358 KOps/s	$\color{#d91a1a}-1.25\%$
test_step_mdp_speed[True-True-False-False-True]	37.3090μs	14.2364μs	70.2425 KOps/s	71.8411 KOps/s	$\color{#d91a1a}-2.23\%$
test_step_mdp_speed[True-True-False-False-False]	31.1990μs	9.1455μs	109.3430 KOps/s	111.7120 KOps/s	$\color{#d91a1a}-2.12\%$
test_step_mdp_speed[True-False-True-True-True]	56.3150μs	25.1017μs	39.8380 KOps/s	41.9618 KOps/s	$\textbf{\color{#d91a1a}-5.06\%}$
test_step_mdp_speed[True-False-True-True-False]	44.5730μs	16.1751μs	61.8236 KOps/s	63.4627 KOps/s	$\color{#d91a1a}-2.58\%$
test_step_mdp_speed[True-False-True-False-True]	46.6270μs	14.4115μs	69.3892 KOps/s	71.9262 KOps/s	$\color{#d91a1a}-3.53\%$
test_step_mdp_speed[True-False-True-False-False]	39.0620μs	9.8542μs	101.4797 KOps/s	112.0567 KOps/s	$\textbf{\color{#d91a1a}-9.44\%}$
test_step_mdp_speed[True-False-False-True-True]	53.0300μs	26.2975μs	38.0264 KOps/s	39.2202 KOps/s	$\color{#d91a1a}-3.04\%$
test_step_mdp_speed[True-False-False-True-False]	42.1590μs	17.5322μs	57.0379 KOps/s	59.6444 KOps/s	$\color{#d91a1a}-4.37\%$
test_step_mdp_speed[True-False-False-False-True]	40.3860μs	15.5002μs	64.5153 KOps/s	66.2680 KOps/s	$\color{#d91a1a}-2.64\%$
test_step_mdp_speed[True-False-False-False-False]	27.0800μs	10.4516μs	95.6792 KOps/s	98.8877 KOps/s	$\color{#d91a1a}-3.24\%$
test_step_mdp_speed[False-True-True-True-True]	56.3450μs	25.1789μs	39.7157 KOps/s	41.1181 KOps/s	$\color{#d91a1a}-3.41\%$
test_step_mdp_speed[False-True-True-True-False]	41.7280μs	16.4593μs	60.7560 KOps/s	63.2837 KOps/s	$\color{#d91a1a}-3.99\%$
test_step_mdp_speed[False-True-True-False-True]	58.7700μs	16.7239μs	59.7945 KOps/s	61.1895 KOps/s	$\color{#d91a1a}-2.28\%$
test_step_mdp_speed[False-True-True-False-False]	34.3150μs	10.3588μs	96.5363 KOps/s	97.6159 KOps/s	$\color{#d91a1a}-1.11\%$
test_step_mdp_speed[False-True-False-True-True]	57.5180μs	26.2216μs	38.1366 KOps/s	39.4405 KOps/s	$\color{#d91a1a}-3.31\%$
test_step_mdp_speed[False-True-False-True-False]	42.7090μs	17.4407μs	57.3371 KOps/s	59.1696 KOps/s	$\color{#d91a1a}-3.10\%$
test_step_mdp_speed[False-True-False-False-True]	44.2630μs	17.9407μs	55.7392 KOps/s	58.0454 KOps/s	$\color{#d91a1a}-3.97\%$
test_step_mdp_speed[False-True-False-False-False]	49.3930μs	11.5441μs	86.6240 KOps/s	88.1825 KOps/s	$\color{#d91a1a}-1.77\%$
test_step_mdp_speed[False-False-True-True-True]	69.7400μs	27.6900μs	36.1141 KOps/s	37.4561 KOps/s	$\color{#d91a1a}-3.58\%$
test_step_mdp_speed[False-False-True-True-False]	40.8860μs	18.6637μs	53.5801 KOps/s	54.6284 KOps/s	$\color{#d91a1a}-1.92\%$
test_step_mdp_speed[False-False-True-False-True]	39.1630μs	17.9478μs	55.7171 KOps/s	58.0787 KOps/s	$\color{#d91a1a}-4.07\%$
test_step_mdp_speed[False-False-True-False-False]	36.3680μs	11.6524μs	85.8192 KOps/s	88.7704 KOps/s	$\color{#d91a1a}-3.32\%$
test_step_mdp_speed[False-False-False-True-True]	59.7920μs	28.5760μs	34.9944 KOps/s	36.6186 KOps/s	$\color{#d91a1a}-4.44\%$
test_step_mdp_speed[False-False-False-True-False]	51.4460μs	20.0545μs	49.8640 KOps/s	52.4296 KOps/s	$\color{#d91a1a}-4.89\%$
test_step_mdp_speed[False-False-False-False-True]	91.0510μs	19.0771μs	52.4188 KOps/s	54.9626 KOps/s	$\color{#d91a1a}-4.63\%$
test_step_mdp_speed[False-False-False-False-False]	35.1560μs	12.7653μs	78.3375 KOps/s	80.6416 KOps/s	$\color{#d91a1a}-2.86\%$
test_values[generalized_advantage_estimate-True-True]	15.7119ms	11.8989ms	84.0412 Ops/s	83.9947 Ops/s	$\color{#35bf28}+0.06\%$
test_values[vec_generalized_advantage_estimate-True-True]	34.0547ms	26.2455ms	38.1017 Ops/s	38.1275 Ops/s	$\color{#d91a1a}-0.07\%$
test_values[td0_return_estimate-False-False]	0.3057ms	0.1760ms	5.6814 KOps/s	5.7152 KOps/s	$\color{#d91a1a}-0.59\%$
test_values[td1_return_estimate-False-False]	33.5597ms	25.3332ms	39.4739 Ops/s	38.2633 Ops/s	$\color{#35bf28}+3.16\%$
test_values[vec_td1_return_estimate-False-False]	34.6362ms	26.2301ms	38.1241 Ops/s	37.9592 Ops/s	$\color{#35bf28}+0.43\%$
test_values[td_lambda_return_estimate-True-False]	39.3403ms	35.0746ms	28.5106 Ops/s	28.1280 Ops/s	$\color{#35bf28}+1.36\%$
test_values[vec_td_lambda_return_estimate-True-False]	34.7498ms	26.4254ms	37.8424 Ops/s	37.7579 Ops/s	$\color{#35bf28}+0.22\%$
test_gae_speed[generalized_advantage_estimate-False-1-512]	8.0287ms	7.8961ms	126.6450 Ops/s	126.2066 Ops/s	$\color{#35bf28}+0.35\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512]	2.1974ms	1.9560ms	511.2509 Ops/s	521.9822 Ops/s	$\color{#d91a1a}-2.06\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512]	14.5012ms	0.4558ms	2.1938 KOps/s	2.2555 KOps/s	$\color{#d91a1a}-2.73\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512]	45.8874ms	38.1987ms	26.1789 Ops/s	25.5760 Ops/s	$\color{#35bf28}+2.36\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512]	11.6498ms	2.6300ms	380.2275 Ops/s	379.8000 Ops/s	$\color{#35bf28}+0.11\%$
test_dqn_speed	80.1938ms	8.4111ms	118.8904 Ops/s	121.0248 Ops/s	$\color{#d91a1a}-1.76\%$
test_ddpg_speed	20.1835ms	14.8061ms	67.5397 Ops/s	68.0930 Ops/s	$\color{#d91a1a}-0.81\%$
test_sac_speed	30.6873ms	29.8018ms	33.5550 Ops/s	33.5209 Ops/s	$\color{#35bf28}+0.10\%$
test_redq_speed	44.4541ms	36.1719ms	27.6457 Ops/s	27.8995 Ops/s	$\color{#d91a1a}-0.91\%$
test_redq_deprec_speed	31.6427ms	25.6248ms	39.0248 Ops/s	38.8800 Ops/s	$\color{#35bf28}+0.37\%$
test_td3_speed	30.0732ms	20.6084ms	48.5239 Ops/s	49.3353 Ops/s	$\color{#d91a1a}-1.64\%$
test_cql_speed	90.4254ms	88.1630ms	11.3426 Ops/s	11.2550 Ops/s	$\color{#35bf28}+0.78\%$
test_a2c_speed	36.2691ms	27.2973ms	36.6337 Ops/s	37.0574 Ops/s	$\color{#d91a1a}-1.14\%$
test_ppo_speed	38.8501ms	27.5059ms	36.3558 Ops/s	37.0466 Ops/s	$\color{#d91a1a}-1.86\%$
test_reinforce_speed	35.3261ms	26.2866ms	38.0422 Ops/s	38.5365 Ops/s	$\color{#d91a1a}-1.28\%$
test_iql_speed	71.4600ms	65.1847ms	15.3410 Ops/s	15.8497 Ops/s	$\color{#d91a1a}-3.21\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	1.7831ms	1.4481ms	690.5596 Ops/s	704.3169 Ops/s	$\color{#d91a1a}-1.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	0.6313ms	0.5132ms	1.9484 KOps/s	1.9162 KOps/s	$\color{#35bf28}+1.68\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	8.9688ms	0.5055ms	1.9782 KOps/s	1.9825 KOps/s	$\color{#d91a1a}-0.22\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	2.1903ms	1.4106ms	708.9274 Ops/s	714.3262 Ops/s	$\color{#d91a1a}-0.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	9.0202ms	0.5198ms	1.9237 KOps/s	1.9368 KOps/s	$\color{#d91a1a}-0.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	8.7208ms	0.5029ms	1.9885 KOps/s	2.0269 KOps/s	$\color{#d91a1a}-1.89\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	2.3127ms	1.6332ms	612.2979 Ops/s	623.0102 Ops/s	$\color{#d91a1a}-1.72\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	4.4538ms	0.6528ms	1.5318 KOps/s	1.5334 KOps/s	$\color{#d91a1a}-0.11\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	8.9255ms	0.6399ms	1.5626 KOps/s	1.3300 KOps/s	$\textbf{\color{#35bf28}+17.49\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	2.4215ms	1.4542ms	687.6842 Ops/s	694.6241 Ops/s	$\color{#d91a1a}-1.00\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	8.7727ms	0.5295ms	1.8885 KOps/s	1.9041 KOps/s	$\color{#d91a1a}-0.82\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	2.3473ms	0.5047ms	1.9816 KOps/s	2.0093 KOps/s	$\color{#d91a1a}-1.38\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	2.1466ms	1.4198ms	704.3207 Ops/s	721.3529 Ops/s	$\color{#d91a1a}-2.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	0.6683ms	0.5089ms	1.9649 KOps/s	1.9649 KOps/s	$-0.00\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	0.1415s	0.7063ms	1.4158 KOps/s	1.9674 KOps/s	$\textbf{\color{#d91a1a}-28.04\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	4.1658ms	1.7226ms	580.5185 Ops/s	622.8783 Ops/s	$\textbf{\color{#d91a1a}-6.80\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	0.8438ms	0.6506ms	1.5370 KOps/s	1.5364 KOps/s	$\color{#35bf28}+0.03\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	8.9623ms	0.6599ms	1.5154 KOps/s	1.5387 KOps/s	$\color{#d91a1a}-1.52\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]	0.1370s	17.8499ms	56.0227 Ops/s	59.6306 Ops/s	$\textbf{\color{#d91a1a}-6.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400]	14.6153ms	12.2742ms	81.4718 Ops/s	81.4000 Ops/s	$\color{#35bf28}+0.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]	5.4625ms	1.6164ms	618.6442 Ops/s	618.0077 Ops/s	$\color{#35bf28}+0.10\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]	0.1256s	17.1977ms	58.1473 Ops/s	61.3283 Ops/s	$\textbf{\color{#d91a1a}-5.19\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400]	18.2000ms	12.3785ms	80.7854 Ops/s	80.5087 Ops/s	$\color{#35bf28}+0.34\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]	2.3188ms	1.5395ms	649.5434 Ops/s	628.9736 Ops/s	$\color{#35bf28}+3.27\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]	0.1201s	17.0992ms	58.4823 Ops/s	60.2170 Ops/s	$\color{#d91a1a}-2.88\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400]	19.1332ms	12.6419ms	79.1023 Ops/s	80.2037 Ops/s	$\color{#d91a1a}-1.37\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400]	2.4325ms	1.7516ms	570.9155 Ops/s	598.3157 Ops/s	$\color{#d91a1a}-4.58\%$
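The Change column appears to be the relative difference between the two Ops columns, and the bolded rows are consistent with a 5% cutoff that also reproduces the Improved and Worsened counts above (1 and 7). Both points are inferred from the table itself, not taken from the benchmark tooling, so the sketch below is only an illustration of that reading.

```python
# Units seen in the table; values are converted to operations per second.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text):
    """Convert a cell such as '44.7435 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops, ops_head):
    """Relative change of this PR's throughput against HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change, threshold=5.0):
    """Label a change using the assumed 5% cutoff."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"
```

For the `test_single` row, `percent_change(parse_ops("15.4023 Ops/s"), parse_ops("15.4669 Ops/s"))` rounds to -0.42, matching the table. Recomputing from the displayed values can differ from the reported figure in the last digit because the cells are already rounded.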

github-actions · 2024-01-09T16:20:22Z

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}3$.


Name	Max	Mean	Ops	Ops on Repo `HEAD`	Change
test_single	0.1196s	0.1192s	8.3889 Ops/s	8.3628 Ops/s	$\color{#35bf28}+0.31\%$
test_sync	0.1778s	0.1094s	9.1387 Ops/s	9.0717 Ops/s	$\color{#35bf28}+0.74\%$
test_async	0.2633s	97.3503ms	10.2722 Ops/s	9.9745 Ops/s	$\color{#35bf28}+2.98\%$
test_single_pixels	0.1432s	0.1425s	7.0160 Ops/s	7.0406 Ops/s	$\color{#d91a1a}-0.35\%$
test_sync_pixels	95.6100ms	94.3420ms	10.5997 Ops/s	9.8321 Ops/s	$\textbf{\color{#35bf28}+7.81\%}$
test_async_pixels	0.2530s	91.7800ms	10.8956 Ops/s	10.9130 Ops/s	$\color{#d91a1a}-0.16\%$
test_simple	0.9327s	0.8656s	1.1553 Ops/s	1.1278 Ops/s	$\color{#35bf28}+2.44\%$
test_transformed	1.1712s	1.1104s	0.9006 Ops/s	0.8994 Ops/s	$\color{#35bf28}+0.13\%$
test_serial	2.4932s	2.4363s	0.4105 Ops/s	0.4120 Ops/s	$\color{#d91a1a}-0.37\%$
test_parallel	2.5200s	2.4508s	0.4080 Ops/s	0.4043 Ops/s	$\color{#35bf28}+0.93\%$
test_step_mdp_speed[True-True-True-True-True]	90.6810μs	32.9361μs	30.3618 KOps/s	29.5858 KOps/s	$\color{#35bf28}+2.62\%$
test_step_mdp_speed[True-True-True-True-False]	57.9910μs	19.5841μs	51.0619 KOps/s	50.4642 KOps/s	$\color{#35bf28}+1.18\%$
test_step_mdp_speed[True-True-True-False-True]	37.4010μs	19.0009μs	52.6290 KOps/s	51.3289 KOps/s	$\color{#35bf28}+2.53\%$
test_step_mdp_speed[True-True-True-False-False]	32.5810μs	11.2487μs	88.8992 KOps/s	87.3045 KOps/s	$\color{#35bf28}+1.83\%$
test_step_mdp_speed[True-True-False-True-True]	54.6810μs	35.0199μs	28.5552 KOps/s	28.1636 KOps/s	$\color{#35bf28}+1.39\%$
test_step_mdp_speed[True-True-False-True-False]	53.9210μs	21.5418μs	46.4214 KOps/s	46.2731 KOps/s	$\color{#35bf28}+0.32\%$
test_step_mdp_speed[True-True-False-False-True]	73.0710μs	20.7500μs	48.1929 KOps/s	46.6230 KOps/s	$\color{#35bf28}+3.37\%$
test_step_mdp_speed[True-True-False-False-False]	31.5300μs	13.1707μs	75.9264 KOps/s	73.3395 KOps/s	$\color{#35bf28}+3.53\%$
test_step_mdp_speed[True-False-True-True-True]	0.1014ms	36.8127μs	27.1645 KOps/s	26.8107 KOps/s	$\color{#35bf28}+1.32\%$
test_step_mdp_speed[True-False-True-True-False]	47.3610μs	23.5478μs	42.4668 KOps/s	42.2046 KOps/s	$\color{#35bf28}+0.62\%$
test_step_mdp_speed[True-False-True-False-True]	40.9210μs	20.7283μs	48.2432 KOps/s	47.4387 KOps/s	$\color{#35bf28}+1.70\%$
test_step_mdp_speed[True-False-True-False-False]	43.9500μs	13.1196μs	76.2219 KOps/s	74.8677 KOps/s	$\color{#35bf28}+1.81\%$
test_step_mdp_speed[True-False-False-True-True]	64.9510μs	38.7008μs	25.8393 KOps/s	25.8605 KOps/s	$\color{#d91a1a}-0.08\%$
test_step_mdp_speed[True-False-False-True-False]	51.0000μs	25.0449μs	39.9283 KOps/s	39.3307 KOps/s	$\color{#35bf28}+1.52\%$
test_step_mdp_speed[True-False-False-False-True]	45.6010μs	22.4669μs	44.5100 KOps/s	43.2092 KOps/s	$\color{#35bf28}+3.01\%$
test_step_mdp_speed[True-False-False-False-False]	61.9910μs	14.9370μs	66.9479 KOps/s	65.7302 KOps/s	$\color{#35bf28}+1.85\%$
test_step_mdp_speed[False-True-True-True-True]	65.9410μs	36.7479μs	27.2124 KOps/s	26.5647 KOps/s	$\color{#35bf28}+2.44\%$
test_step_mdp_speed[False-True-True-True-False]	50.9810μs	23.4717μs	42.6045 KOps/s	41.7733 KOps/s	$\color{#35bf28}+1.99\%$
test_step_mdp_speed[False-True-True-False-True]	48.5810μs	24.6899μs	40.5024 KOps/s	40.2473 KOps/s	$\color{#35bf28}+0.63\%$
test_step_mdp_speed[False-True-True-False-False]	73.0810μs	15.1080μs	66.1899 KOps/s	66.0189 KOps/s	$\color{#35bf28}+0.26\%$
test_step_mdp_speed[False-True-False-True-True]	78.0510μs	38.6492μs	25.8738 KOps/s	25.4715 KOps/s	$\color{#35bf28}+1.58\%$
test_step_mdp_speed[False-True-False-True-False]	45.7810μs	25.4847μs	39.2392 KOps/s	38.8202 KOps/s	$\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-True-False-False-True]	58.2910μs	26.8036μs	37.3084 KOps/s	37.6987 KOps/s	$\color{#d91a1a}-1.04\%$
test_step_mdp_speed[False-True-False-False-False]	36.0400μs	17.1130μs	58.4350 KOps/s	58.1277 KOps/s	$\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-False-True-True-True]	87.7220μs	41.1871μs	24.2794 KOps/s	24.4526 KOps/s	$\color{#d91a1a}-0.71\%$
test_step_mdp_speed[False-False-True-True-False]	70.6510μs	27.4601μs	36.4165 KOps/s	36.2970 KOps/s	$\color{#35bf28}+0.33\%$
test_step_mdp_speed[False-False-True-False-True]	48.9410μs	26.7796μs	37.3419 KOps/s	37.6908 KOps/s	$\color{#d91a1a}-0.93\%$
test_step_mdp_speed[False-False-True-False-False]	42.6800μs	16.8912μs	59.2024 KOps/s	58.3662 KOps/s	$\color{#35bf28}+1.43\%$
test_step_mdp_speed[False-False-False-True-True]	79.4010μs	42.3454μs	23.6153 KOps/s	23.6587 KOps/s	$\color{#d91a1a}-0.18\%$
test_step_mdp_speed[False-False-False-True-False]	59.3010μs	29.1572μs	34.2968 KOps/s	33.6295 KOps/s	$\color{#35bf28}+1.98\%$
test_step_mdp_speed[False-False-False-False-True]	47.6400μs	27.9768μs	35.7439 KOps/s	35.8014 KOps/s	$\color{#d91a1a}-0.16\%$
test_step_mdp_speed[False-False-False-False-False]	85.6620μs	18.3799μs	54.4072 KOps/s	52.5897 KOps/s	$\color{#35bf28}+3.46\%$
test_values[generalized_advantage_estimate-True-True]	23.9876ms	23.3469ms	42.8323 Ops/s	42.3087 Ops/s	$\color{#35bf28}+1.24\%$
test_values[vec_generalized_advantage_estimate-True-True]	88.8255ms	3.3244ms	300.8069 Ops/s	306.8661 Ops/s	$\color{#d91a1a}-1.97\%$
test_values[td0_return_estimate-False-False]	92.5510μs	59.9724μs	16.6743 KOps/s	16.4200 KOps/s	$\color{#35bf28}+1.55\%$
test_values[td1_return_estimate-False-False]	52.1700ms	50.6061ms	19.7605 Ops/s	19.6306 Ops/s	$\color{#35bf28}+0.66\%$
test_values[vec_td1_return_estimate-False-False]	2.0767ms	1.7446ms	573.1839 Ops/s	573.6050 Ops/s	$\color{#d91a1a}-0.07\%$
test_values[td_lambda_return_estimate-True-False]	83.3281ms	80.8473ms	12.3690 Ops/s	12.2991 Ops/s	$\color{#35bf28}+0.57\%$
test_values[vec_td_lambda_return_estimate-True-False]	2.0676ms	1.7351ms	576.3429 Ops/s	574.7114 Ops/s	$\color{#35bf28}+0.28\%$
test_gae_speed[generalized_advantage_estimate-False-1-512]	22.5450ms	22.1846ms	45.0763 Ops/s	44.6310 Ops/s	$\color{#35bf28}+1.00\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512]	0.8149ms	0.6784ms	1.4740 KOps/s	1.4670 KOps/s	$\color{#35bf28}+0.47\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512]	0.7103ms	0.6325ms	1.5811 KOps/s	1.5689 KOps/s	$\color{#35bf28}+0.78\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512]	1.5048ms	1.4395ms	694.6992 Ops/s	696.1709 Ops/s	$\color{#d91a1a}-0.21\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512]	0.9033ms	0.6676ms	1.4979 KOps/s	1.5215 KOps/s	$\color{#d91a1a}-1.55\%$
test_dqn_speed	13.7375ms	7.2451ms	138.0241 Ops/s	138.0150 Ops/s	$+0.01\%$
test_ddpg_speed	15.0469ms	14.1560ms	70.6413 Ops/s	71.5739 Ops/s	$\color{#d91a1a}-1.30\%$
test_sac_speed	29.4214ms	28.5475ms	35.0294 Ops/s	35.3562 Ops/s	$\color{#d91a1a}-0.92\%$
test_redq_speed	35.1451ms	34.2067ms	29.2340 Ops/s	29.1921 Ops/s	$\color{#35bf28}+0.14\%$
test_redq_deprec_speed	24.3909ms	23.2884ms	42.9399 Ops/s	42.9790 Ops/s	$\color{#d91a1a}-0.09\%$
test_td3_speed	28.1251ms	19.3191ms	51.7623 Ops/s	52.1326 Ops/s	$\color{#d91a1a}-0.71\%$
test_cql_speed	83.3506ms	82.1534ms	12.1724 Ops/s	12.3287 Ops/s	$\color{#d91a1a}-1.27\%$
test_a2c_speed	26.3946ms	26.1196ms	38.2854 Ops/s	38.2656 Ops/s	$\color{#35bf28}+0.05\%$
test_ppo_speed	27.3573ms	26.5499ms	37.6649 Ops/s	38.0380 Ops/s	$\color{#d91a1a}-0.98\%$
test_reinforce_speed	26.0804ms	25.2919ms	39.5384 Ops/s	39.6272 Ops/s	$\color{#d91a1a}-0.22\%$
test_iql_speed	57.1844ms	56.3166ms	17.7567 Ops/s	17.8549 Ops/s	$\color{#d91a1a}-0.55\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	2.3268ms	1.8978ms	526.9181 Ops/s	524.7346 Ops/s	$\color{#35bf28}+0.42\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	2.2263ms	0.8370ms	1.1947 KOps/s	1.1968 KOps/s	$\color{#d91a1a}-0.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	1.0177ms	0.8239ms	1.2137 KOps/s	1.2177 KOps/s	$\color{#d91a1a}-0.33\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	2.0022ms	1.8520ms	539.9532 Ops/s	527.4472 Ops/s	$\color{#35bf28}+2.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	2.0396ms	0.8261ms	1.2105 KOps/s	1.2181 KOps/s	$\color{#d91a1a}-0.62\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	0.9894ms	0.8151ms	1.2269 KOps/s	1.2257 KOps/s	$\color{#35bf28}+0.09\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	5.3461ms	2.1598ms	463.0027 Ops/s	463.6846 Ops/s	$\color{#d91a1a}-0.15\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	3.4850ms	0.9558ms	1.0462 KOps/s	1.0523 KOps/s	$\color{#d91a1a}-0.57\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	1.1236ms	0.9423ms	1.0612 KOps/s	908.6131 Ops/s	$\textbf{\color{#35bf28}+16.80\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	2.5457ms	1.9022ms	525.7207 Ops/s	527.1238 Ops/s	$\color{#d91a1a}-0.27\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	2.1139ms	0.8384ms	1.1927 KOps/s	1.1997 KOps/s	$\color{#d91a1a}-0.58\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	0.9727ms	0.8276ms	1.2084 KOps/s	1.2075 KOps/s	$\color{#35bf28}+0.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	2.5532ms	1.8709ms	534.5032 Ops/s	533.3138 Ops/s	$\color{#35bf28}+0.22\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	0.9440ms	0.8255ms	1.2114 KOps/s	1.2135 KOps/s	$\color{#d91a1a}-0.18\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	5.0352ms	0.8209ms	1.2182 KOps/s	1.2295 KOps/s	$\color{#d91a1a}-0.91\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	3.2403ms	2.1737ms	460.0459 Ops/s	462.5900 Ops/s	$\color{#d91a1a}-0.55\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	1.0982ms	0.9524ms	1.0500 KOps/s	1.0523 KOps/s	$\color{#d91a1a}-0.22\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	0.1467s	1.1273ms	887.0566 Ops/s	1.0636 KOps/s	$\textbf{\color{#d91a1a}-16.60\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]	0.1201s	12.7497ms	78.4331 Ops/s	55.6738 Ops/s	$\textbf{\color{#35bf28}+40.88\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400]	0.1251s	14.6243ms	68.3793 Ops/s	80.6906 Ops/s	$\textbf{\color{#d91a1a}-15.26\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]	2.5588ms	1.8229ms	548.5674 Ops/s	533.2028 Ops/s	$\color{#35bf28}+2.88\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]	0.1212s	15.0037ms	66.6500 Ops/s	66.4202 Ops/s	$\color{#35bf28}+0.35\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400]	15.2318ms	12.4736ms	80.1693 Ops/s	68.1547 Ops/s	$\textbf{\color{#35bf28}+17.63\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]	2.6310ms	1.8714ms	534.3568 Ops/s	518.5103 Ops/s	$\color{#35bf28}+3.06\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]	0.1228s	17.4554ms	57.2889 Ops/s	65.9708 Ops/s	$\textbf{\color{#d91a1a}-13.16\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400]	15.9176ms	12.5106ms	79.9319 Ops/s	79.5135 Ops/s	$\color{#35bf28}+0.53\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400]	2.8028ms	2.0414ms	489.8618 Ops/s	490.5171 Ops/s	$\color{#d91a1a}-0.13\%$

vmoens · 2024-01-16T17:48:55Z

@albertbou92 Do you plan on working on this or should I keep on doing it?
No worries if you don't have time, but we need it wrapped up by the end of next week :)

albertbou92 · 2024-01-16T19:00:08Z

Yes, I think I will have some time, but I may need a bit of guidance.
So the idea is:

- Make sure that all training scripts log to wandb
- Review the examples to unify the metrics logged
- Centralise all generated data in a specified dir I guess?

right? and then we manually run the script whenever we want to check all examples work as expected and verify that by visual inspection in wandb.

vmoens · 2024-01-17T11:58:02Z

> So the idea is:

First check if we can get all scripts to run OK with something as simple as what I drafted here :)

> Make sure that all training scripts log to wandb

Yes, and logging should follow a uniform format that needs little extra setup (all under the same project, with just one arg in the command line, for instance).

> Review the examples to unify the metrics logged

Yep

> Centralise all generated data in a specified dir I guess?

What do you mean?
For now I think we can do things in-house and share the results between us using the wandb API; in the future, a public display would be great!

> right? and then we manually run the script whenever we want to check all examples work as expected and verify that by visual inspection in wandb.

Yes, all good

Also, we have to check that wandb has the latest commit registered; that would be super useful later (we could even put that commit in the name of the project?)
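The "one arg in the command line" idea could look roughly like the sketch below. It uses `argparse` as a stand-in for whatever configuration system the training scripts actually use, and the flag name and project name are hypothetical.

```python
import argparse


def build_parser():
    """Parser with a single logging flag shared by every training script."""
    parser = argparse.ArgumentParser(description="Example training script")
    parser.add_argument(
        "--wandb-project",
        default=None,
        help="shared wandb project; wandb logging is skipped when omitted",
    )
    return parser


def resolve_project(base, commit=None):
    """Optionally append the commit hash to the project name."""
    return f"{base}-{commit}" if commit else base
```

With this layout, `python script.py --wandb-project my-project` would be the only addition needed per script, and `resolve_project("my-project", "402c339")` gives `"my-project-402c339"` for the variant where the commit is part of the project name.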

albertbou92 · 2024-01-20T14:59:22Z

#1822

init

402c339

facebook-github-bot added the CLA Signed label · Jan 9, 2024

vmoens added the CI label · Jan 9, 2024
