SAC with Q-Ensemble for Offline RL

Single-file SAC-N [1] implementation in JAX, with both Flax and Equinox variants. It is 10x faster than the PyTorch SAC-N implementation from CORL [2].

And still easy to use and understand! To run:

python sac_n_jax_flax.py --env_name="halfcheetah-medium-v2" --num_critics=10 --batch_size=256
python sac_n_jax_eqx.py --env_name="halfcheetah-medium-v2" --num_critics=10 --batch_size=256

Optionally, you can pass --config_path pointing to a YAML config file; for more details, see the pyrallis docs.
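A config file passed via --config_path might look like this (field names are assumed to match the CLI flags above; check the script's config dataclass for the full set of options):

```yaml
env_name: "halfcheetah-medium-v2"
num_critics: 10
batch_size: 256
```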

Speed comparison

The main insight here is to jit the whole epoch loop with jax.lax.fori_loop or jax.lax.scan, not just a single network update, as is usually done (in jaxrl2, for instance). With jitting only the update, the speedup would be roughly 1.5x.
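A minimal sketch of the trick, not the repo's actual SAC-N update: a toy quadratic loss and plain gradient steps stand in for the real training step, but the loop structure is the same.

```python
from functools import partial

import jax
import jax.numpy as jnp


def update(params):
    # One gradient step on a toy quadratic loss (stand-in for the SAC-N update).
    grads = jax.grad(lambda p: jnp.sum(p ** 2))(params)
    return params - 0.1 * grads


@partial(jax.jit, static_argnums=1)
def train_epoch(params, num_updates):
    # The entire epoch loop is compiled by XLA: one host-to-device dispatch
    # per epoch instead of one per update.
    return jax.lax.fori_loop(0, num_updates, lambda i, p: update(p), params)


params = train_epoch(jnp.ones(4), 100)
```

Jitting only `update` and calling it in a Python for-loop would still pay a dispatch cost on every iteration, which is where the remaining speedup comes from.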

Both runs were trained on the same V100 GPU.

[Figures: episode return vs. training epochs, and episode return vs. wall-clock time]

References

  1. Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble [code]
  2. Research-oriented Deep Offline Reinforcement Learning Library [code]
