soft-actor-critic

This repo consists of modifications to the Spinningup implementation of the Soft Actor-Critic algorithm to allow for both image observations and discrete action spaces.

Trained Atari agents (courtesy of https://github.com/yining043):

Dependencies:

tensorflow 1.15.0
gym[atari] 0.15.7
cv2 (OpenCV's Python bindings, e.g. from the opencv-python package)
mpi4py
numpy
matplotlib

Implementations of Soft Actor-Critic (SAC) algorithms from:

[1] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Haarnoja et al., 2018, https://arxiv.org/abs/1801.01290
[2] Soft Actor-Critic Algorithms and Applications, Haarnoja et al., 2019, https://arxiv.org/abs/1812.05905
[3] Soft Actor-Critic for Discrete Action Settings, Petros Christodoulou, 2019, https://arxiv.org/abs/1910.07207 (author's implementation: https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch)

Based on the implementations given in Spinningup

https://spinningup.openai.com/en/latest/algorithms/sac.html

Different approaches for discrete setting

Two different methods are provided for using SAC with discrete action spaces:

sac_discrete_gb uses the Gumbel-Softmax distribution to reparameterize the discrete action space. This keeps the algorithm similar to the original SAC implementation for continuous action spaces.
sac_discrete avoids reparameterization and calculates the entropy and KL divergence directly from the discrete action probabilities given by the policy network. This is based on the method described in [3] and stays closest to the original SAC papers; I also find the best results with this method.
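
To make the difference between the two approaches concrete, here is a minimal sketch in plain Python (standard library only). It is not the code from this repo: the function names, the `temperature` and `alpha` parameters, and the example logits and Q-values are all assumptions chosen for illustration. `gumbel_softmax_sample` draws a relaxed one-hot action as in the Gumbel-Softmax approach, while `discrete_policy_terms` computes the entropy and an entropy-regularized policy loss as exact sums over all actions, in the spirit of the method in [3].

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noise = []
    for _ in logits:
        # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)), u ~ U(0, 1).
        # u is clamped away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noise.append(-math.log(-math.log(u)))
    return softmax([l + g for l, g in zip(logits, noise)], temperature)


def discrete_policy_terms(logits, q_values, alpha):
    """Exact entropy and policy loss, summed over all actions (no sampling)."""
    probs = softmax(logits)
    log_probs = [math.log(p + 1e-8) for p in probs]
    entropy = -sum(p * lp for p, lp in zip(probs, log_probs))
    # Expectation under the policy of (alpha * log pi(a|s) - Q(s, a)).
    policy_loss = sum(
        p * (alpha * lp - q) for p, lp, q in zip(probs, log_probs, q_values)
    )
    return entropy, policy_loss


# Hypothetical values for a single state with three actions.
rng = random.Random(0)
logits = [2.0, 0.5, -1.0]
relaxed_action = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
entropy, policy_loss = discrete_policy_terms(
    logits, q_values=[1.0, 0.0, -1.0], alpha=0.2
)
```

Lower temperatures push the relaxed sample closer to a one-hot vector at the cost of noisier gradients. The exact-sum version needs no temperature at all, because a discrete policy already gives the full distribution over actions. In a full SAC agent, `q_values` would come from the critic (typically the minimum of two Q-networks) rather than being fixed numbers.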

Versions of the algorithms that work with image observations, such as the Atari Gym environments, are in the image_observation directory.
