Awesome Reinforcement Learning

Icon descriptions:

🚀 - state-of-the-art agent/technique at the moment of paper publication.
⭐ - valuable paper.
- Model-based RL.
- Multi-Agent RL.
- Self-Play.
- Evolutionary & Genetic Algorithms.
- Generalization on unseen environments.
- Auto ML - Architecture search.
- Manipulation tasks.
- Locomotion: MuJoCo, Roboschool, etc.
- Navigation tasks.
- Strategy Planning Problems.
- Transfer learning.
- Inverse Reinforcement Learning.
- Meta-Learning.
- Curiosity Learning, Advanced Exploration.
- Table games (Table).
- Atari game (Atari).
- Doom game (Doom).
- Starcraft game (Starcraft).
- Go game (Go).

RL Frameworks & Implementations

[Stable Baselines3] PyTorch: MaskablePPO, PPO, A2C, DQN, etc

[Baselines @ OpenAI] TensorFlow: PPO, A2C, DQN, TRPO, ACKTR, DDPG, HER, GAIL, etc

[Baselines @ DLR-RM] PyTorch: Custom envs, custom policies

[RLlib @ Ray PyTorch / TensorFlow]

[Dopamine @ Google] TensorFlow: Rainbow, c51, IQN, DQN, etc

[TensorForce] TensorFlow: A3C, PPO, TRPO, DQN, etc

[pytorch-a2c-ppo-acktr] PyTorch: A2C, ACKTR, PPO, GAIL, etc

RL Benchmarks

[OpenAI Benchmarks for PPO, A2C, ACKTR, ACER]

[OpenAI Benchmarks for DQN, Double DQN, Dueling DQN, Prioritized DQN]

[Google Benchmarks for Rainbow, c51, IQN, DQN]

Policy-Based Generic Agents

🚀 [Soft Actor Critic] [blog] [code] 2018 @ Google Brain, UC Berkeley

🚀 [IMPALA] 2018 @ DeepMind

🚀 [Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR, A2C)] 2017 @ Univ. of Toronto, New York Univ.

🚀 [Proximal Policy Optimization Algorithms (PPO)] [blog] 2017 @ OpenAI

🚀 📝 Notes [Asynchronous Methods for Deep Reinforcement Learning (A3C)] 2016 @ Google DeepMind

[High-dimensional continuous control using generalized advantage estimation (GAE)] 2015 @ Berkeley

⭐ [Trust Region Policy Optimization (TRPO)] 2015 @ UC Berkeley

⭐ [Actor-Critic Algorithms, pdf] Konda and Tsitsiklis, 2003

⭐ [Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning (REINFORCE), pdf] Ronald J. Williams, 1992 @ Northeastern Univ.

Value-Based Generic Agents

🚀 [Implicit Quantile Networks for Distributional Reinforcement Learning (IQN)] Dabney et al., 2018 @ Google DeepMind

🚀 [A Distributional Perspective on Reinforcement Learning (c51)] Bellemare et al., 2017 @ Google DeepMind

🚀 [Rainbow: Combining Improvements in Deep Reinforcement Learning] Hessel et al., 2017 @ Google DeepMind

🚀 [Dueling Network Architectures for Deep Reinforcement Learning (Dueling DQN)] Wang et al., 2015 @ Google DeepMind

🚀 📝 Notes [Prioritized Experience Replay] Schaul et al., 2015 @ Google DeepMind

🚀 [Deep Reinforcement Learning with Double Q-learning (Double DQN)] van Hasselt et al., 2015 @ Google DeepMind

🚀 📝 Notes [Human-level control through deep reinforcement learning (DQN)] [pdf] Mnih et al., 2015 @ Google DeepMind

🚀 [Playing Atari with Deep Reinforcement Learning (DQN)] Mnih et al., 2013 @ DeepMind Technologies

⭐ [Temporal Difference Learning and TD-Gammon, pdf] Gerald Tesauro, 1995

Model-Based Generic Agents

[Model-Based Reinforcement Learning for Atari] 2019 @ Google Brain, etc

⭐ [World Models] [blog] 2018 @ IDSIA, Google Brain, NNAISENSE

[Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning] [blog] [code] 2017 @ Berkeley

[Learning model-based planning from scratch] [blog] 2017 @ Google DeepMind

[The Predictron: End-To-End Learning and Planning] 2016 @ Google DeepMind

Evolutionary Algorithms

[Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari] 2018 @ Univ. of Freiburg

⭐ [Deep Neuroevolution] 2017 @ Uber AI Labs

⭐ [Evolution Strategies as a Scalable Alternative to Reinforcement Learning] 2017 @ OpenAI

[Evolving Large-Scale Neural Networks for Vision-Based Reinforcement Learning, pdf] 2013 @ IDSIA, USI-SUPSI

Exploration

🚀 [Go-Explore] 2019 @ Uber AI Labs

[Exploration by Random Network Distillation (RND)] [blog] [code] 2018 @ OpenAI

[Large-Scale Study of Curiosity-Driven Learning] [blog] 2018 @ OpenAI, Berkeley, Univ. of Edinburgh

⭐ [RUDDER: Return Decomposition for Delayed Rewards] [code] 2018 @ Johannes Kepler Univ. Linz

[Deep Curiosity Search] 2018 @ Univ. of Wyoming

[Parameter Space Noise for Exploration] 2017 @ OpenAI, Karlsruhe Inst. of Tech.

⭐ [Imagination-Augmented Agents for Deep Reinforcement Learning (I2As)] [blog] 2017 @ DeepMind

Self-Play

⭐ [Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm] Silver et al., 2017 @ Google DeepMind

⭐ [Mastering the Game of Go without Human Knowledge (AlphaGo Zero), pdf] [blog] Silver et al., 2017 @ DeepMind

[Mastering the game of Go with deep neural networks and tree search (AlphaGo Master)] [reddit] Silver et al., 2016 @ DeepMind, Google

Meta-Learning

[Meta Learning Shared Hierarchies] [blog] Frans et al., 2017 @ OpenAI, Berkeley

[Hybrid Reward Architecture for Reinforcement Learning (HRA)] van Seijen et al., 2017 @ Microsoft Maluuba, McGill Univ.

Multi-Agent RL

[Learning with Opponent-Learning Awareness (LOLA)] [blog] Foerster et al., 2017 @ OpenAI, Oxford, Berkeley, CMU

Inverse RL

[SFV: Reinforcement Learning of Physical Skills from Videos] [blog] Peng et al., 2018 @ Berkeley

[One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning] Finn et al., 2018 @ UC Berkeley

[One-Shot Visual Imitation Learning via Meta-Learning] Finn et al., 2017 @ UC Berkeley, OpenAI

Navigation

[Learning to Navigate in Cities Without a Map] Mirowski et al., 2019 @ DeepMind

[Human-level performance in first-person multiplayer games with population-based deep reinforcement learning] [blog] Jaderberg et al., 2018 @ DeepMind

[Building Generalizable Agents with a Realistic and Rich 3D Environment] Wu et al., 2018 @ Berkeley, FAIR

🚀 [Learning to Navigate in Complex Environments] Mirowski et al., 2017 @ DeepMind

[Distral: Robust Multitask Reinforcement Learning] Teh et al., 2017 @ DeepMind

[RL²: Fast Reinforcement Learning via Slow Reinforcement Learning] Duan et al., 2016 @ Berkeley, OpenAI

⭐ 📝 Notes [Reinforcement Learning with unsupervised auxiliary tasks (UNREAL)] Jaderberg et al., 2016 @ Google DeepMind

🚀 [Learning to act by predicting the future (VizDoom 2016 Full DM Winner)] Dosovitskiy, Koltun, 2016 @ Intel Labs

[Playing FPS Games with Deep Reinforcement Learning (VizDoom 2016 Limited DM 2nd place)] Lample, Chaplot, 2016 @ CMU

Manipulation

[Learning Dexterous In-Hand Manipulation] [blog] Andrychowicz et al., 2018 @ OpenAI

[Asymmetric Actor Critic for Image-Based Robot Learning] [blog] Pinto et al., 2017 @ OpenAI, CMU

[Sim-to-Real Transfer of Robotic Control with Dynamics Randomization] [blog] Peng et al., 2017 @ OpenAI, Berkeley

Locomotion

[Emergence of Locomotion Behaviours in Rich Environments] [blog] Heess et al., 2017 @ DeepMind

[Programmable Agents] Denil et al., 2017 @ Google DeepMind

Auto ML

[AutoAugment: Learning Augmentation Policies from Data] Cubuk et al., 2018 @ Google Brain

⭐ [Regularized Evolution for Image Classifier Architecture Search] Real et al., 2018 @ Google Brain

⭐ [Learning Transferable Architectures for Scalable Image Recognition] Zoph et al., 2017 @ Google Brain

[Neural Optimizer Search with Reinforcement Learning, pdf] Bello et al., 2017 @ Google Brain

[Neural Architecture Search with Reinforcement Learning] B. Zoph and Quoc V. Le, 2016 @ Google Brain

Other Domains

[A Deep Reinforcement Learning Chatbot] Serban et al., 2017 @ MILA

Books

⭐ [Reinforcement Learning: An Introduction, pdf] Richard S. Sutton and Andrew G. Barto, 2018

Search for new Papers

[A Brief Survey of Deep Reinforcement Learning] Arulkumaran et al., 2017

Another Awesome Deep RL list: https://github.com/tigerneil/awesome-deep-rl

Awesome Offline RL: https://github.com/hanjuku-kaso/awesome-offline-rl

ArXiv Sanity Preserver: http://www.arxiv-sanity.com/

GitXiv: http://www.gitxiv.com/

Misc

[How to Read a Paper] S. Keshav, 2007 @ Univ. of Waterloo

[Transformers: Attention Is All You Need] Vaswani et al., 2017 @ Google Brain/Research
