Skip to content

Play Rock, Paper, Scissors (Kaggle competition) with Reinforcement Learning: bandits, tabular Q-learning and PPO with LSTM.

Notifications You must be signed in to change notification settings

alxthm/rld-project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

50 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Playing Rock, Paper, Scissors with Reinforcement Learning

This is an attempt to apply RL algorithms at the Rock, Paper, Scissors (RPS) Kaggle competition.

.
├── dojo
│     ├── black_belt
│     ├── blue_belt
│     ├── white_belt
│     └── my_little_dojo
├── runs
├── eval.py
├── ppo_model.py
├── ppo_train.py
└── test.py

Algorithms

We implemented and evaluated our algorithms against a number of baselines from [1] (white, blue and black belt baselines).

Our agents are located in the dojo/my_little_dojo folder:

  • Simple Multi-armed-bandits: jotaro.py
  • Bandit of bandits (Meta-bandits/Banditception): giorno.py
  • Tabular Q-learning: saitama.py
  • PPO-LSTM (considering a POMDP setting): levi.py

Note: Unlike other agents, the PPO-LSTM agent does not learn a strategy online from scratch, but is pre-trained against a number of white belt baselines offline. Weights of pre-trained agents are located in the runs folder.

How to run

Our implementation uses PyTorch and, in addition to common python data libraries, you may need to install the kaggle-environments library.

To evaluate an agent against the other baselines, run the eval.py specifying the agent name and the baseline against which you want to evaluate it.

I you want to run an algorithm in debug mode (not possible with the default Kaggle environment), you can use the test.py file, which reproduces the RPS game. This can be useful if eval.py fails silently (agent returning None), but please make sure you have all required libraries installed (torch, tensorboard, etc).

Run ppo_train.py if you want (specifying one or multiple opponents) to train the PPO-LSTM agent.

References

[1] RPS Dojo Kaggle Notebook (https://www.kaggle.com/chankhavu/rps-dojo)

About

Play Rock, Paper, Scissors (Kaggle competition) with Reinforcement Learning: bandits, tabular Q-learning and PPO with LSTM.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages