Blackjack-RL: A Reinforcement Learning Exercise for Blackjack

Phase 0

Test against several baseline decision models, which serve as benchmarks:

the dealer's own decision model
a random decision model
a standard 'fixed' decision model based on the dealer's showing card, the player's total, and the soft ace count
the same standard 'fixed' decision model, extended to also take the hi-lo card count into account

In all of the above, the bet amount is fixed: the player always bets the minimum.
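
The four baselines above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function names, the card encoding (aces as 11, face cards as 10), the dealer rule (stand on every 17), the simplified hit/stand table, and the single count-based deviation with its threshold are all assumptions made for the example.

```python
import random

HIT, STAND = "hit", "stand"

def hand_total(cards):
    """Return (total, soft_aces), where soft_aces is the number of aces still counted as 11."""
    total = sum(cards)           # assumed encoding: ace = 11, face cards = 10
    soft_aces = cards.count(11)
    while total > 21 and soft_aces:
        total -= 10              # demote one ace from 11 to 1
        soft_aces -= 1
    return total, soft_aces

def dealer_policy(player_cards, dealer_upcard):
    # Play like the dealer: hit below 17, stand on 17 or more (assumed rule).
    total, _ = hand_total(player_cards)
    return HIT if total < 17 else STAND

def random_policy(player_cards, dealer_upcard, rng=random):
    # Ignore the cards entirely and pick an action uniformly at random.
    return rng.choice([HIT, STAND])

def fixed_policy(player_cards, dealer_upcard):
    # Simplified hit/stand-only table keyed on total, soft aces, and dealer upcard.
    total, soft_aces = hand_total(player_cards)
    if soft_aces:
        if total >= 19:
            return STAND
        if total == 18:
            return HIT if dealer_upcard in (9, 10, 11) else STAND
        return HIT
    if total >= 17:
        return STAND
    if total >= 13:
        return STAND if dealer_upcard <= 6 else HIT
    if total == 12:
        return STAND if dealer_upcard in (4, 5, 6) else HIT
    return HIT

def count_aware_policy(player_cards, dealer_upcard, true_count):
    # Same table, plus one illustrative deviation: stand on hard 16 against a 10
    # when the hi-lo true count is positive (the threshold is an assumption).
    total, soft_aces = hand_total(player_cards)
    if soft_aces == 0 and total == 16 and dealer_upcard == 10 and true_count > 0:
        return STAND
    return fixed_policy(player_cards, dealer_upcard)
```

Because every policy returns only hit or stand, the same simulation loop can score all four with the minimum bet and compare their results directly.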

Phase 1

Test against a Q-learning-trained model that does not count cards and bets a fixed amount. Its only output is hit or stand.
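
A tabular Q-learning agent of this kind could look roughly like the sketch below. It assumes a state tuple of (player total, dealer upcard, soft ace count) and illustrative hyperparameters; none of the names or values come from the repository's notebooks.

```python
import random
from collections import defaultdict

ACTIONS = ("hit", "stand")

def make_q_table():
    # Unseen states start with a value of 0.0 for both actions.
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def choose_action(q, state, epsilon, rng=random):
    # Epsilon-greedy: explore with probability epsilon, otherwise take the best known action.
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])

def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=1.0):
    # Standard Q-learning update; a terminal step has no future value to bootstrap from.
    target = reward if done else reward + gamma * max(q[next_state].values())
    q[state][action] += alpha * (target - q[state][action])
```

A hand that ends in a loss after standing on hard 16 against a 10 would be recorded as `q_update(q, (16, 10, 0), "stand", -1.0, None, True)`.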

Phase 2

Test against a Q-learning-trained model that takes the hi-lo card count into account and bets a fixed amount. Its only output is hit or stand.
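
One way to feed the hi-lo count into a tabular model is to track the running count, convert it to a true count, and clamp it into a small number of buckets so the state space stays manageable. The sketch below assumes a six-deck shoe, aces encoded as 11, and a bucket range of -3 to +3; the class and function names are hypothetical.

```python
def hi_lo_value(card):
    # Hi-lo tags: 2-6 count +1, 7-9 count 0, tens and aces count -1.
    if 2 <= card <= 6:
        return 1
    if 7 <= card <= 9:
        return 0
    return -1

class HiLoCounter:
    def __init__(self, num_decks=6):
        self.total_cards = 52 * num_decks
        self.seen = 0
        self.running = 0

    def observe(self, card):
        # Call once for every card that becomes visible.
        self.seen += 1
        self.running += hi_lo_value(card)

    def true_count(self):
        # Running count divided by the number of decks left in the shoe.
        decks_remaining = max(self.total_cards - self.seen, 1) / 52
        return self.running / decks_remaining

def count_bucket(true_count, limit=3):
    # Round to the nearest integer and clamp to [-limit, limit].
    return max(-limit, min(limit, int(round(true_count))))

def make_state(player_total, dealer_upcard, soft_aces, true_count):
    # State tuple extended with the bucketed count.
    return (player_total, dealer_upcard, soft_aces, count_bucket(true_count))
```

With bucketing, the table grows by a factor of seven relative to a model that ignores the count, rather than by the full range of possible running counts.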

-- we are here --

Phase 3

Test against a Q-learning-trained model that takes the hi-lo card count into account and varies its bet amount accordingly. The hit/stand model is the same one from Phase 2. The bet amount is chosen by a new model, which will be trained using an already-trained instance of the Phase 2 model.
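
Since Phase 3 is not yet implemented, the following is only one possible shape for the bet model, under stated assumptions: bets are chosen from a small set of multiples of the minimum, the state is just the bucketed count, and `play_hand` is a hypothetical callable that plays one hand with the frozen Phase 2 hit/stand model and returns the per-unit result (for example +1, 0, or -1). Each bet decision is treated as a one-step problem, so the update has no bootstrapped future term.

```python
import random
from collections import defaultdict

BET_SIZES = (1, 2, 4)  # assumed multiples of the table minimum

def train_bet_model(play_hand, count_buckets, episodes=10000,
                    alpha=0.05, epsilon=0.1, rng=None):
    rng = rng or random.Random(0)
    q = defaultdict(lambda: {b: 0.0 for b in BET_SIZES})
    for _ in range(episodes):
        # In a real shoe the bucket would come from the card counter;
        # sampling it here keeps the sketch self-contained.
        bucket = rng.choice(count_buckets)
        if rng.random() < epsilon:
            bet = rng.choice(BET_SIZES)
        else:
            bet = max(BET_SIZES, key=lambda b: q[bucket][b])
        # The frozen hit/stand model plays the hand; only the bet model learns.
        outcome = play_hand(bucket, rng)
        reward = bet * outcome
        q[bucket][bet] += alpha * (reward - q[bucket][bet])
    return q

def best_bet(q, bucket):
    # Greedy bet for a given count bucket after training.
    return max(BET_SIZES, key=lambda b: q[bucket][b])
```

Keeping the hit/stand model frozen means the bet model's reward signal reflects only the bet choice, which is one reason to train the two models separately.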

jrkosinski/blackjack-rl
