hurshprasad/RL-easy21

A blackjack-like game, except that cards are drawn with full replacement and there are no aces counting as 1 or 11.
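As a rough illustration of such a game, the sketch below simulates one step of play. Only "with replacement" and "no aces" come from the description above. Everything else is an assumption borrowed from the commonly circulated Easy21 rules: card values 1 to 10, black cards (probability 2/3) adding and red cards subtracting, a bust outside 1 to 21, and a dealer who hits below 17. The names `draw_card`, `is_bust`, and `step` are hypothetical and not taken from this repository.

```python
import random

HIT, STICK = 0, 1  # hypothetical action encoding


def draw_card(rng, first=False):
    """Draw with replacement: value 1-10, no aces.

    Assumption: black cards (prob. 2/3) add, red cards subtract,
    and the first card of each hand is always black.
    """
    value = rng.randint(1, 10)
    if first or rng.random() < 2 / 3:
        return value
    return -value


def is_bust(total):
    """Assumed bust rule: a sum below 1 or above 21 loses."""
    return total < 1 or total > 21


def step(state, action, rng):
    """state = (dealer_sum, player_sum). Returns (next_state, reward, done)."""
    dealer, player = state
    if action == HIT:
        player += draw_card(rng)
        if is_bust(player):
            return (dealer, player), -1, True
        return (dealer, player), 0, False
    # STICK: the dealer plays out, hitting below 17 (assumed rule).
    while not is_bust(dealer) and dealer < 17:
        dealer += draw_card(rng)
    if is_bust(dealer) or player > dealer:
        reward = 1
    elif player < dealer:
        reward = -1
    else:
        reward = 0
    return (dealer, player), reward, True
```

An episode would start from `(draw_card(rng, first=True), draw_card(rng, first=True))` and call `step` until `done` is true.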

Reinforcement Learning approaches below.

Monte Carlo Control

Uses generalized policy iteration (GPI) to optimize Q, with a time-varying scalar step size and an ε-greedy exploration strategy.
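A minimal sketch of what one such update could look like. The specific schedules here are assumptions, not confirmed by this repository: step size α = 1 / N(s, a), exploration rate ε = n0 / (n0 + N(s)), and an undiscounted return shared by every step of the episode. The names `epsilon_greedy`, `mc_update`, and `n0` are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical encoding: 0 = hit, 1 = stick


def epsilon_greedy(Q, N, state, rng, n0=100.0):
    """Pick an action with epsilon = n0 / (n0 + visits to this state)."""
    visits = sum(N[(state, a)] for a in ACTIONS)
    epsilon = n0 / (n0 + visits)
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # explore
    return max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit


def mc_update(Q, N, episode, episode_return):
    """Every-visit Monte Carlo update with step size 1 / N(s, a).

    episode is a list of (state, action) pairs; episode_return is the
    undiscounted return of the whole episode (assumed identical for
    every step because reward only arrives at the end).
    """
    for state, action in episode:
        N[(state, action)] += 1
        alpha = 1.0 / N[(state, action)]
        Q[(state, action)] += alpha * (episode_return - Q[(state, action)])


Q = defaultdict(float)  # action-value estimates
N = defaultdict(int)    # visit counts per (state, action)

# Two hand-made episodes with states written as (dealer, player_sum).
mc_update(Q, N, [((5, 12), 0), ((5, 17), 1)], 1.0)
mc_update(Q, N, [((5, 12), 0)], -1.0)
```

Policy evaluation (`mc_update`) and policy improvement (acting ε-greedily on the current Q) alternate every episode, which is the GPI loop. Because ε shrinks as states are visited more often, the policy gradually becomes greedy.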

Monte Carlo Control V*

TD Learning Sarsa(λ) On-Policy Control

Q(s,a) ← Q(s,a) + α ζ eₜ(s,a), where ζ is the TD error and eₜ(s,a) is the eligibility trace.
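The update above can be sketched as a single backward-view step, reading ζ as the TD error and e_t(s,a) as the eligibility trace (an interpretation of the formula). Accumulating traces, the default values of `alpha`, `gamma`, and `lam`, and the function name `sarsa_lambda_step` are assumptions for illustration, not taken from this repository.

```python
from collections import defaultdict


def sarsa_lambda_step(Q, E, s, a, reward, s_next, a_next, done,
                      alpha=0.1, gamma=1.0, lam=0.5):
    """Apply one backward-view Sarsa(lambda) update in place.

    Q maps (state, action) to a value, E maps (state, action) to its
    eligibility trace. Returns the TD error of this transition.
    """
    target = reward if done else reward + gamma * Q[(s_next, a_next)]
    td_error = target - Q[(s, a)]
    E[(s, a)] += 1.0  # accumulating trace for the pair just visited
    for key in list(E):
        Q[key] += alpha * td_error * E[key]  # Q <- Q + alpha * error * trace
        E[key] *= gamma * lam                # decay every trace
    return td_error


Q = defaultdict(float)
E = defaultdict(float)  # reset to empty at the start of each episode

# A two-step episode: A --hit--> B --stick--> terminal with reward +1.
sarsa_lambda_step(Q, E, "A", 0, 0.0, "B", 1, done=False)
sarsa_lambda_step(Q, E, "B", 1, 1.0, None, None, done=True)
```

The trace lets the terminal reward reach the earlier pair `("A", 0)` in the same update, scaled by the decay factor `gamma * lam`.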

Linear Function Approximation

Q(s, a) = Φ(s, a)ᵀ θ

The feature vector Φ uses coarse coding: overlapping regions that cover the state space, defined over the player's sum and the dealer's initial card value.
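A minimal sketch of a coarse-coded feature vector and the resulting linear Q estimate. The interval boundaries below are assumptions borrowed from the commonly circulated Easy21 assignment (3 dealer intervals × 6 player intervals × 2 actions = 36 binary features), and this repository may use different ones. The names `features` and `q_value` are hypothetical.

```python
# Assumed overlapping intervals (inclusive on both ends).
DEALER_BINS = [(1, 4), (4, 7), (7, 10)]
PLAYER_BINS = [(1, 6), (4, 9), (7, 12), (10, 15), (13, 18), (16, 21)]
ACTIONS = (0, 1)  # hypothetical encoding: 0 = hit, 1 = stick

N_FEATURES = len(DEALER_BINS) * len(PLAYER_BINS) * len(ACTIONS)


def features(dealer, player, action):
    """Return the binary feature vector phi(s, a) as a flat list."""
    phi = [0.0] * N_FEATURES
    a = ACTIONS.index(action)
    for i, (d_lo, d_hi) in enumerate(DEALER_BINS):
        if not d_lo <= dealer <= d_hi:
            continue
        for j, (p_lo, p_hi) in enumerate(PLAYER_BINS):
            if p_lo <= player <= p_hi:
                index = (i * len(PLAYER_BINS) + j) * len(ACTIONS) + a
                phi[index] = 1.0
    return phi


def q_value(theta, dealer, player, action):
    """Linear estimate Q(s, a) = phi(s, a)^T theta."""
    phi = features(dealer, player, action)
    return sum(p * t for p, t in zip(phi, theta))


theta = [0.0] * N_FEATURES  # weights, to be learned
```

Because the intervals overlap, a state such as dealer 4, player 5 activates several features at once, so an update to θ for that state also shifts the estimates of neighbouring states.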