abhishm/pg_rnn_baseline

Baseline for variance reduction in Policy Gradient Algorithm

Modular implementation of the Vanilla Policy Gradient (VPG) algorithm with a baseline, using an RNN policy.

Dependencies

Features

  • A value-function-based baseline that reduces the variance of the vanilla policy gradient estimate
  • An RNN policy that outputs the action probabilities for a reinforcement learning problem
  • A sampler that reshapes trajectories so they can be fed into the RNN policy
  • Gradient clipping to mitigate the exploding gradient problem
  • A GRU cell to mitigate the vanishing gradient problem
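
As a rough illustration of two of the features above (the baseline and gradient clipping), here is a minimal NumPy sketch. It is not the code from this repository: the function names `discounted_returns`, `advantages`, and `clip_by_global_norm`, the discount factor `gamma`, and the use of plain NumPy arrays are all assumptions made for the example.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * G_{t+1} for a single trajectory."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk the trajectory backwards so each step reuses the next step's return.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(rewards, value_estimates, gamma=0.99):
    """Subtract a value-function baseline V(s_t) from the returns-to-go."""
    baseline = np.asarray(value_estimates, dtype=np.float64)
    return discounted_returns(rewards, gamma) - baseline


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most max_norm."""
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # The scale is 1.0 when the gradients are already within the limit.
    scale = min(1.0, max_norm / (global_norm + 1e-12))
    return [g * scale for g in grads], global_norm


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    values = [1.0, 1.0, 1.0]  # hypothetical value-function predictions
    print(advantages(rewards, values, gamma=0.5))
    clipped, norm = clip_by_global_norm([np.array([3.0, 4.0])], max_norm=1.0)
    print(norm, clipped)
```

The advantages, rather than the raw returns, would then weight the log-probability gradients of the policy. Subtracting the baseline leaves the expected gradient unchanged while typically lowering its variance.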

Usage

To train a model on CartPole-v0:

$ python test_graph_pg.py 

To view the training logs in TensorBoard:

$ tensorboard --logdir .