Deep Q-Learning

Deep reinforcement learning using a deep Q-network with a dueling architecture, implemented in TensorFlow.

This AI does not rely on hand-engineered rules or features. Instead, it masters the environment by looking at raw pixels and learning from experience, just as humans do.

Dependencies

  • NumPy
  • OpenAI Gym 0.8
  • Pillow
  • SciPy
  • TensorFlow 1.0

Learning Environment

Uses environments provided by OpenAI Gym.
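
A minimal interaction loop with the Gym 0.8 API is sketched below with a random policy; 'Pong-v0' is just an example environment id, not necessarily the one this project trains on:

```python
import gym

env = gym.make('Pong-v0')  # example environment; any Atari id works

frame = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random action, for illustration only
    frame, reward, done, info = env.step(action)
```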

Preprocessing

Each frame is transformed into a 48×48×3 image with 32-bit float values between 0 and 1. No image cropping is performed. Reward signals are restricted to -1, 0 and 1.
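
A minimal sketch of this preprocessing using the Pillow and NumPy dependencies; the bilinear resampling filter and the use of np.sign for the reward are assumptions, not details taken from the code:

```python
import numpy as np
from PIL import Image

def preprocess(frame):
    """Resize a raw RGB frame to 48x48 and scale its values to [0, 1]."""
    image = Image.fromarray(frame).resize((48, 48), Image.BILINEAR)
    return np.asarray(image, dtype=np.float32) / 255.0

def restrict_reward(reward):
    """Restrict the reward signal to -1, 0 or 1 (here via its sign)."""
    return float(np.sign(reward))
```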

Network Architecture

The input layer consists of a 48×48×3 image.

The first hidden layer convolves 64 filters of size 4×4 with stride 2, followed by a rectifier nonlinearity.

The second hidden layer convolves 64 filters of size 3×3 with stride 2, followed by another rectifier nonlinearity.

The third hidden layer convolves 64 filters of size 3×3 with stride 1, followed by another rectifier nonlinearity.
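
A minimal sketch of this convolutional trunk with the TensorFlow 1.x layers API; 'same' padding is an assumption, as the padding scheme is not stated above:

```python
import tensorflow as tf

frames = tf.placeholder(tf.float32, [None, 48, 48, 3])  # preprocessed input

conv1 = tf.layers.conv2d(frames, 64, 4, strides=2, padding='same',
                         activation=tf.nn.relu)  # -> 24x24x64
conv2 = tf.layers.conv2d(conv1, 64, 3, strides=2, padding='same',
                         activation=tf.nn.relu)  # -> 12x12x64
conv3 = tf.layers.conv2d(conv2, 64, 3, strides=1, padding='same',
                         activation=tf.nn.relu)  # -> 12x12x64

features = tf.reshape(conv3, [-1, 12 * 12 * 64])  # flatten for the dense layers
```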

When using a dueling architecture, the network splits into two streams: one computes the advantage of each possible action, the other the state value.

  • The advantage stream consists of a fully-connected layer with 512 rectified linear units, feeding into as many output nodes as there are actions.

  • The state value stream consists of a fully-connected layer with 512 rectified linear units, feeding into a single output node.

  • The two streams merge and form the output layer. Each output node represents the expected utility of an action.
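
The dueling paper combines the streams by subtracting the mean advantage, which keeps the value and advantage estimates identifiable; a sketch continuing from the features tensor above, with a hypothetical num_actions:

```python
num_actions = 4  # hypothetical; depends on the chosen Gym environment

advantage_hidden = tf.layers.dense(features, 512, activation=tf.nn.relu)
advantage = tf.layers.dense(advantage_hidden, num_actions)  # A(s, a)

value_hidden = tf.layers.dense(features, 512, activation=tf.nn.relu)
value = tf.layers.dense(value_hidden, 1)  # V(s)

# Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), as in Wang et al. (2016).
q_values = value + advantage - tf.reduce_mean(advantage, axis=1, keep_dims=True)
```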

If a dueling architecture is not used:

  • The last hidden layer consists of a fully-connected layer with 512 rectified linear units.
  • The output layer has as many nodes as there are actions. Each output node represents the expected utility of an action.
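
Under the same assumptions, this plain head is a single dense layer on top of the flattened features from the sketch above:

```python
hidden = tf.layers.dense(features, 512, activation=tf.nn.relu)
q_values = tf.layers.dense(hidden, num_actions)  # one Q-value per action
```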

Acknowledgements

Heavily influenced by DeepMind's seminal papers 'Playing Atari with Deep Reinforcement Learning' (Mnih et al., 2013) and 'Human-level control through deep reinforcement learning' (Mnih et al., 2015).

Uses double Q-learning as described in 'Deep Reinforcement Learning with Double Q-learning' (van Hasselt et al., 2016).
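
In double Q-learning, the online network selects the next action and the target network evaluates it. A minimal NumPy sketch of the target computation, with hypothetical inputs:

```python
import numpy as np

def double_q_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute double Q-learning targets for a batch of transitions.

    q_online_next and q_target_next hold the (batch, num_actions) Q-values
    of the next states under the online and target networks, respectively.
    """
    best_actions = np.argmax(q_online_next, axis=1)   # online net selects...
    batch = np.arange(len(best_actions))
    next_values = q_target_next[batch, best_actions]  # ...target net evaluates
    return rewards + gamma * (1.0 - dones) * next_values
```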

Uses the dueling architecture described in 'Dueling Network Architectures for Deep Reinforcement Learning' (Wang et al., 2016).
