Update README.md
Blake Milner committed Nov 15, 2014
1 parent 5681da7 commit d07c435
Showing 1 changed file with 72 additions and 1 deletion.
73 changes: 72 additions & 1 deletion README.md
@@ -1,4 +1,75 @@
DeepQLearning
=============

A powerful machine learning algorithm utilizing Q-Learning and Neural Networks, implemented using Torch and Lua.
Written by Blake Milner and Jeff Soldate, with help from Eugenio Culurciello and his lab. The work was
done as part of a project for BME495, a Computational Neuroscience course at Purdue. The original
code, written in JavaScript, was developed by Andrej Karpathy, a Ph.D. student at Stanford University.

In many practical engineering scenarios an algorithm must make a series of decisions in order to
accomplish a given task. However, the task itself is not always well-defined, and the intermediate
decisions needed to accomplish it are often complex and ever-changing. Furthermore, information that
contributes to accomplishing the task is often not available until critical intermediate decisions
have already been made. Video games are a good example of situations in which a series of actions is
required to accomplish a task. In recent years, algorithms that learn such action sequences have been
applied successfully to retro Atari video games: http://arxiv.org/pdf/1312.5602v1.pdf.

Reinforcement learning methods that encourage both exploration and strategizing have been developed in
order to address this problem. One of these methods, called Q-learning, learns an estimate of how useful
each action is in each state and uses a policy based on those estimates to select actions.
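
The balance between exploration and exploitation is often handled with an epsilon-greedy policy. The following is a minimal sketch in plain Lua, assuming the utility estimates are already available as an array; the function name `epsilon_greedy` and its parameters are illustrative and are not taken from this package.

```lua
-- Epsilon-greedy action selection (illustrative; not part of deepqlearn.lua).
-- q_values: array of estimated utilities, one entry per action (1-indexed).
-- epsilon:  probability of choosing a random action instead of the best one.
local function epsilon_greedy(q_values, epsilon)
  if math.random() < epsilon then
    -- Explore: pick any action uniformly at random.
    return math.random(#q_values)
  end
  -- Exploit: pick the action with the highest estimated utility.
  local best_action, best_value = 1, q_values[1]
  for action = 2, #q_values do
    if q_values[action] > best_value then
      best_action, best_value = action, q_values[action]
    end
  end
  return best_action
end

-- With epsilon = 0 the policy always exploits, so action 2 is chosen here.
print(epsilon_greedy({0.1, 0.7, 0.2}, 0))
```

A larger `epsilon` makes the agent try more random actions, which helps early in training when the estimates are still unreliable.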

The Q-learning algorithm hinges on a utility function called the Q-function. This function
accepts a state, which contains all pertinent information about the playing field, along with a possible
action. It returns a number that describes the utility of that action. In Q-learning the utility
of an action is evaluated from the immediate reward gained by taking that action plus the
delayed reward that the action may lead to. For large games with many states and possible
actions, storing and updating a utility for every state-action pair is very time-consuming and
computationally intensive. Using a neural network to represent the Q-function addresses many of these
issues because the network can generalize across similar states, eliminating the need to enumerate and
fully explore the state space.
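
For reference, the table-based form of the Q-function update can be sketched in plain Lua as below. This is the standard tabular Q-learning rule, not code from this package, which replaces the table with a neural network. The learning rate `alpha`, the discount factor `gamma`, and all function names are illustrative assumptions.

```lua
-- Tabular Q-learning update (illustrative; deepqlearn uses a neural network instead).
-- alpha (learning rate) and gamma (discount factor) are example values.
local alpha, gamma = 0.5, 0.9

-- Q[state][action] holds the current utility estimate; missing entries count as 0.
local Q = {}

local function get_q(state, action)
  return (Q[state] and Q[state][action]) or 0
end

-- Highest estimated utility among all actions available in a state.
local function max_q(state, num_actions)
  local best = get_q(state, 1)
  for action = 2, num_actions do
    best = math.max(best, get_q(state, action))
  end
  return best
end

-- Move Q(state, action) toward: immediate reward + discounted best utility of next_state.
local function update(state, action, reward, next_state, num_actions)
  local target = reward + gamma * max_q(next_state, num_actions)
  Q[state] = Q[state] or {}
  Q[state][action] = get_q(state, action) + alpha * (target - get_q(state, action))
  return Q[state][action]
end

-- First update from an empty table: 0 + 0.5 * (1.0 + 0.9 * 0 - 0) = 0.5
print(update("s1", 1, 1.0, "s2", 2))
```

The table `Q` grows with every distinct state visited, which is the scaling problem that motivates using a neural network for large state spaces.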

An implementation of the method described above (written in JavaScript) exists and is freely available:
http://cs.stanford.edu/people/karpathy/convnetjs/demo/rldemo.html

However, that package is designed for the browser and is used primarily as a learning tool. DeepQLearning is a
partial port of its Q-learning component to the Lua scripting language. The Neural Network
component is powered by Torch 7, a scientific computing framework used for machine learning. It is the hope
of the authors that this package can be used to fuel further scientific inquiry into this topic.


Installation and Use
====================

Requirements:

* Torch7 (with the nnx and optim packages)
-- A scientific computing framework with wide support for machine learning algorithms. (https://github.com/torch/torch7)


Usage:

The DeepQLearning module can be included in a Lua script using:

```lua
Brain = require 'deepqlearn'
```

The brain must then be initialized with the number of expected inputs and outputs:

```lua
Brain.init(num_inputs, num_outputs)
```

An action can be selected for an input state using:

```lua
action = Brain.forward(state);
```

Learning is performed for the last state passed to Brain.forward by supplying a reward value:

```lua
Brain.backward(reward);
```
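
Put together, the calls above form an observe, act, reward, learn loop. The sketch below shows one possible shape of that loop. So that it runs without Torch, it uses a stand-in `Brain` table exposing the same three functions; the stand-in, the toy environment (`observe`, `reward_for`), the state format, and the assumption that actions are numbered from 1 are all hypothetical and should be checked against deepqlearn.lua.

```lua
-- Stand-in with the same three functions as the module, so the loop runs without Torch.
-- To use the real module, replace this table with: local Brain = require 'deepqlearn'
local Brain = {}
function Brain.init(num_inputs, num_outputs) Brain.num_outputs = num_outputs end
function Brain.forward(state) return math.random(Brain.num_outputs) end  -- random action, assumed 1-indexed
function Brain.backward(reward) Brain.last_reward = reward end

-- Hypothetical toy environment: the state is a single number, and action 1 earns a reward.
local function observe() return { math.random() } end
local function reward_for(action) return (action == 1) and 1 or 0 end

Brain.init(1, 2)  -- one input value, two possible actions
local total_reward = 0
for step = 1, 100 do
  local state  = observe()             -- 1. read the current state
  local action = Brain.forward(state)  -- 2. let the brain choose an action
  local reward = reward_for(action)    -- 3. apply the action and score the outcome
  Brain.backward(reward)               -- 4. learn from the reward
  total_reward = total_reward + reward
end
print("total reward over 100 steps: " .. total_reward)
```

Each call to `Brain.backward` is paired with the most recent call to `Brain.forward`, so the two should always be called in alternation.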
