CartPole

The OpenAI Gym CartPole environment solved with deep reinforcement learning (the cross-entropy method), implemented with Keras.

https://gym.openai.com/envs/CartPole-v0/

The Cross-Entropy Method

The cross-entropy method is model-free, policy-based, and on-policy, which means the following:

  • It doesn't build any model of the environment; it simply tells the agent what to do at every step
  • It approximates the policy of the agent (a minimal sketch of such a network follows this list)
  • It requires fresh data obtained from the environment
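As a concrete illustration of the policy approximation, here is a minimal sketch of such a network in Keras. The layer sizes, activation choices, and the `build_policy_network` name are assumptions for illustration, not necessarily what this repository uses:

```python
# Minimal policy network sketch (assumed architecture, not necessarily
# the one used in this repository).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def build_policy_network(obs_size=4, n_actions=2, hidden_size=128):
    """Map a CartPole observation to a probability distribution over actions."""
    model = Sequential([
        Dense(hidden_size, activation="relu", input_shape=(obs_size,)),
        Dense(n_actions, activation="softmax"),  # probabilities for left/right
    ])
    # Integer action labels work directly with this loss.
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model
```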

Algorithm

  • Play N episodes using the current model and environment.
  • Calculate the total reward for every episode and decide on a reward boundary. Usually, some percentile of all the rewards is used, such as the 50th or 70th.
  • Throw away all episodes with a total reward below the boundary.
  • Train on the remaining "elite" episodes, using observations as the input and the issued actions as the desired output.
  • Repeat from step 1 until we become satisfied with the result (a full loop is sketched below).
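A compact sketch of this loop, assuming the hypothetical `build_policy_network` helper above and the classic Gym API (where `env.reset()` returns only the observation, matching the CartPole-v0 link above). The batch size (16), percentile (70th), and stopping threshold are illustrative choices:

```python
# Cross-entropy method training loop for CartPole (illustrative sketch;
# batch size, percentile, and reward threshold are assumed values).
import gym
import numpy as np

def play_episode(env, model):
    """Run one episode, sampling actions from the current policy."""
    obs = env.reset()
    observations, actions, total_reward, done = [], [], 0.0, False
    while not done:
        probs = model.predict(obs[np.newaxis], verbose=0)[0]
        action = np.random.choice(len(probs), p=probs)
        observations.append(obs)
        actions.append(action)
        obs, reward, done, _ = env.step(action)
        total_reward += reward
    return observations, actions, total_reward

env = gym.make("CartPole-v0")
model = build_policy_network()

for iteration in range(50):
    # Step 1: play a batch of episodes with the current policy.
    episodes = [play_episode(env, model) for _ in range(16)]
    rewards = [r for _, _, r in episodes]
    # Step 2: the reward boundary is a percentile of the batch rewards.
    boundary = np.percentile(rewards, 70)
    # Steps 3-4: discard weak episodes, train on the "elite" ones.
    elite_obs = [o for obs_list, _, r in episodes if r >= boundary for o in obs_list]
    elite_acts = [a for _, act_list, r in episodes if r >= boundary for a in act_list]
    model.fit(np.array(elite_obs), np.array(elite_acts), verbose=0)
    # Step 5: repeat until the result is satisfactory.
    print(f"iteration {iteration}: mean reward {np.mean(rewards):.1f}")
    if np.mean(rewards) > 195:  # CartPole-v0 is considered solved at 195
        break
```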

Limitations

  • For training, our episodes have to be finite and, preferably, short
  • The total reward for the episodes should have enough variability to separate good episodes from bad ones
  • There is no intermediate indication of whether the agent has succeeded or failed

Paper

https://people.smp.uq.edu.au/DirkKroese/ps/eormsCE.pdf

Run

  • Clone the project.
  • Install the packages:

    pipenv install

  • Execute:

    pipenv run python card-pole-control.py
