Self-balancing bicycle / Autonomous Two Wheel Rover for Martian Surface

An open-source model of a bicycle agent that balances itself and navigates to a given target location in a Martian surface environment.

Background

The hypothetical colonization of Mars has received interest from public space agencies and private corporations, and has received extensive treatment in science fiction writing, film, and art. Reasons for colonizing Mars include curiosity, the potential for humans to provide more in-depth observational research than unmanned rovers, economic interest in its resources, and the possibility that the settlement of other planets could decrease the likelihood of human extinction. The first step toward this goal is the exploration of the Martian surface, and space agencies across the world have sent rovers to Mars for that purpose.

NASA's Perseverance rover is currently one of the best rovers to have been sent to Mars. By Earth vehicle standards, Perseverance is slow; by Martian vehicle standards, however, it is a standout performer. The rover has a top speed on flat, hard ground of 4.2 centimeters per second, or 152 meters per hour, which is a little less than 0.1 miles per hour. For comparison, a 3 mile-per-hour walking pace is 134 centimeters per second, or 4,828 meters per hour.
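The speed comparison above can be checked with a few lines of arithmetic. The sketch below is only a unit-conversion aid; the helper names are illustrative and are not part of this project.

```python
# Unit conversions behind the rover-versus-walking comparison.
# Helper names are illustrative, not taken from the repository.
CM_PER_MILE = 160_934.4  # 1 mile = 1609.344 m


def cm_per_s_to_m_per_h(v_cm_s):
    """Convert centimeters per second to meters per hour."""
    return v_cm_s * 3600 / 100


def cm_per_s_to_mph(v_cm_s):
    """Convert centimeters per second to miles per hour."""
    return v_cm_s * 3600 / CM_PER_MILE


def mph_to_cm_per_s(v_mph):
    """Convert miles per hour to centimeters per second."""
    return v_mph * CM_PER_MILE / 3600


rover_m_per_h = cm_per_s_to_m_per_h(4.2)   # about 151 m/h, close to the quoted 152
rover_mph = cm_per_s_to_mph(4.2)           # a little under 0.1 mph
walk_cm_s = mph_to_cm_per_s(3.0)           # about 134 cm/s
walk_m_per_h = cm_per_s_to_m_per_h(walk_cm_s)  # about 4,828 m/h

print(rover_m_per_h, rover_mph, walk_cm_s, walk_m_per_h)
```

At these figures a walking pace is roughly 32 times faster than the rover's top speed.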

Our objective is therefore to create a bike that can explore the Martian surface faster while remaining energy efficient.

Mars Environment Overview

SMALL PLANET: If the Sun were as tall as a typical front door, Earth would be the size of a dime, and Mars would be about as big as an aspirin tablet.
LONGER DAYS: One day on Mars takes a little over 24 hours. Mars makes a complete orbit around the Sun (a year in Martian time) in 687 Earth days.
RUGGED TERRAIN: Mars is a rocky planet. Its solid surface has been altered by volcanoes, impacts, winds, crustal movement and chemical reactions.
ATMOSPHERE: Mars has a thin atmosphere made up mostly of carbon dioxide (CO2), argon (Ar), nitrogen (N2), and a small amount of oxygen and water vapor.
MANY MISSIONS: Several missions have visited this planet, from flybys and orbiters to rovers on the surface. The first true Mars mission success was the Mariner 4 flyby in 1965.
TOUGH PLACE FOR LIFE: At this time, Mars' surface cannot support life as we know it. Current missions are determining Mars' past and future potential for life.
RUSTY PLANET: Mars is known as the Red Planet because iron minerals in the Martian soil oxidize, or rust, causing the soil and atmosphere to look red.

Types of Tasks

Obstacle Avoidance

  In this task, our goal is to explore the Martian surface while avoiding obstacles.

Target Following

  In this task, our goal is to reach a given target location on the Martian surface autonomously.

Solution

For both of the above tasks, we used a Reinforcement Learning (RL) approach.

Algorithm

Deep Deterministic Policy Gradient (DDPG) is a reinforcement learning technique that combines ideas from Q-learning and policy gradients. As an actor-critic technique, it consists of two models: an actor and a critic. The actor is a policy network that takes the state as input and outputs a specific continuous action instead of a probability distribution over actions; this is what “deterministic” refers to. The critic is a Q-value network that takes the state and action as input and outputs the Q-value. DDPG is an off-policy method designed for continuous action settings, and it is an improvement over the vanilla actor-critic.
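To make the actor and critic roles concrete, here is a minimal sketch of two update rules used in standard DDPG: the critic's bootstrapped target and the soft update of target networks. It uses plain Python with linear toy models in place of neural networks. The values of `GAMMA` and `TAU` and all function names are assumptions for illustration and are not taken from DDPG.py, actor.py, or critic.py.

```python
GAMMA = 0.99   # discount factor (assumed value)
TAU = 0.005    # soft-update rate for target weights (assumed value)


def actor(weights, state):
    """Deterministic policy: maps a state directly to one continuous action."""
    return sum(w * s for w, s in zip(weights, state))


def critic(weights, state, action):
    """Q-value estimate for a (state, action) pair, using a linear toy model."""
    features = list(state) + [action]
    return sum(w * f for w, f in zip(weights, features))


def td_target(reward, next_state, done, target_actor_w, target_critic_w):
    """Critic target: y = r + gamma * Q'(s', mu'(s')), or just r at episode end."""
    if done:
        return reward
    next_action = actor(target_actor_w, next_state)
    return reward + GAMMA * critic(target_critic_w, next_state, next_action)


def soft_update(target_w, online_w, tau=TAU):
    """Move each target weight a small step toward the online weight."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target_w, online_w)]


# Toy numbers, chosen only to exercise the functions.
actor_w = [0.5, -0.2]
critic_w = [0.1, 0.3, 1.0]
y = td_target(1.0, [1.0, 2.0], False, actor_w, critic_w)
new_target_w = soft_update([0.0, 0.0], [1.0, 1.0])
print(y, new_target_w)
```

Because the actor returns a single action rather than a distribution, the critic can be evaluated directly at that action, which is what makes the target above cheap to compute in continuous action spaces.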

Procedure:

First, we create an OpenAI Gym environment in which we can control our agent (the Bike) in an environment (the Martian surface).
Next, we set up a reward function (R); the goal of our agent is to maximize this reward function.
We then train our agent to maximize this reward function with a Reinforcement Learning algorithm (DDPG), using experience the agent collects by interacting with the environment.
We plot the cumulative reward to see whether the agent's training is converging.
Finally, we save and test the model.
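The steps above can be sketched as a short interaction loop. This is a minimal illustration under stated assumptions, not the project's training code: `ToyBikeEnv` is a hypothetical one-dimensional stand-in that mimics the classic Gym `reset()`/`step()` interface, a random action replaces the DDPG actor, and `moving_average` is an illustrative smoothing helper for the cumulative-reward curve.

```python
import random
from collections import deque


class ToyBikeEnv:
    """Hypothetical stand-in exposing the classic Gym reset()/step() interface."""

    def __init__(self, max_steps=50):
        self.max_steps = max_steps
        self.t = 0
        self.distance = 10.0

    def reset(self):
        self.t = 0
        self.distance = 10.0
        return [self.distance]

    def step(self, action):
        self.t += 1
        self.distance = max(0.0, self.distance - action)
        reached = self.distance == 0.0
        reward = 10.0 if reached else -0.1
        done = reached or self.t >= self.max_steps
        return [self.distance], reward, done, {}


def moving_average(values, window):
    """Average each value with up to `window - 1` values before it."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def run(episodes=20, seed=0):
    rng = random.Random(seed)
    env = ToyBikeEnv()
    buffer = deque(maxlen=10_000)  # replay buffer of past transitions
    returns = []
    for _ in range(episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            # Placeholder for the actor's output plus exploration noise.
            action = rng.uniform(0.0, 1.0)
            next_state, reward, done, _ = env.step(action)
            buffer.append((state, action, reward, next_state, done))
            # A DDPG update on a minibatch sampled from `buffer` would go here.
            state, total = next_state, total + reward
        returns.append(total)
    return returns, buffer


returns, buffer = run()
smoothed = moving_average(returns, window=5)
print(smoothed[-1])
```

Plotting `smoothed` against the episode index gives the kind of curve used to judge convergence: a smoothed return that keeps rising and then flattens suggests training has stabilized.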

Components of RL

Agent: Bike
Environment: Mars surface and obstacles (stones)
Action:
1. The Bike's handlebar position (i.e., the angle to which the handlebar should be moved)
2. The Bike's speed (i.e., the torque on the wheels)
Observation:
1. Position and orientation of the Bike.
2. Position of the target location.
3. The distance traveled by rays before colliding with an obstacle.
Reward:
1. Positive reward if the Bike reaches the target location.
2. Negative reward if the Bike collides with an obstacle.
3. Negative reward if the Bike gets too far away from the target location.
4. Negative reward if the Bike keeps rotating in a circular path.
5. Negative reward if the Bike takes more than 1000 timesteps to complete the episode.
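The reward terms listed above can be combined into a single function along the following lines. This is a sketch only: the signs and the 1000-timestep limit follow the list, while every magnitude, the `MAX_DISTANCE` threshold, the choice of which events end the episode, and the function and argument names are hypothetical and are not taken from env_obstacle.py or env_obstacle_plus_target.py.

```python
# Hypothetical magnitudes; only the signs follow the reward list above.
R_TARGET = 100.0      # reached the target location
R_COLLISION = -50.0   # hit an obstacle
R_TOO_FAR = -50.0     # strayed too far from the target
R_CIRCLING = -10.0    # keeps rotating in a circular path
R_TIMEOUT = -20.0     # exceeded the timestep limit

MAX_DISTANCE = 30.0   # assumed "too far" threshold
MAX_TIMESTEPS = 1000  # limit stated in the reward list


def compute_reward(reached_target, collided, distance_to_target,
                   is_circling, timestep):
    """Return (reward, done) for one step.

    Which events terminate the episode is an assumption of this sketch.
    """
    if reached_target:
        return R_TARGET, True
    if collided:
        return R_COLLISION, True
    if distance_to_target > MAX_DISTANCE:
        return R_TOO_FAR, True
    if timestep > MAX_TIMESTEPS:
        return R_TIMEOUT, True
    if is_circling:
        return R_CIRCLING, False
    return 0.0, False


print(compute_reward(False, True, 5.0, False, 120))
```

Checking the terminal events first keeps the per-step penalty for circling from masking a collision or a successful arrival in the same step.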

GUI on Spartificial website

We will be adding this project as an interactive small game on our website very soon.

License

This project is licensed under the Apache-2.0 License. For more details, see the LICENSE file.
