nmruedap/technical_assessment_smarthop

Smarthop machine learning position (technical assessment)

The steps of the technical assessment solution are described below:

Training and testing an agent in the Lunar Lander environment with the Stable Baselines DQN algorithm

  1. The Stable Baselines module was installed with Anaconda to support TensorFlow 1.8.
  2. The DQN agent was trained with the Stable Baselines DQN algorithm, using learning_rate = 0.003 and time_steps = 500000.
  3. After the learning stage, the agent was tested by loading the l_lander_dqn.zip file. Below is a GIF recording of the result.

Lunar lander with DQN training agent recorded video

The Python file lunar_lander_dqn.py checks whether a previously trained model (l_lander_dqn.zip) exists. If it does, the script runs the test environment with the saved model. If it does not, the agent is trained with the DQN algorithm (Stable Baselines).
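The train-or-test decision described above can be sketched roughly as follows. This is a minimal illustration, not the contents of lunar_lander_dqn.py: the function names (choose_mode, train_agent, run_saved_agent), the environment id "LunarLander-v2", and the number of test episodes are assumptions. The Stable Baselines and Gym imports are placed inside the functions so the dispatch logic can be read and run on its own.

```python
import os

MODEL_PATH = "l_lander_dqn.zip"  # file name taken from the README


def choose_mode(model_path=MODEL_PATH, exists=os.path.exists):
    """Return "test" when a saved model is present, otherwise "train"."""
    return "test" if exists(model_path) else "train"


def train_agent(env_id="LunarLander-v2"):
    """Train a DQN agent with the README's hyperparameters and save it.

    env_id is an assumption; the README only says "lunar lander".
    """
    import gym
    from stable_baselines import DQN
    from stable_baselines.deepq.policies import MlpPolicy

    env = gym.make(env_id)
    model = DQN(MlpPolicy, env, learning_rate=0.003, verbose=1)
    model.learn(total_timesteps=500000)
    model.save(MODEL_PATH)
    return model


def run_saved_agent(env_id="LunarLander-v2", episodes=5):
    """Load the saved model and render a few episodes (episodes is assumed)."""
    import gym
    from stable_baselines import DQN

    env = gym.make(env_id)
    model = DQN.load(MODEL_PATH)
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            action, _states = model.predict(obs)
            obs, reward, done, info = env.step(action)
            env.render()
    env.close()


def main():
    # Test if a trained model already exists, otherwise train one first.
    if choose_mode() == "test":
        run_saved_agent()
    else:
        train_agent()
```

Calling main() on a fresh checkout would train and save the model; calling it again would load the saved file and render test episodes.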

Customize environment

The Lunar Lander environment was customized in the Python file lunar_launcher_env.py and is now named LunarLauncherEnv. The goal of the Lunar Launcher environment is to reach the top center of the screen (between the flags). It generates random terrain, which simulates random initial launch angles. If set_random_x_pos=False, the agent always starts at the center of the ground (see image below).

set_random_x_pos = False: random angles at the center of the screen

If set_random_x_pos=True, the agent starts at a random ground position along the x axis (see image below).

set_random_x_pos = True: random angles and random ground positions along the x axis
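The effect of set_random_x_pos on the starting position might be sketched as below. This is a hypothetical helper, not code from lunar_launcher_env.py: the name initial_x, the world_width parameter, and the margin that keeps the agent away from the screen edges are all assumptions made for illustration.

```python
import random


def initial_x(world_width, set_random_x_pos, rng=random, margin=0.1):
    """Pick the horizontal start position on the ground.

    With set_random_x_pos=False the agent starts at the center.
    With set_random_x_pos=True it starts at a uniformly random x,
    kept a (hypothetical) margin away from both screen edges.
    """
    if not set_random_x_pos:
        return world_width / 2.0
    lo = margin * world_width
    hi = (1.0 - margin) * world_width
    return rng.uniform(lo, hi)
```

Passing a seeded random.Random instance as rng makes the start positions reproducible between runs.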

To let an agent perform in this environment, some changes were necessary. They are described below:

  • Friction was increased so that the agent holds a fixed starting position.
  • The ground collision detector was removed; the episode ends when the agent reaches the top.
  • Main engine power was decreased to promote use of the right and left engines, which forces the agent to steer toward the correct launch angle and position.
  • The observation space was reduced to 6 values; the observations related to ground contact detection were removed.
  • The reward system was changed to promote movement and reaching the top center.
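The observation change listed above can be illustrated with a short sketch. The standard Gym Lunar Lander observation has 8 values: x, y, horizontal velocity, vertical velocity, angle, angular velocity, and two leg-contact flags. Dropping the two contact flags leaves 6. The helper name reduce_observation is hypothetical, and it is an assumption that the customized environment keeps the remaining six values in this order.

```python
def reduce_observation(obs):
    """Drop the two leg-contact flags from a standard 8-value observation.

    Expected layout (standard Lunar Lander):
    [x, y, vx, vy, angle, angular_velocity, left_contact, right_contact]
    """
    if len(obs) != 8:
        raise ValueError("expected an 8-value Lunar Lander observation")
    # Keep position, velocity, angle, and angular velocity only.
    return list(obs[:6])
```

An agent trained on the reduced observation therefore has no direct signal for ground contact, which matches the removal of the ground collision detector.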

The Lunar Launcher environment was tested with the DQN algorithm in the Python file using_custom_env_lunarlauncher.py. Below is a GIF recording of the result:

Lunar launcher with DQN training agent recorded video

The folder gym-lunarlauncher was added so that the environment can be installed with pip3 install -e gym-lunarlauncher. However, I have not tested this yet because I am using an Anaconda virtual Python environment, but I hope it works.
