microcosmAI/MuJoCo-Instruction-Following-Thesis

Implementation of a Language Grounding Experiment with microcosm.ai

This is the implementation part of the bachelor's thesis "Natural Language Instruction-Following in a Simulated 3D World using microcosm.ai", submitted to Osnabrück University in March 2024. It contains code for generating a curriculum, training an agent, and testing it. The experiment and major parts of the model are taken from Chaplot et al., 2017 [2]. The code is based on microcosm.ai [1] packages that are still in development, which may cause compatibility issues. The Issues section expands on this.

Features

  • Asynchronous Advantage Actor-Critic (A3C) with LSTM and Gated Attention (GA): The repository implements the A3C algorithm with LSTM and GA.
  • Multiprocessing: We use Python's multiprocessing module to train or test the model across multiple processes simultaneously.
  • Flexible Training and Testing: The script can be used for both training and testing the model. The mode is determined by the --evaluate argument. Other parameters, such as the learning rate, discount factor, or number of training processes, can also be specified from the command line.
  • Model Loading: The script can load a pre-trained model from a specified path, allowing the resumption of training across levels, and evaluation after checkpoints.
  • Curriculum Learning: The script supports curriculum learning, where the model is trained on a series of tasks of increasing difficulty. The curriculum is defined in a specified directory.
  • Environment Visualization: The script includes an option to visualize the environment. This can be useful for debugging and understanding the model's behavior. Note that this does not work during multiprocessing.
  • Logging: We use Tensorboard for logging and visualisation.
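
As an illustration of the Gated Attention idea named above, here is a minimal pure-Python sketch following the description in Chaplot et al. [2]: the instruction embedding is passed through a linear layer with a sigmoid to produce one gate per feature map, and each gate scales its feature map element-wise. The function names, shapes, and weights are hypothetical and chosen for readability. This is not the code used in this repository.

```python
import math
from typing import List

# Hypothetical type alias: feature maps indexed as [channel][row][col].
FeatureMaps = List[List[List[float]]]


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_vector(
    instruction_embedding: List[float],
    weights: List[List[float]],
    biases: List[float],
) -> List[float]:
    """Linear layer plus sigmoid: one gate per image feature map (channel)."""
    gates = []
    for row, bias in zip(weights, biases):
        pre_activation = sum(w * x for w, x in zip(row, instruction_embedding)) + bias
        gates.append(sigmoid(pre_activation))
    return gates


def gated_attention(feature_maps: FeatureMaps, gates: List[float]) -> FeatureMaps:
    """Scale every value in channel c by gates[c] (broadcast element-wise product)."""
    if len(feature_maps) != len(gates):
        raise ValueError("need exactly one gate per feature map")
    return [
        [[gate * value for value in row] for row in fmap]
        for fmap, gate in zip(feature_maps, gates)
    ]


if __name__ == "__main__":
    # Toy example: 2-dimensional instruction embedding, 2 feature maps of size 1x2.
    embedding = [1.0, -1.0]
    weights = [[0.0, 0.0], [0.0, 0.0]]  # zero weights give a gate of 0.5 per channel
    biases = [0.0, 0.0]
    maps = [[[2.0, 4.0]], [[1.0, 3.0]]]
    print(gated_attention(maps, attention_vector(embedding, weights, biases)))
```

In the actual model the gated feature maps would then feed the policy network; that part is omitted here.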

Installation

  1. Copy Repository to Local Device:

    • Clone the repository to your local device using Git:
      git clone https://github.com/microcosmAI/instruction-following
      
  2. Install Conda Environment from environment.yml:

    • Navigate to the root directory of the cloned repository.
    • Run the following command to create a Conda environment from the provided environment.yml file:
      conda env create -f environment.yml
      
  3. Download and Install mujoco-environment Repository:

    • Download and install the s.mujoco-environment repository according to the installation instructions provided in that repository. The required version may be on the dev branch. You can find the installation instructions here.
  4. PITA Algorithm:

    • A version of the PITA algorithm is included in this repository. However, depending on your system and use case, you may need to install a more up-to-date version, which could require adaptations to the provided code. PITA can be found here.

Usage

  • Before training an agent on a curriculum, generate the curriculum by running python curriculum_generation.py
  • This generates a set of 6 levels in the default curriculum directory. Depending on your system, this may take a while (in our testing, Linux was faster than Windows). If you want more or fewer levels, adjust the settings in curriculum_generation.py.
  • To start training with the default parameters, run python a3c_main.py. You can adjust the number of processes with the num-processes flag.
  • To test a trained model, wait until training has saved a model checkpoint, then terminate training or wait for it to finish, and run a3c_main.py with the -e=1 flag.
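
The flags mentioned above can be pictured with a small argparse sketch. It is a hypothetical stand-in for the command line of a3c_main.py, not a copy of it: only -e / --evaluate and num-processes are named in this README, while --lr, --gamma, all default values, and the assumption that 0 means training are illustrative guesses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser resembling the options described in this README (assumed names)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical stand-in for the a3c_main.py command line"
    )
    # -e=1 selects testing; treating 0 as "train" is an assumption.
    parser.add_argument("-e", "--evaluate", type=int, default=0,
                        help="0 = train, 1 = test a saved checkpoint")
    # argparse exposes --num-processes as args.num_processes.
    parser.add_argument("--num-processes", type=int, default=4,
                        help="number of parallel worker processes")
    # Hypothetical names for the learning rate and discount factor options.
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--gamma", type=float, default=0.99, help="discount factor")
    return parser


def mode_from_args(args: argparse.Namespace) -> str:
    """Translate the evaluate flag into a readable mode name."""
    return "test" if args.evaluate == 1 else "train"


if __name__ == "__main__":
    # Parse an explicit list instead of sys.argv so the sketch is reproducible.
    args = build_parser().parse_args(["-e=1", "--num-processes", "2"])
    print(mode_from_args(args), args.num_processes)
```

Check python a3c_main.py --help for the options the script actually accepts.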

Contact

You can reach me at [email protected]

Issues

  • Development initially took place on Ubuntu, but had to be ported to Windows because an image process was incompatible with Linux's X server during multiprocessing. At the time, the program ran under Ubuntu when used with only one process, but since development has progressed since then, we cannot guarantee compatibility.
  • Testing on Apple devices has shown that sensor data (e.g. camera resolution) may have to be defined differently. We have also observed some further incompatibilities.
  • mujoco-env and PITA are both taken from their respective development branches, which has occasionally been an issue when using PITA output for mujoco-env input. Their versions are also very much subject to change.
  • In summary, this project incorporates experimental packages and is highly OS-dependent. If you have questions or run into problems, please open an issue in this repository or contact me.

References

[1] Mayer, J. (2024). microcosm.ai. Retrieved March 19, 2024, from https://microcosm.ai/

[2] Chaplot, D. S., Sathyendra, K. M., Pasumarthi, R. K., Rajagopal, D., & Salakhutdinov, R. (2018). Gated-Attention Architectures for Task-Oriented Language Grounding. ArXiv. Link
