A multi-objective multi-agent reinforcement learning library centered on asymmetric coevolution, built in PyTorch.
Features:
- Full algorithm customization from `.yaml` files.
- Full integration with the Ray library for highly scalable training and inference.
- Fast and stable distributed implementation of the Multi-objective Asymmetric Island Model (MO-AIM) proposed by Dixit and Tumer (2023).
- Integration with Heterogeneous-Agent Reinforcement Learning (HARL) proposed by Zhong et al. (2024).
- Baseline PPO implementations in RLlib and CleanRL.
- Distributed plotting and visualization for supported environments.
- Native filesystem-based multi-agent checkpoints: navigate archipelago checkpoints entirely in a file browser.
Prerequisites:
- swig (e.g., `apt install swig` on Debian variants)
- Python 3 header files (e.g., `apt install python3-dev`)
- uv
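For example, on a Debian-based system the first two prerequisites can be installed with `apt`, and uv with its standalone installer (the installer URL below comes from uv's documentation and may change):

```sh
sudo apt install swig python3-dev
curl -LsSf https://astral.sh/uv/install.sh | sh
```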
Simply run:

```sh
./startup.sh
```

This will create the virtual environment at `.venv`. Then prefix your commands with `uv run` or `uv pip`, or alternatively define the following aliases:

```sh
alias python="uv run --frozen python"
alias pip="uv pip"
```
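With these aliases in place, plain `python` and `pip` commands run inside the managed environment, for example:

```sh
python --version  # runs inside .venv via `uv run --frozen`
pip list          # delegates to `uv pip`
```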
Some common workloads may be found in `scripts/launch`.
To check that everything is working with the MPE simple tag environment, you can run:

```sh
./scripts/launch/cleanrl/moaim/test_simple_tag.sh
```

This launcher invokes the Python script `scripts/training/cleanrl/train_moaim_ppo.py` with an example `.yaml` configuration for simple tag at `config/simple_tag/test_algatross.yml`. An experiment run will be created under `experiments/`, which you can later reference as a checkpoint to resume training or to generate visualizations.
You can load a checkpoint by creating a configuration file with the key `checkpoint_folder` pointing to the experiment created by the training run (e.g., `experiments/c53d9dd7`). You may also override other configuration settings, such as the number of epochs and the number of iterations. For example, modify `checkpoint_folder` in `config/simple_tag/test_algatross_ckpt.yml` to point to the experiment you created in the MWE, then simply run:

```sh
./scripts/launch/cleanrl/moaim/test_simple_tag_ckpt.sh
```

You may also resume from a specific epoch by setting `resume_epoch` in the `.yaml` configuration file. By default, the latest epoch is used.
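For illustration, a minimal resume configuration might contain the following (a sketch only; any other keys come from your base training configuration):

```yaml
# Illustrative sketch of a resume configuration.
checkpoint_folder: experiments/c53d9dd7  # experiment directory from a previous training run
resume_epoch: 10                         # optional; omit to resume from the latest epoch
```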
You can visualize your trained multi-agent team on each island with the island visualization script:

```sh
python scripts/visualize/cleanrl/viz_moaim_island.py experiment_config [num_episodes]
```

The only requirement is that your experiment configuration contains an entry for `checkpoint_folder`.
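For example, to roll out 10 episodes with the checkpoint configuration from the previous section (the episode count is optional):

```sh
python scripts/visualize/cleanrl/viz_moaim_island.py config/simple_tag/test_algatross_ckpt.yml 10
```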
| Environment | Supported | Tested | Special Install Instructions |
|---|---|---|---|
| MPE Simple Tag | Yes | Yes | None! |
| MPE Simple Spread | Yes | Yes | None! |
| SMACv2 | Yes | No | None! |
| STARS MAInspection | Yes | Yes | None! |
Our MO-AIM implementation follows this logical layout:
```
RayExperiment - algatross/experiments/ray_experiment.py
├─ Main entrypoint (e.g., run_experiment)
├─ Environment registration (from RLlib)
├─ Experiment configuration (a dictionary)
│
└─ [RayActor] MOAIMRayArchipelago - algatross/algorithms/genetic/mo_aim/archipelago/ray_archipelago.py
   ├─ Entrypoint for archipelago evolution (evolve())
   │
   ├─ [RayActor] PopulationServer - algatross/algorithms/genetic/mo_aim/population.py
   │  ├─ MOAIMIslandPopulation for island/mainland 0
   │  ├─ ...
   │  └─ MOAIMIslandPopulation for island/mainland n
   │     ├─ PyRibs Emitters
   │     │  └─ Mutation and elites logic
   │     ├─ PyRibs Archive
   │     ├─ Quality Diversity (QD) logic
   │     └─ Rollout buffers
   │
   └─ Dictionary of islands
      ├─ [RayActor] IslandServer for island/mainland 0 - algatross/algorithms/genetic/mo_aim/islands/ray_islands.py
      ├─ ...
      └─ [RayActor] IslandServer for island/mainland n ...
         ├─ Environment (MultiAgentEnv from RLlib)
         └─ Island (UDI)
            ├─ Evolution entrypoint!
            ├─ Algorithm (UDA)
            │  └─ Team evolution logic
            └─ Problem (UDP)
               ├─ Fitness scorer
               └─ Environment Runner (Tape machine) - algatross/environments/runners.py
```
where `[RayActor]` denotes a newly spawned actor in the Ray cluster, which is logically treated as a forked thread that uses the Ray cluster network for IPC.
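As a rough illustration of this pattern (a sketch only; the class and method names here are hypothetical, not the library's actual API), each `[RayActor]` corresponds to a `@ray.remote` class whose methods execute in a separate worker process and communicate through Ray:

```python
# Minimal sketch of the [RayActor] pattern; names are hypothetical.
import ray

ray.init()


@ray.remote
class IslandServer:
    """Stand-in for an island server actor; runs in its own Ray worker process."""

    def __init__(self, island_id: int) -> None:
        self.island_id = island_id

    def evolve(self, generations: int) -> str:
        # A real island would run its UDA/UDP evolution loop here.
        return f"island {self.island_id}: evolved for {generations} generations"


# A dictionary of islands, mirroring the archipelago layout above.
islands = {i: IslandServer.remote(i) for i in range(3)}

# Remote calls return futures immediately; ray.get gathers the results
# over the cluster's object store.
print(ray.get([island.evolve.remote(10) for island in islands.values()]))
```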
Launch this library in an ACE Hub environment with this link:
Alternative: VPN-only link
Launch the latest development version of this library in an ACE Hub environment with this link:
MO-MARL Development Environment
Alternative: VPN-only link
- Developer Guide: a detailed guide for contributing to the MO-MARL repository.
- Message @tgresavage or @wgarcia on Mattermost.
- Troubleshooting FAQ: consult the list of frequently asked questions and their answers.
- Mattermost channel: create a post in the MO-MARL channel for assistance.
- Create a GitLab issue by email.
If you see an authorization error accessing `https://git.act3-ace.com/api/v4/projects/1287/packages/pypi/simple/corl/`, try running the ACT3 login script (or `act3-pt login`), or refresh the access token in your `~/.netrc` file.
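For reference, the relevant `~/.netrc` entry typically looks like the following (illustrative; substitute your own credentials):

```
machine git.act3-ace.com
login <your-username>
password <your-access-token>
```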