This repository contains the code for configuring, training, evaluating, and tuning imitation learning models. Make sure your computer has an NVIDIA graphics card (less than 16 GB of GPU memory may not be enough to train most of the models) and that the nvidia-smi command works (i.e., the driver is installed).
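You can verify this with:

```bash
# Should print the driver version and list the available GPUs.
nvidia-smi
```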
It is recommended to use Anaconda to manage Python environments. You can download and install Miniconda by running the following commands (if the download is very slow, you can fetch the installer manually from the mirror):
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py38_4.9.2-Linux-x86_64.sh
chmod u+x Miniconda3-py38_4.9.2-Linux-x86_64.sh && ./Miniconda3-py38_4.9.2-Linux-x86_64.sh
Restart your terminal and you can now use conda:
conda config --set auto_activate_base false && conda deactivate
- `configurations`: Configuration files for demonstration, replay, task training and evaluation
- `data_process`: Tools to process data
  - `raw_to_hdf5.ipynb`: Examples for converting raw airbot_play data to hdf5 data for training
  - `test_convert_mmk2.ipynb`: Examples for converting mmk2 raw data to hdf5 data for training
  - `test_convert_mujoco.ipynb`: Examples for converting airbot mujoco raw data to hdf5 data for training
  - `convert_all.py`: Tools to process raw data for training
  - `augment_hdf5_images.py`: Pipeline for augmenting images from the hdf5 files
- `policy_train.py`: Policy training (ACT and yours)
- `policy_evaluate`: Policy evaluating/inferencing (ACT and yours)
- `policies`
  - `common`: Utils for all policies
  - `traditional`: Traditional policy implementations (cnnmlp)
  - `act` & `diffusion`: Policy implementations (ACT, Diffusion Policy)
  - `onnx`: Policy loaded from an onnx model
    - `ckpt2onnx`: Example of converting a ckpt file to an onnx file
    - `onnx_policy.py`: Load an onnx model as the policy
- `detr`: Model definitions modified from policies.common.detr (ACT, CNNMLP)
- `envs`: Environments for `policy_evaluate`: common and AIRBOT Play (real, mujoco, mmk)
- `images`: Images used by README.md
- `conda_env.yaml`: Used to create the conda env (using requirements.txt is now recommended)
- `requirements`: Requirements files for installing the required packages with pip
- `utils.py`: Utils such as data loading and other helper functions
- `visualize_episodes.py`: Save videos from a .hdf5 dataset
- `robot_utils.py`: Useful robot tools to record images and process data
- `ros_tools.py`: Tools for ROS
- `robots`: Robot classes used by the envs
  - `common_robot.py`: Example and a fake robot
  - `ros_robots`
    - `ros_robot_config.py`: Used to configure the ROS robots
    - `ros1_robot.py`: General ROS1 robot class used to control the robots
    - `ros2_robot.py`: General ROS2 robot class used to control the robots
It is recommended to use a conda python environment. If you do not have one, create and activate it by using the following commands:
conda create -n imitall python=3.8.10 && conda activate imitall
Install the necessary packages by running the following commands:
pip install -r requirements/train_eval.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
In addition, for policy evaluation, make sure you have set up the robot control environment (both software and hardware), e.g., for AIRBOT Play, TOK2, MMK2 and so on.
Before training or inference, parameter configuration is necessary. Create a Python file in the ./configurations/task_configs directory with the same name as the task (it is not recommended to modify or rename the example_task.py file directly) to configure it. This configuration mainly involves modifying various paths (using the replace_task_name function to use the default paths, or specifying paths manually), the camera names (camera_names), the number of robots (robot_num, set to 2 for dual-arm tasks), and so on. example_task.py demonstrates how to modify configs based on the default configuration in template.py without rewriting everything (for more adjustable configurations, refer to ./configurations/task_configs/template.py).
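As a rough illustration, a new task config might look like the sketch below. Everything other than replace_task_name, camera_names, and robot_num (which are named above) is an assumption made for the example; refer to example_task.py and template.py for the actual import style and field names.

```python
# configurations/task_configs/my_new_task.py -- illustrative sketch only.
# The import below and the exact variable layout are assumptions; follow the real
# example_task.py / template.py in this repository for the actual structure.
from configurations.task_configs.template import *  # start from the defaults in template.py

TASK_NAME = "my_new_task"

# Use the default data/ckpt paths derived from the task name
# (alternatively, specify each path manually).
replace_task_name(TASK_NAME)

# Override only what differs from template.py.
camera_names = ["0", "1"]  # must match the cameras used during data collection
robot_num = 1              # set to 2 for dual-arm tasks
```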
When training with default paths, place the .hdf5 data files in the ./data/hdf5/<task_name> folder. You can create the directory with the following command:
mkdir -p data/hdf5
You can then copy the data manually or using a command like this (remember to modify the paths in the command):
cp path/to/your/task/hdf5_file data/hdf5
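For example, assuming a task named example_task and the default layout described above, the data could be placed like this (the source path is a placeholder):

```bash
mkdir -p data/hdf5/example_task
# Copy your recorded episodes into the task folder (adjust the source path).
cp path/to/your/task/*.hdf5 data/hdf5/example_task/
```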
Please complete Installation and Parameter Configuration first (at least 2 data files are required for training; otherwise, an error will occur because the data cannot be split into training and validation sets).
Navigate to the repo folder and activate the Conda environment:
conda activate imitall
Then run the training command:
python3 policy_train.py -tn example_task
The training command above, with just the -tn arg, will use the configurations from the .py config file in the configurations/task_configs folder corresponding to the given task name. If you pass command-line parameters (not all parameters support command-line configuration; use the -h arg to show all supported args), they will override the configurations in the config file. This allows temporary parameter changes but is not recommended for regular use.
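For instance, to list the arguments that can be overridden this way:

```bash
# Show all supported command-line arguments for the training script.
python3 policy_train.py -h
```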
After training, by default, you can find two folders in the ./my_ckpt/<task_name>/<time_stamp> directory. The `ckpt` folder (referred to as the process folder) contains all weight files, while the folder with the same name as <task_name> (called the core folder) contains the following files:

- The optimal and final weights: `policy_best.ckpt` and `policy_last.ckpt`, respectively.
- Statistical data: `dataset_stats.pkl`.
- Crucial training information (including initial joint angles, training parameter configurations, etc.): `key_info.pkl`.
- The training loss curves: `train_val_kl_seed_0.png`, `train_val_l1_seed_0.png` and `train_val_loss_seed_0.png`.
- A brief description of the training result: `description.txt`, e.g. `Best ckpt: val loss 0.174929 @ epoch9499 with seed 0`.
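As an illustration, the resulting layout might look like this (the timestamp below is just an example; the file names are those listed above):

```
my_ckpt/example_task/20240325-153007/
├── ckpt/                        # process folder: all saved weight files
└── example_task/                # core folder
    ├── policy_best.ckpt
    ├── policy_last.ckpt
    ├── dataset_stats.pkl
    ├── key_info.pkl
    ├── train_val_kl_seed_0.png
    ├── train_val_l1_seed_0.png
    ├── train_val_loss_seed_0.png
    └── description.txt
```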
For ease of use in the future, it's recommended to store the core folder in the IMITALL/my_ckpt folder of your designated disk.
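For example, if the disk is mounted at /media/your_disk (a placeholder path):

```bash
mkdir -p /media/your_disk/IMITALL/my_ckpt
cp -r my_ckpt/example_task/20240325-153007/example_task /media/your_disk/IMITALL/my_ckpt/
```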
Make sure you have installed the required dependencies for controlling your robots in simulation or reality. The following example shows how to use an AIRBOT Play robotic arm to evaluate a policy.
- First, unplug both the teaching arm and execution arm's USB interfaces to refresh the CAN interface. Then, only connect the execution arm's USB interface (this way, the execution arm will use CAN0).
- Connect the cameras in the same order used during data collection; if you haven't unplugged them since data collection, you can skip this step.
- Long-press the power button of each robotic arm to turn them on.
Navigate to the repo folder and activate the conda environment:
conda activate imitall
Here are the evaluation command and its parameters:
python3 policy_evaluate.py -tn example_task -ci 0 -ts 20240322-194244
- `-ci`: Camera device numbers, corresponding to the device order of the configured camera names. For example, if two cameras are used and their device IDs are 2 and 4, specify `-ci 2 4`.
- `-ts`: Timestamp corresponding to the task (check the path where the policy training results are saved, e.g., ./my_ckpt/example_task/20240325-153007).
- `-can`: Which CAN interface(s) to use for control; the default is can0. Change to can1 with `-can can1`, for example. For dual-arm tasks, specify multiple interfaces, e.g., `-can can0 can1`.
- `-cki`: Don't start the robotic arm; only show the captured camera images, which is useful for verifying that the camera order matches the data collection order.
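Putting these together, a dual-arm task with two cameras (device IDs 2 and 4) controlled over both CAN interfaces could be evaluated as follows (the task name and timestamp are placeholders; use your own):

```bash
python3 policy_evaluate.py -tn example_task -ts 20240325-153007 -ci 2 4 -can can0 can1
```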
After the robotic arm starts and moves to the initial pose defined by the task, you will see instructions in the terminal: press Enter to start inference, and press z and then Enter to end inference and exit. The robotic arm will return to the zero pose before the program exits.
After each evaluation, you can find evaluation-related files (including process videos) in the corresponding timestamp folder inside the eval_results folder in the current directory.
After policy training, key information and dataset stats are stored in the key_info.pkl and dataset_stats.pkl files, which can be viewed using the following steps.
Navigate to the repo folder and activate the conda environment:
conda activate imitall
Then, use the following command to view information for a specified timestamp:
python3 policy_train.py -tn example_task -ts 20240420-214215 -in key_info
You will see key information related to that task in the terminal, including:
This includes the absolute path of the HDF5 data used during training, the training parameter configurations, the initial joint values of the first episode (used for inference), and other information.

This information helps ensure experiment reproducibility. If the camera is rigidly attached to the robotic arm, replicating the robotic arm's behavior is relatively straightforward, and object placement can be determined by reproducing the setup recorded in the training data.
For dataset stats, just set `-in stats` in the above command.
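For example:

```bash
# View the dataset statistics instead of the key info for the same run.
python3 policy_train.py -tn example_task -ts 20240420-214215 -in stats
```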
Refer to README_DataCollection.md