i-DQN and i-IQN consider multiple consecutive
NB: For simplicity, this branch does not support parameter sharing between the networks. Check out the experiments branch for shared network parameters.
We recommend using Python 3.11.5. In the folder containing the code, create a Python virtual environment, activate it, update pip, and install the package and its dependencies in editable mode:
python3 -m venv env
source env/bin/activate
pip install --upgrade pip setuptools wheel
pip install -e ".[dev,gpu]"
To verify the installation, run the tests with:
pytest
To launch experiments locally, run:
launch_job/[environment]/local_[algorithm].sh [ARGS]
This launches the experiment in a tmux terminal and pushes the logs and performance metrics to a WandB project called "i-dqn". The list of hyperparameters is available in experiments/base/parser_argument.py.
For example:
launch_job/lunar_lander/local_idqn.sh --experiment_name K3_T200_D10 --first_seed 1 --last_seed 1 --n_networks 3 --target_update_frequency 200 --target_sync_frequency 10
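will run i-DQN on Lunar Lander with 3 networks, a target update frequency of 200, and a target sync frequency of 10, for seed 1 only.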
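The experiment name in the example above appears to encode the hyperparameters: K for --n_networks, T for --target_update_frequency, and D for --target_sync_frequency. This is inferred from the example rather than a documented convention. Under that assumption, a minimal sketch that keeps the name and the flags in sync could look like the following. The variable names K, T, D, EXPERIMENT_NAME, and CMD are hypothetical, and the snippet only prints the command instead of running it:

```shell
# Hypothetical helper variables (not part of the repository).
K=3     # value passed to --n_networks
T=200   # value passed to --target_update_frequency
D=10    # value passed to --target_sync_frequency

# Assumed naming convention, inferred from the example name K3_T200_D10.
EXPERIMENT_NAME="K${K}_T${T}_D${D}"

# Build the launch command with the same flags as the example above.
CMD="launch_job/lunar_lander/local_idqn.sh --experiment_name ${EXPERIMENT_NAME} --first_seed 1 --last_seed 1 --n_networks ${K} --target_update_frequency ${T} --target_sync_frequency ${D}"

# Print the command for inspection before running it.
echo "${CMD}"
```

To launch the experiment, run the printed command from the folder containing the code. Changing K, T, or D then updates both the flags and the experiment name, which helps keep WandB runs distinguishable.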
@article{vincent2024iterated,
title={Iterated $ Q $-Network: Beyond the One-Step Bellman Operator},
author={Vincent, Th{\'e}o and Palenicek, Daniel and Belousov, Boris and Peters, Jan and D'Eramo, Carlo},
journal={Transactions on Machine Learning Research},
year={2025}
}