base
: Trial parameters.
    seed
    : RNG seeds.
    eval_freq
    : Number of training timesteps between two evaluations (per trial).
    num_timesteps
    : Number of total training timesteps (per trial).
agent
: Agent parameters.
    name
    : Method (A2C, PPO, DQN, or QRDQN).
    discount
    : Discount factor (gamma).
    alpha
    : Learning step size (alpha).
environment
: Environment parameters.
    map_name
    : Name of the target map.
    data_dir
    : Directory where environment data is stored.
    start_state
    : Start state.
    goal_states
    : List of goal states.
    crosswalk_states
    : List of stochastic crosswalk states.
policy
: Agent network structure (currently only "CnnPolicy" is available).
save_dir
: Directory where experiment data is stored.
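
As a rough illustration of how these parameters fit together, the sketch below writes them out as a nested Python dictionary and runs a few sanity checks. Only the key names and the allowed values for `name` and `policy` come from the list above. Everything else is an assumption made for the example: the concrete values, the representation of states as `[row, col]` pairs, `seed` being a list, `policy` and `save_dir` sitting at the top level, and the `validate_config` helper itself, which is hypothetical and not part of the project.

```python
from typing import Any

# Hypothetical example values; only the key names come from the parameter list.
config: dict[str, Any] = {
    "base": {"seed": [0, 1, 2], "eval_freq": 1000, "num_timesteps": 100000},
    "agent": {"name": "PPO", "discount": 0.99, "alpha": 0.0003},
    "environment": {
        "map_name": "example_map",    # placeholder map name
        "data_dir": "data/",          # placeholder directory
        "start_state": [0, 0],        # assumed [row, col] encoding
        "goal_states": [[5, 5]],
        "crosswalk_states": [[2, 3]],
    },
    "policy": "CnnPolicy",
    "save_dir": "results/",           # placeholder directory
}

SUPPORTED_METHODS = {"A2C", "PPO", "DQN", "QRDQN"}

REQUIRED_KEYS = {
    "base": ["seed", "eval_freq", "num_timesteps"],
    "agent": ["name", "discount", "alpha"],
    "environment": ["map_name", "data_dir", "start_state",
                    "goal_states", "crosswalk_states"],
}


def validate_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    errors: list[str] = []

    # Check that every documented key is present in its section.
    for section, keys in REQUIRED_KEYS.items():
        missing = [k for k in keys if k not in cfg.get(section, {})]
        if missing:
            errors.append(f"{section}: missing keys {missing}")
    if errors:
        return errors  # value checks below need all keys to exist

    if cfg.get("policy") != "CnnPolicy":
        errors.append('policy: only "CnnPolicy" is currently available')
    if "save_dir" not in cfg:
        errors.append("save_dir: missing")

    agent = cfg["agent"]
    if agent["name"] not in SUPPORTED_METHODS:
        errors.append(f"agent.name: unsupported method {agent['name']!r}")
    if not 0.0 <= agent["discount"] <= 1.0:
        errors.append("agent.discount: expected a value in [0, 1]")
    if agent["alpha"] <= 0:
        errors.append("agent.alpha: expected a positive step size")

    base = cfg["base"]
    if not 0 < base["eval_freq"] <= base["num_timesteps"]:
        errors.append("base.eval_freq: expected 0 < eval_freq <= num_timesteps")

    return errors


if __name__ == "__main__":
    problems = validate_config(config)
    print("config OK" if not problems else "\n".join(problems))
```

Returning a list of problems instead of raising on the first one lets a misconfigured experiment report all of its issues in a single run.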