# Using the MuJoCo RL Environment
Here's a breakdown of the arguments used in a reinforcement learning multi-agent environment that utilizes MuJoCo:
- `agents`: A list of agents participating in the environment. Each name must match the name of the MuJoCo body that is the top-level object of that agent in the environment.
- `xmlPath`: The file path to the XML description of the environment. MuJoCo uses XML files to define physical simulations. Users can also pass a list of XML file paths; in that case, one level is selected at random at each `reset()`.
- `infoJson`: An optional JSON file that provides additional information or configuration for the environment. Users can pass either a path to one file or a list of file paths. If a list is provided, the file names of the JSON files have to match the names of the XML level files. If a list of XML files is provided but only one JSON file, that JSON file is used for all levels.
- `renderMode`: A boolean flag indicating whether to enable rendering of the environment. If set to `True`, the environment is visually displayed during the simulation. Be careful when setting this flag to `True` while using Ray, as each worker will spawn its own environment and start the rendering process.
- `exportPath`: The path to which the environment exports each frame for use in Unreal Engine for better visualization.
- `freeJoint`: A boolean flag specifying whether free joints are used for movements in the environment. If set to `False`, the actuators are used instead. If set to `True`, the environment ignores all actuators and only uses free-joint movements.
- `skipFrames`: The number of frames to skip between each agent action. This value determines the granularity of the simulation steps.
- `maxSteps`: The maximum number of steps (time-steps) allowed in the environment before the simulation terminates. If this limit is reached, the simulation ends regardless of the agents' progress.
- `rewardFunctions`: A list of reward functions that define the rewards given to the agents based on their actions and the state of the environment. Reward functions are typically designed to encourage desired behavior or the achievement of specific goals (a hedged example follows this list).
- `doneFunctions`: A list of done functions that determine when an episode or simulation should be considered complete or terminated. Done functions define conditions that signify the end of an episode, such as reaching a goal or exceeding a time limit.
- `environmentDynamics`: A list of classes, each of which must implement an `__init__(self, mujoco_gym)` and a `dynamic(self, agent, actions)` method. The latter must return a reward and a NumPy array of observations, in that order. Note that every dynamic also needs to specify a `self.observation_space = {"low":[], "high":[]}` and a `self.action_space = {"low":[], "high":[]}` in its constructor (see the sketch below).
- `agentCameras`: A boolean flag indicating whether agent-specific cameras should be enabled. If set to `True`, each agent may have its own camera view within the environment. Note that actually using those cameras incurs significant overhead and decreases performance noticeably.
- `sensorResolution`: A tuple giving the resolution at which camera data is rendered. Rendering always uses three color channels.
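This document does not spell out the signature of reward and done functions, so the sketch below is purely illustrative: it assumes each function receives the environment instance and an agent name, with a reward function returning a float and a done function returning a boolean.

```python
# Hypothetical signatures -- the exact interface is not specified in this document.
def reward1(mujoco_gym, agent):
    # e.g. reward the agent for making progress toward a goal
    return 0.0

def done1(mujoco_gym, agent):
    # e.g. terminate the episode once the agent reaches its goal
    return False
```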
These arguments collectively define various aspects of the reinforcement learning multi-agent environment, such as agent configuration, simulation parameters, rendering options, reward and termination conditions, and additional environment dynamics. By configuring these arguments appropriately, you can create different environments suited for different learning tasks and scenarios.
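Putting the `environmentDynamics` requirements together, a minimal dynamic class could look like the following sketch. The class name and the concrete space bounds are made up for illustration; the interface (`__init__(self, mujoco_gym)`, `dynamic(self, agent, actions)`, and the space dictionaries) follows the description above.

```python
import numpy as np

class ExampleDynamic:  # hypothetical name
    def __init__(self, mujoco_gym):
        self.mujoco_gym = mujoco_gym
        # Every dynamic must declare its spaces as {"low": [...], "high": [...]}
        # dictionaries in its constructor; the bounds here are placeholders.
        self.observation_space = {"low": [-1.0], "high": [1.0]}
        self.action_space = {"low": [-1.0], "high": [1.0]}

    def dynamic(self, agent, actions):
        # Must return a reward and a NumPy array of observations, in that order.
        reward = 0.0
        observations = np.array([0.0])
        return reward, observations
```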
Below is an example of a config dict in Python in which all the values are set:
```python
from MuJoCo_Gym.mujoco_rl import MuJoCo_RL

configDict = {
    "agents": ["Agent1", "Agent2"],          # List of agents (default: [])
    "xmlPath": "/path/to/xml/file.xml",      # XML file path
    "infoJson": "/path/to/info.json",        # Info JSON file path (default: "")
    "renderMode": True,                      # Render mode (default: False)
    "exportPath": "/path/to/export",         # Export path
    "freeJoint": True,                       # Free joint option (default: False)
    "skipFrames": 2,                         # Number of frames to skip (default: 1)
    "maxSteps": 2048,                        # Maximum number of steps (default: 1024)
    "rewardFunctions": [reward1, reward2],   # List of reward functions (default: [])
    "doneFunctions": [done1, done2],         # List of done functions (default: [])
    "environmentDynamics": [dyn1, dyn2],     # List of environment dynamics (default: [])
    "agentCameras": True,                    # Agent cameras option (default: False)
    "sensorResolution": (64, 64),            # Camera rendering resolution (default: (64, 64))
}
environment = MuJoCo_RL(configDict)
```
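The multi-level behavior described above would be configured along these lines (the paths are placeholders):

```python
# Multiple levels: at each reset(), one level is selected at random.
configDict["xmlPath"] = ["/path/to/level1.xml", "/path/to/level2.xml"]

# Either one matching JSON per level (file names must match the XML levels)...
configDict["infoJson"] = ["/path/to/level1.json", "/path/to/level2.json"]
# ...or a single JSON file that is then used for all levels:
# configDict["infoJson"] = "/path/to/shared.json"
```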
Explanation of the default values:
- `agents`: The default value is an empty list `[]`.
- `xmlPath`: There is no default value specified.
- `infoJson`: The default value is an empty string `""`.
- `renderMode`: The default value is `False`.
- `exportPath`: There is no default value specified.
- `freeJoint`: The default value is `False`.
- `skipFrames`: The default value is `1`.
- `maxSteps`: The default value is `1024`.
- `rewardFunctions`: The default value is an empty list `[]`.
- `doneFunctions`: The default value is an empty list `[]`.
- `environmentDynamics`: The default value is an empty list `[]`.
- `agentCameras`: The default value is `False`.
- `sensorResolution`: The default value is `(64, 64)`.
You can customize these values in the `configDict` according to your specific requirements.
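For instance, a configuration that relies on the defaults could be as small as the following sketch. The agent name and path are placeholders, and it is assumed that omitted keys fall back to the defaults listed above:

```python
from MuJoCo_Gym.mujoco_rl import MuJoCo_RL

# Minimal sketch: omitted keys are assumed to fall back to the defaults above.
# "xmlPath" has no default and must be supplied; the path is a placeholder.
minimalConfig = {
    "agents": ["Agent1"],
    "xmlPath": "/path/to/xml/file.xml",
}
environment = MuJoCo_RL(minimalConfig)
```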
Developed by Microcosm.AI