Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stable-baselines3 2.0 examples #176

Merged
merged 21 commits into from
Nov 26, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ examples/debug.py
examples/test.py

# NumPy saves and videos
files/logs/*.npy
files/videos/*.mp4
results/
tmp/

Expand Down
32 changes: 21 additions & 11 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ This is a minimalist refactoring of the original `gym-pybullet-drones` repositor

> **NOTE**: if you prefer to access the original codebase, presented at IROS in 2021, please `git checkout [paper|master]` after cloning the repo, and refer to the corresponding `README.md`'s.

<img src="files/readme_images/helix.gif" alt="formation flight" width="350"> <img src="files/readme_images/helix.png" alt="control info" width="450">
<img src="gym_pybullet_drones/assets/helix.gif" alt="formation flight" width="350"> <img src="gym_pybullet_drones/assets/helix.png" alt="control info" width="450">

## Installation

Expand All @@ -24,18 +24,27 @@ pip3 install -e . # if needed, `sudo apt install build-essential` to install `gc

## Use

### PID position control example
### PID control examples

```sh
cd gym_pybullet_drones/examples/
python3 pid.py
python3 pid.py # position and velocity reference
python3 pid_velocity.py # desired velocity reference
```

### Stable-baselines3 PPO RL example
### Downwash effect example

```sh
cd gym_pybullet_drones/examples/
python3 learn.py
python3 downwash.py
```

### Stable-baselines3 PPO RL 3'-training examples

```sh
cd gym_pybullet_drones/examples/
python learn.py # task: single drone hover at z == 1
python learn.py --multiagent true # task: 2-drone hover at z == 1.2 and 0.7
```

### Betaflight SITL example (Ubuntu only)
Expand Down Expand Up @@ -96,13 +105,14 @@ If you wish, please cite our [IROS 2021 paper](https://arxiv.org/abs/2103.02142)
- C. Karen Liu and Dan Negrut (2020) [*The Role of Physics-Based Simulators in Robotics*](https://www.annualreviews.org/doi/pdf/10.1146/annurev-control-072220-093055)
- Yunlong Song, Selim Naji, Elia Kaufmann, Antonio Loquercio, and Davide Scaramuzza (2020) [*Flightmare: A Flexible Quadrotor Simulator*](https://arxiv.org/pdf/2009.00563.pdf)

## TODO
## Core Team WIP

- [ ] Multi-drone `crazyflie-firmware` SITL support (@spencerteetaert, @JacopoPan)

## Desired Contributions/PRs

- [ ] Add `crazyflie-firmware` SITL support @spencerteetaert
- [ ] Add motor delay @JacopoPan / @spencerteetaert
- [ ] Replace `rpy` with quaternions (and `ang_vel` with body rates) in `obs` @JacopoPan
- [ ] Replace `BaseSingleAgentAviary` and `BaseMultiAgentAviary` with a single `RLAviary`, incl. PR #161 @JacopoPan
- [ ] Add a multi-agent MDP with 2-drone chase through a gate @JacopoPan
- [ ] Add motor delay by implementing a buffer in `BaseAviary._dynamics()`
- [ ] Replace `rpy` with quaternions (and `ang_vel` with body rates) by editing `BaseAviary._updateAndStoreKinematicInformation()`, `BaseAviary._getDroneStateVector()`, and the `.computeObs()` methods of relevant subclasses

-----
> University of Toronto's [Dynamic Systems Lab](https://github.com/utiasDSL) / [Vector Institute](https://github.com/VectorInstitute) / University of Cambridge's [Prorok Lab](https://github.com/proroklab)
Empty file removed files/logs/foo.txt
Empty file.
Binary file removed files/readme_images/2020.gif
Binary file not shown.
Binary file removed files/readme_images/dep.gif
Binary file not shown.
Binary file removed files/readme_images/downwash.gif
Binary file not shown.
Binary file removed files/readme_images/downwash.png
Binary file not shown.
Binary file removed files/readme_images/rgb.gif
Binary file not shown.
Binary file removed files/readme_images/seg.gif
Binary file not shown.
16 changes: 3 additions & 13 deletions gym_pybullet_drones/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,20 +12,10 @@

register(
id='hover-aviary-v0',
entry_point='gym_pybullet_drones.envs.single_agent_rl:HoverAviary',
entry_point='gym_pybullet_drones.envs:HoverAviary',
)

register(
id='flythrugate-aviary-v0',
entry_point='gym_pybullet_drones.envs.single_agent_rl:FlyThruGateAviary',
)

register(
id='flock-aviary-v0',
entry_point='gym_pybullet_drones.envs.multi_agent_rl:FlockAviary',
)

register(
id='leaderfollower-aviary-v0',
entry_point='gym_pybullet_drones.envs.multi_agent_rl:LeaderFollowerAviary',
id='multihover-aviary-v0',
entry_point='gym_pybullet_drones.envs:MultiHoverAviary',
)
6 changes: 2 additions & 4 deletions gym_pybullet_drones/assets/clone_bfs.sh
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ cd ../../
mkdir betaflight_sitl/
cd betaflight_sitl/

# Step 1: Clone and open betaflight's source:
# Step 1: Clone and open betaflight's source (at the time of writing, branch `master`, future release 4.5)):
git clone https://github.com/betaflight/betaflight temp/


Expand All @@ -27,8 +27,6 @@ git clone https://github.com/betaflight/betaflight temp/
# from Betaflight's `SIMULATOR_BUILD`
cd temp/
sed -i "s/delayMicroseconds_real(50);/\/\/delayMicroseconds_real(50);/g" ./src/main/main.c
sed -i "s/ret = udpInit(\&stateLink, NULL, 9003, true);/\/\/ret = udpInit(\&stateLink, NULL, PORT_STATE, true);/g" ./src/main/target/SITL/sitl.c
sed -i "s/printf(\"start UDP server.../\/\/printf(\"start UDP server.../g" ./src/main/target/SITL/sitl.c

# Prepare
make arm_sdk_install
Expand All @@ -47,7 +45,7 @@ for ((i = 0; i < desired_max_num_drones; i++)); do
cp -r temp/ "bf${i}/"
cd "bf${i}/"

# Step 3: Change the UDP ports used by each Betaflight SITL instancet
# Step 3: Change the UDP ports used by each Betaflight SITL instance
replacement1="PORT_PWM_RAW 90${i}1"
sed -i "s/$pattern1/$replacement1/g" ./src/main/target/SITL/sitl.c
replacement2="PORT_PWM 90${i}2"
Expand Down
File renamed without changes
File renamed without changes
21 changes: 1 addition & 20 deletions gym_pybullet_drones/envs/BaseAviary.py
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obstacles : bool, optional
Whether to add obstacles to the simulation.
user_debug_gui : bool, optional
Expand Down Expand Up @@ -338,7 +338,6 @@ def step(self,
) for i in range(self.NUM_DRONES)]
#### Save, preprocess, and clip the action to the max. RPM #
else:
self._saveLastAction(action)
JacopoPan marked this conversation as resolved.
Show resolved Hide resolved
clipped_action = np.reshape(self._preprocessAction(action), (self.NUM_DRONES, 4))
#### Repeat for as many as the aggregate physics steps #####
for _ in range(self.PYB_STEPS_PER_CTRL):
Expand Down Expand Up @@ -466,7 +465,6 @@ def _housekeeping(self):
self.GUI_INPUT_TEXT = -1*np.ones(self.NUM_DRONES)
self.USE_GUI_RPM=False
self.last_input_switch = 0
self.last_action = -1*np.ones((self.NUM_DRONES, 4))
self.last_clipped_action = np.zeros((self.NUM_DRONES, 4))
self.gui_input = np.zeros(4)
#### Initialize the drones kinemaatic information ##########
Expand Down Expand Up @@ -916,23 +914,6 @@ def _normalizedActionToRPM(self,

################################################################################

def _saveLastAction(self,
action
):
"""Stores the most recent action into attribute `self.last_action`.

The last action can be used to compute aerodynamic effects.

Parameters
----------
action : ndarray
Ndarray containing the current RPMs input for each drone.

"""
self.last_action = np.reshape(action, (self.NUM_DRONES, 4))

################################################################################

def _showDroneLocalAxes(self,
nth_drone
):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,20 @@
import numpy as np
import pybullet as p
from gymnasium import spaces
from collections import deque

from gym_pybullet_drones.envs.BaseAviary import BaseAviary
from gym_pybullet_drones.utils.enums import DroneModel, Physics, ActionType, ObservationType, ImageType
from gym_pybullet_drones.control.DSLPIDControl import DSLPIDControl

class BaseMultiagentAviary(BaseAviary):
"""Base multi-agent environment class for reinforcement learning."""
class BaseRLAviary(BaseAviary):
"""Base single and multi-agent environment class for reinforcement learning."""

################################################################################

def __init__(self,
drone_model: DroneModel=DroneModel.CF2X,
num_drones: int=2,
num_drones: int=1,
neighbourhood_radius: float=np.inf,
initial_xyzs=None,
initial_rpys=None,
Expand All @@ -26,7 +27,7 @@ def __init__(self,
obs: ObservationType=ObservationType.KIN,
act: ActionType=ActionType.RPM
):
"""Initialization of a generic multi-agent RL environment.
"""Initialization of a generic single and multi-agent RL environment.

Attributes `vision_attributes` and `dynamics_attributes` are selected
based on the choice of `obs` and `act`; `obstacles` is set to True
Expand Down Expand Up @@ -54,27 +55,27 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obs : ObservationType, optional
The type of observation space (kinematic information or vision)
act : ActionType, optional
The type of action space (1 or 3D; RPMS, thurst and torques, waypoint or velocity with PID control; etc.)

"""
if num_drones < 2:
print("[ERROR] in BaseMultiagentAviary.__init__(), num_drones should be >= 2")
exit()
#### Create a buffer for the last 10 actions ###############
self.ACTION_BUFFER_SIZE = 10
self.action_buffer = deque(maxlen=self.ACTION_BUFFER_SIZE)
####
vision_attributes = True if obs == ObservationType.RGB else False
self.OBS_TYPE = obs
self.ACT_TYPE = act
self.EPISODE_LEN_SEC = 5
#### Create integrated controllers #########################
if act in [ActionType.PID, ActionType.VEL, ActionType.ONE_D_PID]:
os.environ['KMP_DUPLICATE_LIB_OK']='True'
if drone_model in [DroneModel.CF2X, DroneModel.CF2P]:
self.ctrl = [DSLPIDControl(drone_model=DroneModel.CF2X) for i in range(num_drones)]
else:
print("[ERROR] in BaseMultiagentAviary.__init()__, no controller is available for the specified drone_model")
print("[ERROR] in BaseRLAviary.__init()__, no controller is available for the specified drone_model")
super().__init__(drone_model=drone_model,
num_drones=num_drones,
neighbourhood_radius=neighbourhood_radius,
Expand Down Expand Up @@ -144,10 +145,14 @@ def _actionSpace(self):
elif self.ACT_TYPE in [ActionType.ONE_D_RPM, ActionType.ONE_D_PID]:
size = 1
else:
print("[ERROR] in BaseMultiagentAviary._actionSpace()")
print("[ERROR] in BaseRLAviary._actionSpace()")
exit()
act_lower_bound = np.array([-1*np.ones(size) for i in range(self.NUM_DRONES)])
act_upper_bound = np.array([+1*np.ones(size) for i in range(self.NUM_DRONES)])
#
for i in range(self.ACTION_BUFFER_SIZE):
self.action_buffer.append(np.zeros((self.NUM_DRONES,size)))
#
return spaces.Box(low=act_lower_bound, high=act_upper_bound, dtype=np.float32)

################################################################################
Expand Down Expand Up @@ -179,6 +184,7 @@ def _preprocessAction(self,
commanded to the 4 motors of each drone.

"""
self.action_buffer.append(action)
rpm = np.zeros((self.NUM_DRONES,4))
for k in range(action.shape[0]):
target = action[k, :]
Expand Down Expand Up @@ -219,16 +225,16 @@ def _preprocessAction(self,
rpm[k,:] = np.repeat(self.HOVER_RPM * (1+0.05*target), 4)
elif self.ACT_TYPE == ActionType.ONE_D_PID:
state = self._getDroneStateVector(k)
rpm, _, _ = self.ctrl[k].computeControl(control_timestep=self.CTRL_TIMESTEP,
res, _, _ = self.ctrl[k].computeControl(control_timestep=self.CTRL_TIMESTEP,
cur_pos=state[0:3],
cur_quat=state[3:7],
cur_vel=state[10:13],
cur_ang_vel=state[13:16],
target_pos=state[0:3]+0.1*np.array([0,0,target[0]])
)
rpm[k,:] = rpm
rpm[k,:] = res
else:
print("[ERROR] in BaseMultiagentAviary._preprocessAction()")
print("[ERROR] in BaseRLAviary._preprocessAction()")
exit()
return rpm

Expand All @@ -244,28 +250,34 @@ def _observationSpace(self):

"""
if self.OBS_TYPE == ObservationType.RGB:
# return spaces.Dict({i: spaces.Box(low=0,
# high=255,
# shape=(self.IMG_RES[1], self.IMG_RES[0], 4), dtype=np.uint8
# ) for i in range(self.NUM_DRONES)})
return spaces.Box(low=0,
high=255,
shape=(self.NUM_DRONES, self.IMG_RES[1], self.IMG_RES[0], 4), dtype=np.uint8)
elif self.OBS_TYPE == ObservationType.KIN:
############################################################
#### OBS OF SIZE 20 (WITH QUATERNION AND RPMS)
#### Observation vector ### X Y Z Q1 Q2 Q3 Q4 R P Y VX VY VZ WX WY WZ P0 P1 P2 P3
# obs_lower_bound = np.array([-1, -1, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1])
# obs_upper_bound = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
# return spaces.Box( low=obs_lower_bound, high=obs_upper_bound, dtype=np.float32 )
############################################################
#### OBS SPACE OF SIZE 12
obs_lower_bound = np.array([[-1,-1,0, -1,-1,-1, -1,-1,-1, -1,-1,-1] for i in range(self.NUM_DRONES)])
obs_upper_bound = np.array([[1,1,1, 1,1,1, 1,1,1, 1,1,1] for i in range(self.NUM_DRONES)])
#### Observation vector ### X Y Z Q1 Q2 Q3 Q4 R P Y VX VY VZ WX WY WZ
lo = -np.inf
hi = np.inf
obs_lower_bound = np.array([[lo,lo,0, lo,lo,lo,lo,lo,lo,lo,lo,lo] for i in range(self.NUM_DRONES)])
obs_upper_bound = np.array([[hi,hi,hi,hi,hi,hi,hi,hi,hi,hi,hi,hi] for i in range(self.NUM_DRONES)])
#### Add action buffer to observation space ################
act_lo = -1
act_hi = +1
for i in range(self.ACTION_BUFFER_SIZE):
if self.ACT_TYPE in [ActionType.RPM, ActionType.VEL]:
obs_lower_bound = np.hstack([obs_lower_bound, np.array([[act_lo,act_lo,act_lo,act_lo] for i in range(self.NUM_DRONES)])])
obs_upper_bound = np.hstack([obs_upper_bound, np.array([[act_hi,act_hi,act_hi,act_hi] for i in range(self.NUM_DRONES)])])
elif self.ACT_TYPE==ActionType.PID:
obs_lower_bound = np.hstack([obs_lower_bound, np.array([[act_lo,act_lo,act_lo] for i in range(self.NUM_DRONES)])])
obs_upper_bound = np.hstack([obs_upper_bound, np.array([[act_hi,act_hi,act_hi] for i in range(self.NUM_DRONES)])])
elif self.ACT_TYPE in [ActionType.ONE_D_RPM, ActionType.ONE_D_PID]:
obs_lower_bound = np.hstack([obs_lower_bound, np.array([[act_lo] for i in range(self.NUM_DRONES)])])
obs_upper_bound = np.hstack([obs_upper_bound, np.array([[act_hi] for i in range(self.NUM_DRONES)])])
return spaces.Box(low=obs_lower_bound, high=obs_upper_bound, dtype=np.float32)
############################################################
else:
print("[ERROR] in BaseMultiagentAviary._observationSpace()")
print("[ERROR] in BaseRLAviary._observationSpace()")

################################################################################

Expand All @@ -291,35 +303,20 @@ def _computeObs(self):
path=self.ONBOARD_IMG_PATH+"drone_"+str(i),
frame_num=int(self.step_counter/self.IMG_CAPTURE_FREQ)
)
return np.array([self.rgb[i] for i in range(self.NUM_DRONES)])
return np.array([self.rgb[i] for i in range(self.NUM_DRONES)]).astype('float32')
elif self.OBS_TYPE == ObservationType.KIN:
############################################################
#### OBS OF SIZE 20 (WITH QUATERNION AND RPMS)
# return { i : self._clipAndNormalizeState(self._getDroneStateVector(i)) for i in range(self.NUM_DRONES) }
############################################################
#### OBS SPACE OF SIZE 12
obs_12 = np.zeros((self.NUM_DRONES,12))
for i in range(self.NUM_DRONES):
obs = self._clipAndNormalizeState(self._getDroneStateVector(i))
#obs = self._clipAndNormalizeState(self._getDroneStateVector(i))
obs = self._getDroneStateVector(i)
obs_12[i, :] = np.hstack([obs[0:3], obs[7:10], obs[10:13], obs[13:16]]).reshape(12,)
return np.array([obs_12[i, :] for i in range(self.NUM_DRONES)])
ret = np.array([obs_12[i, :] for i in range(self.NUM_DRONES)]).astype('float32')
#### Add action buffer to observation #######################
for i in range(self.ACTION_BUFFER_SIZE):
ret = np.hstack([ret, np.array([self.action_buffer[i][j, :] for j in range(self.NUM_DRONES)])])
return ret
############################################################
else:
print("[ERROR] in BaseMultiagentAviary._computeObs()")

################################################################################

def _clipAndNormalizeState(self,
state
):
"""Normalizes a drone's state to the [-1,1] range.

Must be implemented in a subclass.

Parameters
----------
state : ndarray
Array containing the non-normalized state of a single drone.

"""
raise NotImplementedError
print("[ERROR] in BaseRLAviary._computeObs()")
2 changes: 1 addition & 1 deletion gym_pybullet_drones/envs/BetaAviary.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obstacles : bool, optional
Whether to add obstacles to the simulation.
user_debug_gui : bool, optional
Expand Down
2 changes: 1 addition & 1 deletion gym_pybullet_drones/envs/CtrlAviary.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obstacles : bool, optional
Whether to add obstacles to the simulation.
user_debug_gui : bool, optional
Expand Down
Loading