Skip to content

Commit

Permalink
Merge pull request #176 from utiasDSL/dev-sb3
Browse files Browse the repository at this point in the history
Stable-baselines3 2.0 examples
  • Loading branch information
JacopoPan authored Nov 26, 2023
2 parents c91605b + b155e60 commit b1bac77
Show file tree
Hide file tree
Showing 36 changed files with 498 additions and 1,681 deletions.
2 changes: 0 additions & 2 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ examples/debug.py
examples/test.py

# NumPy saves and videos
files/logs/*.npy
files/videos/*.mp4
results/
tmp/

Expand Down
32 changes: 21 additions & 11 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ This is a minimalist refactoring of the original `gym-pybullet-drones` repositor

> **NOTE**: if you prefer to access the original codebase, presented at IROS in 2021, please `git checkout [paper|master]` after cloning the repo, and refer to the corresponding `README.md`'s.
<img src="files/readme_images/helix.gif" alt="formation flight" width="350"> <img src="files/readme_images/helix.png" alt="control info" width="450">
<img src="gym_pybullet_drones/assets/helix.gif" alt="formation flight" width="350"> <img src="gym_pybullet_drones/assets/helix.png" alt="control info" width="450">

## Installation

Expand All @@ -24,18 +24,27 @@ pip3 install -e . # if needed, `sudo apt install build-essential` to install `gc

## Use

### PID position control example
### PID control examples

```sh
cd gym_pybullet_drones/examples/
python3 pid.py
python3 pid.py # position and velocity reference
python3 pid_velocity.py # desired velocity reference
```

### Stable-baselines3 PPO RL example
### Downwash effect example

```sh
cd gym_pybullet_drones/examples/
python3 learn.py
python3 downwash.py
```

### Stable-baselines3 PPO RL 3'-training examples

```sh
cd gym_pybullet_drones/examples/
python learn.py # task: single drone hover at z == 1
python learn.py --multiagent true # task: 2-drone hover at z == 1.2 and 0.7
```

### Betaflight SITL example (Ubuntu only)
Expand Down Expand Up @@ -96,13 +105,14 @@ If you wish, please cite our [IROS 2021 paper](https://arxiv.org/abs/2103.02142)
- C. Karen Liu and Dan Negrut (2020) [*The Role of Physics-Based Simulators in Robotics*](https://www.annualreviews.org/doi/pdf/10.1146/annurev-control-072220-093055)
- Yunlong Song, Selim Naji, Elia Kaufmann, Antonio Loquercio, and Davide Scaramuzza (2020) [*Flightmare: A Flexible Quadrotor Simulator*](https://arxiv.org/pdf/2009.00563.pdf)

## TODO
## Core Team WIP

- [ ] Multi-drone `crazyflie-firmware` SITL support (@spencerteetaert, @JacopoPan)

## Desired Contributions/PRs

- [ ] Add `crazyflie-firmware` SITL support @spencerteetaert
- [ ] Add motor delay @JacopoPan / @spencerteetaert
- [ ] Replace `rpy` with quaternions (and `ang_vel` with body rates) in `obs` @JacopoPan
- [ ] Replace `BaseSingleAgentAviary` and `BaseMultiAgentAviary` with a single `RLAviary`, incl. PR #161 @JacopoPan
- [ ] Add a multi-agent MDP with 2-drone chase through a gate @JacopoPan
- [ ] Add motor delay by implementing a buffer in `BaseAviary._dynamics()`
- [ ] Replace `rpy` with quaternions (and `ang_vel` with body rates) by editing `BaseAviary._updateAndStoreKinematicInformation()`, `BaseAviary._getDroneStateVector()`, and the `.computeObs()` methods of relevant subclasses

-----
> University of Toronto's [Dynamic Systems Lab](https://github.com/utiasDSL) / [Vector Institute](https://github.com/VectorInstitute) / University of Cambridge's [Prorok Lab](https://github.com/proroklab)
Empty file removed files/logs/foo.txt
Empty file.
Binary file removed files/readme_images/2020.gif
Binary file not shown.
Binary file removed files/readme_images/dep.gif
Binary file not shown.
Binary file removed files/readme_images/downwash.gif
Binary file not shown.
Binary file removed files/readme_images/downwash.png
Binary file not shown.
Binary file removed files/readme_images/rgb.gif
Binary file not shown.
Binary file removed files/readme_images/seg.gif
Binary file not shown.
16 changes: 3 additions & 13 deletions gym_pybullet_drones/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,20 +12,10 @@

register(
id='hover-aviary-v0',
entry_point='gym_pybullet_drones.envs.single_agent_rl:HoverAviary',
entry_point='gym_pybullet_drones.envs:HoverAviary',
)

register(
id='flythrugate-aviary-v0',
entry_point='gym_pybullet_drones.envs.single_agent_rl:FlyThruGateAviary',
)

register(
id='flock-aviary-v0',
entry_point='gym_pybullet_drones.envs.multi_agent_rl:FlockAviary',
)

register(
id='leaderfollower-aviary-v0',
entry_point='gym_pybullet_drones.envs.multi_agent_rl:LeaderFollowerAviary',
id='multihover-aviary-v0',
entry_point='gym_pybullet_drones.envs:MultiHoverAviary',
)
File renamed without changes.
File renamed without changes.
6 changes: 2 additions & 4 deletions gym_pybullet_drones/assets/clone_bfs.sh
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ cd ../../
mkdir betaflight_sitl/
cd betaflight_sitl/

# Step 1: Clone and open betaflight's source:
# Step 1: Clone and open betaflight's source (at the time of writing, branch `master`, future release 4.5)):
git clone https://github.com/betaflight/betaflight temp/


Expand All @@ -27,8 +27,6 @@ git clone https://github.com/betaflight/betaflight temp/
# from Betaflight's `SIMULATOR_BUILD`
cd temp/
sed -i "s/delayMicroseconds_real(50);/\/\/delayMicroseconds_real(50);/g" ./src/main/main.c
sed -i "s/ret = udpInit(\&stateLink, NULL, 9003, true);/\/\/ret = udpInit(\&stateLink, NULL, PORT_STATE, true);/g" ./src/main/target/SITL/sitl.c
sed -i "s/printf(\"start UDP server.../\/\/printf(\"start UDP server.../g" ./src/main/target/SITL/sitl.c

# Prepare
make arm_sdk_install
Expand All @@ -47,7 +45,7 @@ for ((i = 0; i < desired_max_num_drones; i++)); do
cp -r temp/ "bf${i}/"
cd "bf${i}/"

# Step 3: Change the UDP ports used by each Betaflight SITL instancet
# Step 3: Change the UDP ports used by each Betaflight SITL instance
replacement1="PORT_PWM_RAW 90${i}1"
sed -i "s/$pattern1/$replacement1/g" ./src/main/target/SITL/sitl.c
replacement2="PORT_PWM 90${i}2"
Expand Down
File renamed without changes.
File renamed without changes
File renamed without changes
21 changes: 1 addition & 20 deletions gym_pybullet_drones/envs/BaseAviary.py
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obstacles : bool, optional
Whether to add obstacles to the simulation.
user_debug_gui : bool, optional
Expand Down Expand Up @@ -338,7 +338,6 @@ def step(self,
) for i in range(self.NUM_DRONES)]
#### Save, preprocess, and clip the action to the max. RPM #
else:
self._saveLastAction(action)
clipped_action = np.reshape(self._preprocessAction(action), (self.NUM_DRONES, 4))
#### Repeat for as many as the aggregate physics steps #####
for _ in range(self.PYB_STEPS_PER_CTRL):
Expand Down Expand Up @@ -466,7 +465,6 @@ def _housekeeping(self):
self.GUI_INPUT_TEXT = -1*np.ones(self.NUM_DRONES)
self.USE_GUI_RPM=False
self.last_input_switch = 0
self.last_action = -1*np.ones((self.NUM_DRONES, 4))
self.last_clipped_action = np.zeros((self.NUM_DRONES, 4))
self.gui_input = np.zeros(4)
#### Initialize the drones kinemaatic information ##########
Expand Down Expand Up @@ -916,23 +914,6 @@ def _normalizedActionToRPM(self,

################################################################################

def _saveLastAction(self,
action
):
"""Stores the most recent action into attribute `self.last_action`.
The last action can be used to compute aerodynamic effects.
Parameters
----------
action : ndarray
Ndarray containing the current RPMs input for each drone.
"""
self.last_action = np.reshape(action, (self.NUM_DRONES, 4))

################################################################################

def _showDroneLocalAxes(self,
nth_drone
):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,20 @@
import numpy as np
import pybullet as p
from gymnasium import spaces
from collections import deque

from gym_pybullet_drones.envs.BaseAviary import BaseAviary
from gym_pybullet_drones.utils.enums import DroneModel, Physics, ActionType, ObservationType, ImageType
from gym_pybullet_drones.control.DSLPIDControl import DSLPIDControl

class BaseMultiagentAviary(BaseAviary):
"""Base multi-agent environment class for reinforcement learning."""
class BaseRLAviary(BaseAviary):
"""Base single and multi-agent environment class for reinforcement learning."""

################################################################################

def __init__(self,
drone_model: DroneModel=DroneModel.CF2X,
num_drones: int=2,
num_drones: int=1,
neighbourhood_radius: float=np.inf,
initial_xyzs=None,
initial_rpys=None,
Expand All @@ -26,7 +27,7 @@ def __init__(self,
obs: ObservationType=ObservationType.KIN,
act: ActionType=ActionType.RPM
):
"""Initialization of a generic multi-agent RL environment.
"""Initialization of a generic single and multi-agent RL environment.
Attributes `vision_attributes` and `dynamics_attributes` are selected
based on the choice of `obs` and `act`; `obstacles` is set to True
Expand Down Expand Up @@ -54,27 +55,27 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obs : ObservationType, optional
The type of observation space (kinematic information or vision)
act : ActionType, optional
The type of action space (1 or 3D; RPMS, thurst and torques, waypoint or velocity with PID control; etc.)
"""
if num_drones < 2:
print("[ERROR] in BaseMultiagentAviary.__init__(), num_drones should be >= 2")
exit()
#### Create a buffer for the last 10 actions ###############
self.ACTION_BUFFER_SIZE = 10
self.action_buffer = deque(maxlen=self.ACTION_BUFFER_SIZE)
####
vision_attributes = True if obs == ObservationType.RGB else False
self.OBS_TYPE = obs
self.ACT_TYPE = act
self.EPISODE_LEN_SEC = 5
#### Create integrated controllers #########################
if act in [ActionType.PID, ActionType.VEL, ActionType.ONE_D_PID]:
os.environ['KMP_DUPLICATE_LIB_OK']='True'
if drone_model in [DroneModel.CF2X, DroneModel.CF2P]:
self.ctrl = [DSLPIDControl(drone_model=DroneModel.CF2X) for i in range(num_drones)]
else:
print("[ERROR] in BaseMultiagentAviary.__init()__, no controller is available for the specified drone_model")
print("[ERROR] in BaseRLAviary.__init()__, no controller is available for the specified drone_model")
super().__init__(drone_model=drone_model,
num_drones=num_drones,
neighbourhood_radius=neighbourhood_radius,
Expand Down Expand Up @@ -144,10 +145,14 @@ def _actionSpace(self):
elif self.ACT_TYPE in [ActionType.ONE_D_RPM, ActionType.ONE_D_PID]:
size = 1
else:
print("[ERROR] in BaseMultiagentAviary._actionSpace()")
print("[ERROR] in BaseRLAviary._actionSpace()")
exit()
act_lower_bound = np.array([-1*np.ones(size) for i in range(self.NUM_DRONES)])
act_upper_bound = np.array([+1*np.ones(size) for i in range(self.NUM_DRONES)])
#
for i in range(self.ACTION_BUFFER_SIZE):
self.action_buffer.append(np.zeros((self.NUM_DRONES,size)))
#
return spaces.Box(low=act_lower_bound, high=act_upper_bound, dtype=np.float32)

################################################################################
Expand Down Expand Up @@ -179,6 +184,7 @@ def _preprocessAction(self,
commanded to the 4 motors of each drone.
"""
self.action_buffer.append(action)
rpm = np.zeros((self.NUM_DRONES,4))
for k in range(action.shape[0]):
target = action[k, :]
Expand Down Expand Up @@ -219,16 +225,16 @@ def _preprocessAction(self,
rpm[k,:] = np.repeat(self.HOVER_RPM * (1+0.05*target), 4)
elif self.ACT_TYPE == ActionType.ONE_D_PID:
state = self._getDroneStateVector(k)
rpm, _, _ = self.ctrl[k].computeControl(control_timestep=self.CTRL_TIMESTEP,
res, _, _ = self.ctrl[k].computeControl(control_timestep=self.CTRL_TIMESTEP,
cur_pos=state[0:3],
cur_quat=state[3:7],
cur_vel=state[10:13],
cur_ang_vel=state[13:16],
target_pos=state[0:3]+0.1*np.array([0,0,target[0]])
)
rpm[k,:] = rpm
rpm[k,:] = res
else:
print("[ERROR] in BaseMultiagentAviary._preprocessAction()")
print("[ERROR] in BaseRLAviary._preprocessAction()")
exit()
return rpm

Expand All @@ -244,28 +250,34 @@ def _observationSpace(self):
"""
if self.OBS_TYPE == ObservationType.RGB:
# return spaces.Dict({i: spaces.Box(low=0,
# high=255,
# shape=(self.IMG_RES[1], self.IMG_RES[0], 4), dtype=np.uint8
# ) for i in range(self.NUM_DRONES)})
return spaces.Box(low=0,
high=255,
shape=(self.NUM_DRONES, self.IMG_RES[1], self.IMG_RES[0], 4), dtype=np.uint8)
elif self.OBS_TYPE == ObservationType.KIN:
############################################################
#### OBS OF SIZE 20 (WITH QUATERNION AND RPMS)
#### Observation vector ### X Y Z Q1 Q2 Q3 Q4 R P Y VX VY VZ WX WY WZ P0 P1 P2 P3
# obs_lower_bound = np.array([-1, -1, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1])
# obs_upper_bound = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
# return spaces.Box( low=obs_lower_bound, high=obs_upper_bound, dtype=np.float32 )
############################################################
#### OBS SPACE OF SIZE 12
obs_lower_bound = np.array([[-1,-1,0, -1,-1,-1, -1,-1,-1, -1,-1,-1] for i in range(self.NUM_DRONES)])
obs_upper_bound = np.array([[1,1,1, 1,1,1, 1,1,1, 1,1,1] for i in range(self.NUM_DRONES)])
#### Observation vector ### X Y Z Q1 Q2 Q3 Q4 R P Y VX VY VZ WX WY WZ
lo = -np.inf
hi = np.inf
obs_lower_bound = np.array([[lo,lo,0, lo,lo,lo,lo,lo,lo,lo,lo,lo] for i in range(self.NUM_DRONES)])
obs_upper_bound = np.array([[hi,hi,hi,hi,hi,hi,hi,hi,hi,hi,hi,hi] for i in range(self.NUM_DRONES)])
#### Add action buffer to observation space ################
act_lo = -1
act_hi = +1
for i in range(self.ACTION_BUFFER_SIZE):
if self.ACT_TYPE in [ActionType.RPM, ActionType.VEL]:
obs_lower_bound = np.hstack([obs_lower_bound, np.array([[act_lo,act_lo,act_lo,act_lo] for i in range(self.NUM_DRONES)])])
obs_upper_bound = np.hstack([obs_upper_bound, np.array([[act_hi,act_hi,act_hi,act_hi] for i in range(self.NUM_DRONES)])])
elif self.ACT_TYPE==ActionType.PID:
obs_lower_bound = np.hstack([obs_lower_bound, np.array([[act_lo,act_lo,act_lo] for i in range(self.NUM_DRONES)])])
obs_upper_bound = np.hstack([obs_upper_bound, np.array([[act_hi,act_hi,act_hi] for i in range(self.NUM_DRONES)])])
elif self.ACT_TYPE in [ActionType.ONE_D_RPM, ActionType.ONE_D_PID]:
obs_lower_bound = np.hstack([obs_lower_bound, np.array([[act_lo] for i in range(self.NUM_DRONES)])])
obs_upper_bound = np.hstack([obs_upper_bound, np.array([[act_hi] for i in range(self.NUM_DRONES)])])
return spaces.Box(low=obs_lower_bound, high=obs_upper_bound, dtype=np.float32)
############################################################
else:
print("[ERROR] in BaseMultiagentAviary._observationSpace()")
print("[ERROR] in BaseRLAviary._observationSpace()")

################################################################################

Expand All @@ -291,35 +303,20 @@ def _computeObs(self):
path=self.ONBOARD_IMG_PATH+"drone_"+str(i),
frame_num=int(self.step_counter/self.IMG_CAPTURE_FREQ)
)
return np.array([self.rgb[i] for i in range(self.NUM_DRONES)])
return np.array([self.rgb[i] for i in range(self.NUM_DRONES)]).astype('float32')
elif self.OBS_TYPE == ObservationType.KIN:
############################################################
#### OBS OF SIZE 20 (WITH QUATERNION AND RPMS)
# return { i : self._clipAndNormalizeState(self._getDroneStateVector(i)) for i in range(self.NUM_DRONES) }
############################################################
#### OBS SPACE OF SIZE 12
obs_12 = np.zeros((self.NUM_DRONES,12))
for i in range(self.NUM_DRONES):
obs = self._clipAndNormalizeState(self._getDroneStateVector(i))
#obs = self._clipAndNormalizeState(self._getDroneStateVector(i))
obs = self._getDroneStateVector(i)
obs_12[i, :] = np.hstack([obs[0:3], obs[7:10], obs[10:13], obs[13:16]]).reshape(12,)
return np.array([obs_12[i, :] for i in range(self.NUM_DRONES)])
ret = np.array([obs_12[i, :] for i in range(self.NUM_DRONES)]).astype('float32')
#### Add action buffer to observation #######################
for i in range(self.ACTION_BUFFER_SIZE):
ret = np.hstack([ret, np.array([self.action_buffer[i][j, :] for j in range(self.NUM_DRONES)])])
return ret
############################################################
else:
print("[ERROR] in BaseMultiagentAviary._computeObs()")

################################################################################

def _clipAndNormalizeState(self,
state
):
"""Normalizes a drone's state to the [-1,1] range.
Must be implemented in a subclass.
Parameters
----------
state : ndarray
Array containing the non-normalized state of a single drone.
"""
raise NotImplementedError
print("[ERROR] in BaseRLAviary._computeObs()")
2 changes: 1 addition & 1 deletion gym_pybullet_drones/envs/BetaAviary.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obstacles : bool, optional
Whether to add obstacles to the simulation.
user_debug_gui : bool, optional
Expand Down
2 changes: 1 addition & 1 deletion gym_pybullet_drones/envs/CtrlAviary.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ def __init__(self,
gui : bool, optional
Whether to use PyBullet's GUI.
record : bool, optional
Whether to save a video of the simulation in folder `files/videos/`.
Whether to save a video of the simulation.
obstacles : bool, optional
Whether to add obstacles to the simulation.
user_debug_gui : bool, optional
Expand Down
Loading

0 comments on commit b1bac77

Please sign in to comment.