Commit 707a136

committed: release
1 parent da2af7a commit 707a136

File tree: 189 files changed, +15570 -2 lines changed

.gitignore

Lines changed: 10 additions & 1 deletion
```diff
@@ -157,4 +157,13 @@ cython_debug/
 # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
+.idea/
+
+*.pt
+*.pth
+*pl
+*.patch
+*used_configs
+.allenact_last_start_time_string
+*.lock
+*wandb
```

LICENSE

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2023 Zichen "Charles" Zhang
+Copyright (c) 2023 UVD
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
```

README.md

Lines changed: 152 additions & 0 deletions
# Universal Visual Decomposer: <br>Long-Horizon Manipulation Made Easy

<div align="center">

[[Website]](https://zcczhang.github.io/UVD/)
[[arXiv]](https://arxiv.org/abs/2310.08581)
[[PDF]](https://zcczhang.github.io/UVD/assets/pdf/full_paper.pdf)
[[Installation]](#installation)
[[Usage]](#usage)
[[BibTex]](#citation)

______________________________________________________________________

</div>

# Installation
- Follow the [instructions](https://github.com/openai/mujoco-py#install-mujoco) for installing `mujoco-py`, and install the following apt packages if using Ubuntu:
```commandline
sudo apt install -y libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf
```
- Create a conda env with Python 3.9:
```commandline
conda create -n uvd python==3.9 -y && conda activate uvd
```
- Install any/all standalone visual foundation models from their repos separately *before* setting up UVD, in case of dependency conflicts, e.g.:

<details><summary>
<a href="https://github.com/facebookresearch/vip">VIP</a>
</summary>
<p>

```commandline
git clone https://github.com/facebookresearch/vip.git
cd vip && pip install -e .
python -c "from vip import load_vip; vip = load_vip()"
```

</p>
</details>

<details><summary>
<a href="https://github.com/facebookresearch/r3m">R3M</a>
</summary>
<p>

```commandline
git clone https://github.com/facebookresearch/r3m.git
cd r3m && pip install -e .
python -c "from r3m import load_r3m; r3m = load_r3m('resnet50')"
```

</p>
</details>

<details><summary>
<a href="https://github.com/penn-pal-lab/LIV">LIV (& CLIP)</a>
</summary>
<p>

```commandline
git clone https://github.com/penn-pal-lab/LIV.git
cd LIV && pip install -e . && cd liv/models/clip && pip install -e .
python -c "from liv import load_liv; liv = load_liv()"
```

</p>
</details>

<details><summary>
<a href="https://github.com/facebookresearch/eai-vc">VC1</a>
</summary>
<p>

```commandline
git clone https://github.com/facebookresearch/eai-vc.git
cd eai-vc && pip install -e vc_models
```

</p>
</details>

<details><summary>
<a href="https://github.com/facebookresearch/dinov2">DINOv2</a> and <a href="https://pytorch.org/vision/main/models/generated/torchvision.models.resnet50.html">ResNet</a> pretrained on ImageNet-1k are loaded directly via <a href="https://pytorch.org/hub/">torch hub</a> and <a href="https://pytorch.org/vision/main/models/generated/torchvision.models.resnet50.html">torchvision</a>; a loading sanity check is sketched after this list.
</summary></details>

- Under *this* UVD repo directory, install the remaining dependencies:
```commandline
pip install -e .
```
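
As a quick sanity check for the backbones loaded via torch hub and torchvision (mentioned in the list above), the following sketch may help; the specific `dinov2_vitb14` variant is an illustrative assumption, not confirmed by this repo:

```python
import torch
import torchvision

# DINOv2 via torch hub; "dinov2_vitb14" is one of the published variants,
# assumed here purely for illustration
dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")

# ResNet-50 pretrained on ImageNet-1k via torchvision
resnet50 = torchvision.models.resnet50(
    weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1
)
print(type(dinov2).__name__, type(resnet50).__name__)
```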

# Usage

We provide a simple API for decomposing RGB videos:

```python
import torch
import uvd

# (N sub-goals, *video frame shape)
subgoals = uvd.get_uvd_subgoals(
    "xxx.mp4",  # video filename or a (L, *video frame shape) video numpy array
    preprocessor_name="vip",  # Literal["vip", "r3m", "liv", "clip", "vc1", "dinov2"]
    device="cuda" if torch.cuda.is_available() else "cpu",  # device for loading the frozen preprocessor
    return_indices=False,  # True to return only the list of subgoal timesteps
)
```

or run
```commandline
python demo.py
```
to host a Gradio demo locally with different choices of visual representations.
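
For programmatic use, here is a minimal sketch of consuming the decomposition boundaries; `return_indices=True` follows the signature documented above, while the dummy video and the clip slicing are illustrative assumptions:

```python
import numpy as np
import torch
import uvd

# dummy (L, H, W, C) video standing in for a real recording
video = np.random.randint(0, 255, size=(300, 224, 224, 3), dtype=np.uint8)

# with return_indices=True, only the list of subgoal timesteps is returned
indices = uvd.get_uvd_subgoals(
    video,
    preprocessor_name="vip",
    device="cuda" if torch.cuda.is_available() else "cpu",
    return_indices=True,
)

# cut the raw video into per-subgoal clips at the returned boundaries
clips = np.split(video, indices[:-1]) if len(indices) > 1 else [video]
print([clip.shape for clip in clips])
```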

## Simulation Data

We post-process the data released by the original [Relay-Policy-Learning](https://github.com/google-research/relay-policy-learning/tree/master) repo, keeping only the successful trajectories and adapting the control and observations used in our paper:
```commandline
python datasets/data_gen.py raw_data_path=/PATH/TO/RAW_DATA
```

Also consider force-setting `Builder = LinuxCPUExtensionBuilder` to `Builder = LinuxGPUExtensionBuilder` in `PATH/TO/CONDA/envs/uvd/lib/python3.9/site-packages/mujoco_py/builder.py` to enable (multi-)GPU acceleration.
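
A one-liner sketch for that patch, assuming the `uvd` env is activated so that `$CONDA_PREFIX` resolves to `PATH/TO/CONDA/envs/uvd` (`-i.bak` keeps a backup of the original file):

```commandline
sed -i.bak 's/Builder = LinuxCPUExtensionBuilder/Builder = LinuxGPUExtensionBuilder/' \
    "$CONDA_PREFIX/lib/python3.9/site-packages/mujoco_py/builder.py"
```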

## Runtime Benchmark

Since UVD is meant to be an off-the-shelf method that plugs into *any* existing policy learning framework and model, across BC and RL, we provide minimal scripts under the `./scripts` directory for benchmarking the runtime, which show that the decomposition overhead is negligible:
```commandline
python scripts/benchmark_decomp.py /PATH/TO/VIDEO
```
Pass `--preprocessor_name` to select other preprocessors (default `vip`) and `--n` for the number of repeated iterations (default `100`).
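
For example (the flag values here are illustrative, not defaults):

```commandline
python scripts/benchmark_decomp.py /PATH/TO/VIDEO --preprocessor_name dinov2 --n 50
```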

For inference or rollouts, we benchmark the runtime via:
```commandline
python scripts/benchmark_inference.py
```
Pass `--policy` to choose between the MLP and causal GPT policies; `--preprocessor_name` to select other preprocessors (default `vip`); the boolean `--use_uvd` to toggle between UVD and no decomposition (i.e. conditioning on the final goal only); and `--n` for the number of repeated iterations (default `100`). The default episode horizon is set to 300. We found that running in the terminal is almost 2 s slower per episode than running directly from a Python IDE (e.g. PyCharm, with the script directory as the working directory, run as a script rather than a module), but the general trend holds: including UVD introduces negligible extra runtime.
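
A representative invocation under these options (the specific values, and `--use_uvd` behaving as a store-true style flag, are assumptions):

```commandline
python scripts/benchmark_inference.py --policy mlp --preprocessor_name vip --use_uvd --n 100
```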

# Citation

If you find this project useful in your research, please consider citing:

```bibtex
@misc{zhang2023universal,
  title  = {Universal Visual Decomposer: Long-Horizon Manipulation Made Easy},
  author = {Zichen Zhang and Yunshuang Li and Osbert Bastani and Abhishek Gupta and Dinesh Jayaraman and Yecheng Jason Ma and Luca Weihs},
  year   = {2023},
  eprint = {arXiv:2310.08581},
}
```

datasets/.gitignore

Lines changed: 5 additions & 0 deletions
```diff
@@ -0,0 +1,5 @@
+*
+!.gitignore
+!data_gen.py
+!generate_in_domain_vip_ft_data.py
+!data_gen.yaml
```
