This repository contains a version of the code for our recent work, Improving Unsupervised Visual Program Inference with Code Rewriting Families, presented at ICCV 2023. It includes implementations of the different rewriters, and code to train models with PLAD as well as SIRI.
For training models for the VSIC challenge on the 3DCoMPaT dataset, please refer to the 3d_compat.md markdown file.
- Running the following commands will create a conda environment named coref with the necessary packages. The packages and their versions are also listed in requirements.txt for alternate installation methods.
conda create -n "coref" python=3.10.8
conda activate coref
conda install numpy==1.23.5
conda install scipy==1.10.1
conda install conda-forge::h5py==3.8.0
conda install conda-forge::wandb==0.15.7
conda install conda-forge::networkx==3.2
conda install conda-forge::opencv==4.7.0
conda install pytorch==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c pytorch -c nvidia faiss-gpu=1.7.2
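After installation, it can help to confirm that the environment matches the pins above. The following is a minimal sketch, not part of this repository: the helper name `find_mismatches` is hypothetical, and the distribution names are assumptions, since a conda package can register under a different name than the one used on the command line (for example, `pytorch` is assumed here to appear as `torch`). OpenCV and faiss are left out because their registered names vary between builds.

```python
from importlib import metadata

# Pins copied from the install commands above. The distribution names are
# assumptions: conda packages may register metadata under a different name.
EXPECTED = {
    "numpy": "1.23.5",
    "scipy": "1.10.1",
    "h5py": "3.8.0",
    "wandb": "0.15.7",
    "networkx": "3.2",
    "torch": "2.1.0",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {name: (wanted, found)} for packages that are missing or off-pin."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build suffixes such as "2.1.0+cu118".
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty output means every listed package was found at the pinned version.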
- Next, some of my own tools are integrated into this repository. Running the following commands will install them in an externals folder.
mkdir externals
cd externals
# ProcXD
git clone git@github.com:BardOfCodes/procXD.git
cd procXD
python setup.py install --user
cd ..
# Wacky
git clone git@github.com:BardOfCodes/wacky.git
cd wacky
python setup.py install --user
cd ..
# GeoLIPI
git clone git@github.com:BardOfCodes/geolipi.git
cd geolipi
export PYTHONPATH="$PYTHONPATH:$(pwd)"
cd ../..
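Note that GeoLIPI is made available through `PYTHONPATH` rather than installed, and an `export` only lasts for the current shell session. A quick way to check that all three tools are visible to the interpreter is sketched below. The import names `procXD`, `wacky`, and `geolipi` are assumptions based on the repository names, and `missing_modules` is a hypothetical helper, not part of this codebase.

```python
import importlib.util

# Assumed import names for the three external tools; adjust them if the
# packages expose different top-level module names.
EXTERNAL_MODULES = ("procXD", "wacky", "geolipi")


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXTERNAL_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All external tools are importable.")
```

If `geolipi` is reported as missing in a new terminal, re-running the `export PYTHONPATH=...` line from inside externals/geolipi (or adding it to your shell profile) is the likely fix.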
- Download the required data from here (provided as a zip file), then extract it:
unzip data.zip
Additionally, you can download the checkpointed models from the same folder.
The results of the paper can be reproduced with the following steps:
- Configure the paths: The paths for the data and this repository must be defined in configs/subconf/machine_spec.py.
- Pretrain the model on synthetic data. This will run the pretraining, and the best model will be saved at project_dir/models/pretrain/best_model.pt.
python scripts/train.py --config-file configs/pretrain.py --name pretrain
- Train SIRI/PLAD with the following script. Let $model_path denote the path of the best pretrained model from the previous step.
python scripts/train.py --config-file configs/siri.py --name siri --cfg.plad.starting_weights $model_path --cfg.ws_config.starting_weights $model_path
The best model will be saved as project_dir/models/siri/best_model.pt.
- Run Evaluation/Inference. Inference (with rewriting) depends on two pickle files which are generated during training: $model_path, which is the "best" model on the validation set, and $subexpr_cache_path, which contains all the subexpressions discovered during the code grafting process (the previous command saves it as /project_dir/models/siri/all_subexpr.pkl). With these two files, you can run inference on the test set using:
python scripts/eval.py --config-file configs/eval.py --name eval --cfg.siri.rewriters.CGRewriter.cache_config.subexpr_load_path $subexpr_cache_path --cfg.trainer.load_weights $model_path
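The three commands above can also be assembled programmatically, which makes the dependencies between the steps explicit. This is a minimal sketch rather than a script shipped with the repository: `build_commands` is a hypothetical helper, the flags and output paths are copied from the steps above, and it assumes that the model loaded for evaluation is the best SIRI model.

```python
import shlex
from pathlib import Path


def build_commands(project_dir):
    """Return the pretrain, SIRI, and eval commands as argument lists."""
    models = Path(project_dir) / "models"
    pretrained = models / "pretrain" / "best_model.pt"
    # Assumption: evaluation loads the best model saved by the SIRI run.
    siri_best = models / "siri" / "best_model.pt"
    subexpr_cache = models / "siri" / "all_subexpr.pkl"
    return [
        ["python", "scripts/train.py", "--config-file", "configs/pretrain.py",
         "--name", "pretrain"],
        ["python", "scripts/train.py", "--config-file", "configs/siri.py",
         "--name", "siri",
         "--cfg.plad.starting_weights", str(pretrained),
         "--cfg.ws_config.starting_weights", str(pretrained)],
        ["python", "scripts/eval.py", "--config-file", "configs/eval.py",
         "--name", "eval",
         "--cfg.siri.rewriters.CGRewriter.cache_config.subexpr_load_path",
         str(subexpr_cache),
         "--cfg.trainer.load_weights", str(siri_best)],
    ]


if __name__ == "__main__":
    # "project_dir" is a placeholder; use the path set in machine_spec.py.
    for command in build_commands("project_dir"):
        print(shlex.join(command))
        # To execute instead of printing: subprocess.run(command, check=True)
```

Printing the commands first is a deliberate choice: each training stage can run for a long time, so it is worth checking the resolved paths before launching anything.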
This repository currently supports reproducing results for PCSG3D. I will also be adding support for 2D CSG languages, and the more complex MCSG3D as well. For ShapeAssembly, please use the old messy codebase here.
- configs contains all the configuration files. I use my own config system, wacky, for this, but it's just a simple layer over Yacs.
- coref/language shows how the different 2D/3D CSG based languages are constructed. They basically build over geolipi, and add a state_machine, tokenizers, etc. The synthetic dataset generator is also contained here.
- coref/dataloaders contains the data-loaders. The data-loaders also support caching of programs in a quicker-to-execute form, which helps train faster.
- coref/model contains the different neural networks trained in our experiments.
- coref/rewriters contains the three different rewriters introduced in our work.
- coref/trainer contains the classes which glue all the different parts of the system together to train as well as evaluate the models.
- notebooks will contain ipynb notebook(s) demonstrating how to use this method for new shapes, and how to visualize them.
The codebase used during the development of this method is also open-sourced here. It is much messier and builds on top of stable-baselines, as we initially modelled the problem as an RL problem. While this repository is meant to be the cleaner, more general version of the method, there are a few features of the older repository which I am yet to adapt:
- Better synthetic pretraining data generation. Specifically, the synthetic data sampler implemented in the older repository keeps track of the bounding box of the expression's execution, and samples transformations which keep it within the canvas (-1, 1)^3 space.
- General expression inversion for code grafting. This repository has a version curtailed to PCSG3D, but the older repository has a general version which works on multiple languages.
- Add multi-processing beam search.
- Improve Code Grafting module.
- Add results and models for 2D languages (PCSG/MCSG 2D) and complex MCSG 3D.
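To illustrate the bounding-box-aware sampling mentioned among the features not yet ported, here is a minimal sketch of the idea for translations only. It is not the implementation from the older repository: `sample_translation` and its parameters are hypothetical, the box is assumed to be axis-aligned, and the older sampler may well handle other transformations such as scaling or rotation.

```python
import numpy as np


def sample_translation(bbox_min, bbox_max, rng, low=-1.0, high=1.0):
    """Sample a shift that keeps an axis-aligned box inside [low, high]^3."""
    bbox_min = np.asarray(bbox_min, dtype=float)
    bbox_max = np.asarray(bbox_max, dtype=float)
    if np.any(bbox_max - bbox_min > high - low):
        raise ValueError("box is larger than the canvas")
    # Per axis, the shift t must satisfy low <= min + t and max + t <= high,
    # so t is drawn uniformly from [low - min, high - max].
    return rng.uniform(low - bbox_min, high - bbox_max)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shift = sample_translation([-0.2, -0.3, -0.1], [0.4, 0.3, 0.5], rng)
    print("sampled shift:", shift)
```

Sampling directly from the valid interval avoids rejection sampling, which would waste draws when the shape already fills most of the canvas.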
Thanks to my co-author Kenny Jones, and my advisor Daniel Ritchie! This work is supported by NSF award #1941808 and a Brown University Presidential Fellowship.