Weakly supervised instance segmentation on S3DIS and ScanNet

Abstract of this work

This work presents a simple and general framework for point cloud understanding when labels are limited. Our first contribution is an extensive comparison of traditional and learnt 3D descriptors for weakly supervised 3D scene understanding, which validates that our adapted traditional PFH-based 3D descriptors show excellent generalization ability across different domains. Our second contribution is a learning-based region merging strategy based on the affinity provided by both the traditional/learnt 3D descriptors and the learnt semantics, so that the merging process takes both low-level geometric and high-level semantic feature correlations into consideration. Experimental results demonstrate that our framework achieves the best performance on three of the most important weakly supervised point cloud understanding tasks: semantic segmentation, instance segmentation, and object detection.
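
To make the idea of affinity-driven merging concrete, here is a minimal sketch. It is not the learning-based strategy used in RM3D: the learnt merging decision is swapped for a fixed threshold on a weighted sum of cosine similarities, and the names `merge_regions`, `w_geom` and `threshold` are hypothetical. It only illustrates how a geometric descriptor affinity and a semantic affinity can jointly decide whether two neighbouring regions belong together.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def merge_regions(geom, sem, adjacency, w_geom=0.5, threshold=0.8):
    """Merge neighbouring regions whose combined affinity reaches `threshold`.

    geom:      per-region geometric descriptors (e.g. a PFH-style histogram)
    sem:       per-region semantic class probabilities
    adjacency: iterable of (i, j) index pairs of neighbouring regions
    Returns one group id per region (the index of a representative region).
    NOTE: the fixed weight and threshold are illustrative assumptions.
    """
    parent = list(range(len(geom)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in adjacency:
        affinity = (w_geom * cosine(geom[i], geom[j])
                    + (1.0 - w_geom) * cosine(sem[i], sem[j]))
        if affinity >= threshold:
            parent[find(i)] = find(j)

    return [find(i) for i in range(len(geom))]


# Toy data: regions 0 and 1 look alike, region 2 differs in shape and class.
geom = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
sem = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
groups = merge_regions(geom, sem, adjacency=[(0, 1), (1, 2)])
print(groups[0] == groups[1], groups[1] == groups[2])  # True False
```

Lowering `w_geom` lets the semantic prediction dominate the decision, while raising it relies more on the geometric descriptor.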

News

20 November 2022: All main code and models are released!

Code structure

RM3D adapts the structure of the Mix3D codebase, which provides a highly modularized framework for 3D semantic segmentation based on the MinkowskiEngine.

├── rm3d
│   ├── main_instance_segmentation.py <- the main file
│   ├── conf                          <- hydra configuration files
│   ├── datasets
│   │   ├── preprocessing             <- folder with preprocessing scripts
│   │   ├── semseg.py                 <- indoor dataset
│   │   └── utils.py        
│   ├── models                        <- RM3D modules
│   ├── trainer
│   │   ├── __init__.py
│   │   └── trainer.py                <- train loop
│   └── utils
├── data
│   ├── processed                     <- folder for preprocessed datasets
│   └── raw                           <- folder for raw datasets
├── scripts                           <- train scripts
├── docs
├── README.md
└── saved                             <- folder that stores models and logs

Dependencies 📝

The main dependencies of the project are the following:

python: 3.10.6
cuda: 11.6

You can set up a conda environment as follows:

conda create --name=RM3D python=3.10.6
conda activate RM3D

conda update -n base -c defaults conda
conda install openblas-devel -c anaconda

pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu116
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.12.1+cu116.html

pip install ninja==1.10.2.3
pip install pytorch-lightning fire imageio tqdm wandb python-dotenv pyviz3d scipy plyfile scikit-learn trimesh loguru albumentations volumentations

pip install antlr4-python3-runtime==4.8
pip install black==21.4b2
pip install omegaconf==2.0.6 hydra-core==1.0.5 --no-deps
pip install 'git+https://github.com/facebookresearch/detectron2.git@710e7795d0eeadf9def0e7ef957eea13532e34cf' --no-deps

cd third_party/pointnet2 && python setup.py install
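
After installation, a quick sanity check can confirm that the interpreter and the PyTorch build match the versions listed above. The following is a minimal sketch, not part of the repository; the helper names `python_ok`, `cuda_ok` and `report` are hypothetical. It relies only on `sys.version_info`, `torch.version.cuda` and `torch.cuda.is_available()`.

```python
import sys

# Versions taken from the dependency list in this README.
EXPECTED_PYTHON = (3, 10)
EXPECTED_CUDA = "11.6"


def python_ok(version_info=sys.version_info, expected=EXPECTED_PYTHON):
    """True if the interpreter's (major, minor) pair equals `expected`."""
    return tuple(version_info[:2]) == tuple(expected)


def cuda_ok(torch_cuda_version, expected=EXPECTED_CUDA):
    """True if the CUDA version PyTorch was built against starts with `expected`.

    Pass in `torch.version.cuda`, which is None for CPU-only builds.
    """
    return torch_cuda_version is not None and torch_cuda_version.startswith(expected)


def report():
    """Print a short summary of the installed Python / PyTorch / CUDA setup."""
    print("python", sys.version.split()[0], "ok" if python_ok() else "MISMATCH")
    try:
        import torch  # imported lazily so the helpers above work without it
    except ImportError:
        print("torch is not installed")
        return
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda,
          "ok" if cuda_ok(torch.version.cuda) else "MISMATCH")
    print("GPU visible:", torch.cuda.is_available())
```

Call `report()` inside the activated `RM3D` environment; a `MISMATCH` line suggests the wheel index used for `pip install torch` did not match the local CUDA toolkit.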

Data preprocessing 🔨

After installing the dependencies, we preprocess the datasets.

ScanNet

First, we apply Felzenszwalb and Huttenlocher's Graph Based Image Segmentation algorithm to the test scenes using the default parameters. Please refer to the original repository for details. Put the resulting segmentations in ./data/raw/scannet_test_segments.

python datasets/preprocessing/scannet_preprocessing.py preprocess \
--data_dir="PATH_TO_RAW_SCANNET_DATASET" \
--save_dir="../../data/processed/scannet" \
--git_repo="PATH_TO_SCANNET_GIT_REPO"
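
For intuition about the graph-based segmentation step mentioned above, the core merging rule of Felzenszwalb and Huttenlocher's algorithm can be sketched in a few lines. This is an illustrative re-implementation on a generic weighted graph, not the tool from the original repository; how the graph and edge weights are built from a ScanNet mesh, and the value of the scale parameter `k`, are assumptions left open here.

```python
def segment_graph(num_nodes, edges, k=1.0):
    """Felzenszwalb-Huttenlocher style segmentation of a weighted graph.

    edges: iterable of (weight, u, v); a lower weight means more similar nodes.
    k:     scale parameter; a larger k favours larger segments.
    Returns, for each node, the index of a representative node of its segment.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # heaviest edge merged into each component so far

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in sorted(edges):  # process edges from lightest to heaviest
        a, b = find(u), find(v)
        if a == b:
            continue
        # Merge only if the edge is no heavier than the internal difference
        # of both components, each relaxed by k / |component|.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = w  # edges are ascending, so w is the new maximum
    return [find(x) for x in range(num_nodes)]


# Two tight clusters {0, 1, 2} and {3, 4, 5} joined by one heavy edge.
edges = [(0.1, 0, 1), (0.1, 1, 2), (0.1, 3, 4), (0.1, 4, 5), (5.0, 2, 3)]
labels = segment_graph(6, edges, k=1.0)
print(len(set(labels)))  # 2
```

Because the tolerance `k / |component|` shrinks as a segment grows, small segments merge easily while large ones require strong evidence, which is what produces the over-segmentation typically used as input for instance segmentation.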

S3DIS

The S3DIS dataset contains some small bugs, which we initially fixed manually. We will soon release a preprocessing script that works directly on the original dataset. For the time being, please follow the instructions here to fix the dataset manually. Afterwards, call the preprocessing script as follows:

python datasets/preprocessing/s3dis_preprocessing.py preprocess \
--data_dir="PATH_TO_Stanford3dDataset_v1.2" \
--save_dir="../../data/processed/s3dis"

Training and testing

Train RM3D on the ScanNet dataset:

python main_instance_segmentation.py

Please refer to the config scripts (for example here) for detailed instructions on how to reproduce our results. In the simplest case, the inference command looks as follows:

python main_instance_segmentation.py \
general.checkpoint='PATH_TO_CHECKPOINT.ckpt' \
general.train_mode=false
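
When several checkpoints need to be evaluated, the inference call above can be assembled programmatically. The following is a small sketch under stated assumptions: the helper `inference_command` and the checkpoint paths are hypothetical, and only the two overrides shown above (`general.checkpoint` and `general.train_mode`) are taken from this README.

```python
import shlex


def inference_command(checkpoint, extra_overrides=()):
    """Build the argument list for the inference call shown above."""
    return [
        "python", "main_instance_segmentation.py",
        f"general.checkpoint={checkpoint}",
        "general.train_mode=false",
        *extra_overrides,  # any further config overrides, passed through as-is
    ]


# Hypothetical checkpoint paths, for illustration only.
for ckpt in ["saved/area1.ckpt", "saved/area2.ckpt"]:
    cmd = inference_command(ckpt)
    # Print the shell-quoted command; to execute it instead, pass the list
    # to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Passing the command as a list avoids shell quoting issues with checkpoint paths that contain spaces.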

Trained checkpoints 💾

We provide detailed scores and network configurations with trained checkpoints.

S3DIS (pretrained on ScanNet train+val)

Following PointGroup, HAIS and SoftGroup, we finetune a model pretrained on ScanNet. Here we provide the models trained with 1% of the labels. Models for other labeling percentages will be provided later. Please stay tuned.

Dataset	AP_50	Config	Checkpoint 💾
Area 1	54.9	config	checkpoint
Area 2	53.6	config	checkpoint
Area 3	51.7	config	checkpoint
Area 4	58.9	config	checkpoint
Area 5	57.6	config	checkpoint
Area 6	56.2	config	checkpoint
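
AP_50 in the table above is the average precision at an IoU threshold of 0.5: a predicted instance can only count as a true positive if its mask overlaps a ground-truth instance sufficiently. The matching criterion can be sketched as follows. The function names are hypothetical, masks are represented as sets of point indices for simplicity, and the strict `>` comparison is an assumption, since whether the threshold is inclusive depends on the evaluation script.

```python
def mask_iou(pred, gt):
    """Intersection over union of two instance masks given as point indices."""
    pred, gt = set(pred), set(gt)
    union = len(pred | gt)
    return len(pred & gt) / union if union else 0.0


def matches_at(pred, gt, iou_threshold=0.5):
    """True if the predicted mask overlaps the ground truth above the threshold."""
    return mask_iou(pred, gt) > iou_threshold


# 3 shared points out of 5 distinct points gives IoU = 0.6, so this is a match.
print(matches_at({0, 1, 2, 3}, {1, 2, 3, 4}))  # True
# 2 shared points out of 6 distinct points gives IoU = 1/3, so this is not.
print(matches_at({0, 1, 2, 3}, {2, 3, 4, 5}))  # False
```

The full metric additionally ranks predictions by confidence and integrates the resulting precision-recall curve, which this sketch leaves out.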

ScanNet v2

Here we provide the models trained with 1% of the labels on ScanNet v2. Models for other labeling percentages will be provided later. Please stay tuned.

Dataset	AP_50	Config	Checkpoint 💾
ScanNet val	55.7	config	checkpoint
