PASTEL

arXiv | IEEE Xplore | Website | Video

This repository is the official implementation of the paper:

A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation

Niclas Vödisch*, Kürsat Petek*, Markus Käppeler*, Abhinav Valada, and Wolfram Burgard.
*Equal contribution.

IEEE Robotics and Automation Letters, vol. 10, issue 1, pp. 216-223, January 2025

If you find our work useful, please consider citing our paper:

@article{voedisch2025pastel,
  author={Vödisch, Niclas and Petek, Kürsat and Käppeler, Markus and Valada, Abhinav and Burgard, Wolfram},
  journal={IEEE Robotics and Automation Letters}, 
  title={A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation}, 
  year={2025},
  volume={10},
  number={1},
  pages={216-223},
}

Make sure to also check out our previous work on this topic: SPINO.

📔 Abstract

A key challenge for the widespread application of learning-based models for robotic perception is to significantly reduce the required amount of annotated training data while achieving accurate predictions. This is essential not only to decrease operating costs but also to speed up deployment time. In this work, we address this challenge for PAnoptic SegmenTation with fEw Labels (PASTEL) by exploiting the groundwork paved by visual foundation models. We leverage descriptive image features from such a model to train two lightweight network heads for semantic segmentation and object boundary detection, using very few annotated training samples. We then merge their predictions via a novel fusion module that yields panoptic maps based on normalized cut. To further enhance the performance, we utilize self-training on unlabeled images selected by a feature-driven similarity scheme. We underline the relevance of our approach by employing PASTEL to important robot perception use cases from autonomous driving and agricultural robotics. In extensive experiments, we demonstrate that PASTEL significantly outperforms previous methods for label-efficient segmentation even when using fewer annotations.

👩‍💻 Code

🏗 Setup

⚙️ Installation

Create conda environment: conda create --name pastel python=3.8
Activate environment: conda activate pastel
Install dependencies: pip install -r requirements.txt
Install torch, torchvision, and torchaudio with CUDA 11.1 support: pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
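
To confirm that the environment matches the pinned versions, a small check such as the following can help. This is a minimal sketch and not part of the repository: the helper names `installed_version` and `check_versions` are hypothetical, and it assumes that a package pinned without a local suffix (here torchaudio) may still be installed with one, so only the part before `+` is compared in that case.

```python
from importlib import metadata

# Versions pinned in the install command above.
EXPECTED = {
    "torch": "1.10.1+cu111",
    "torchvision": "0.11.2+cu111",
    "torchaudio": "0.10.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_matches(wanted, found):
    """Compare versions; ignore a local suffix (+cuXXX) if none was pinned."""
    if found is None:
        return False
    if "+" not in wanted:
        found = found.split("+")[0]
    return found == wanted


def check_versions(expected=EXPECTED, lookup=installed_version):
    """Map each mismatching package to a tuple (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if not version_matches(wanted, found):
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = check_versions()
    for package, (wanted, found) in problems.items():
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated pastel environment; any line it prints names a package whose installed version differs from the pin.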

💻 Development

Install pre-commit githook scripts: pre-commit install
Upgrade isort to 5.12.0: pip install isort==5.12.0
Update pre-commit: pre-commit autoupdate

Linter (pylint) and formatter (yapf, isort) settings can be set in pyproject.toml.

🎨 Running PASTEL

Generating pseudo-labels with PASTEL involves three steps:

Train the semantic segmentation module.
Train the boundary estimation module.
Generate pseudo-labels using the fusion module.

For Cityscapes, an exemplary execution would look like this:

conda activate pastel
python semantic_fine_tuning.py fit --trainer.devices [0] --config configs/cityscapes_semantics.yaml
python boundary_fine_tuning.py fit --trainer.devices [0] --config configs/cityscapes_boundary.yaml
python instance_clustering.py test --trainer.devices [0,1,2,3] --config configs/cityscapes_instance_ncut.yaml

We provide configuration files for each step of all datasets in the configs folder. Please make sure to double-check the paths to the datasets and the pretrained weights.
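
The three Cityscapes commands can also be assembled programmatically, for example to print them as a dry run and to catch a missing script or config file before a long training job starts. The following is a minimal sketch, not part of the repository: the helpers `build_commands` and `missing_files` are hypothetical, and the script names, subcommands, device lists, and config paths are copied from the example above. It does not inspect the dataset or weight paths inside the config files, which still need to be checked by hand.

```python
import shlex
import sys
from pathlib import Path

# (script, subcommand, devices, config) taken from the Cityscapes example.
CITYSCAPES_STEPS = [
    ("semantic_fine_tuning.py", "fit", "[0]",
     "configs/cityscapes_semantics.yaml"),
    ("boundary_fine_tuning.py", "fit", "[0]",
     "configs/cityscapes_boundary.yaml"),
    ("instance_clustering.py", "test", "[0,1,2,3]",
     "configs/cityscapes_instance_ncut.yaml"),
]


def build_commands(steps=CITYSCAPES_STEPS, python="python"):
    """Return one argument list per pipeline step, in execution order."""
    return [
        [python, script, subcommand,
         "--trainer.devices", devices, "--config", config]
        for script, subcommand, devices, config in steps
    ]


def missing_files(steps=CITYSCAPES_STEPS, root="."):
    """List scripts and config files that do not exist under the repo root."""
    root = Path(root)
    paths = [p for script, _, _, config in steps for p in (script, config)]
    return [p for p in paths if not (root / p).is_file()]


if __name__ == "__main__":
    # Dry run: report missing files, then print the commands without running.
    for path in missing_files():
        print(f"missing: {path}", file=sys.stderr)
    for command in build_commands():
        print(shlex.join(command))
```

To execute the steps instead of printing them, each argument list could be passed to `subprocess.run(command, check=True)` so that a failing step stops the sequence.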

🏋️ Pre-trained weights

We provide the following pre-trained weights:

Cityscapes:
PASCAL VOC:
PhenoBench:
- Semantic segmentation
- Boundary estimation

⚠️ If your browser blocks the download, right-click on the link and copy the address to download the file manually.

💾 Datasets

Cityscapes

Download the following files:

leftImg8bit_sequence_trainvaltest.zip (324GB)
gtFine_trainvaltest.zip (241MB)
camera_trainvaltest.zip (2MB)

After extraction, one should obtain the following file structure:

── data/cityscapes
   ├── camera
   │    └── ...
   ├── gtFine
   │    └── ...
   └── leftImg8bit_sequence
        └── ...

PASCAL VOC

We use the 2012 challenge plus the SBD extension.
On first execution, the files should be downloaded automatically via torchvision.

Afterward, one should obtain the following file structure:

── data/pascal_voc
   ├── SBD
   │    └── ...
   └── VOCdevkit/VOC2012
        └── ...

PhenoBench

We use the leaf instance segmentation challenge.
Please download the dataset from the official website.

After extraction, one should obtain the following file structure:

── data/phenobench
   ├── test
   │    └── images
   ├── train
   │    ├── images
   │    ├── leaf_instances
   │    ├── leaf_visibility
   │    ├── plant_instances
   │    ├── plant_visibility
   │    └── semantics
   └── val
        ├── images
        ├── leaf_instances
        ├── leaf_visibility
        ├── plant_instances
        ├── plant_visibility
        └── semantics
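
The expected layouts of all three datasets can be verified with a short script before training. This is a minimal sketch, not part of the repository: the helper `missing_dirs` is hypothetical, and the directory names are copied from the trees in this section. It only checks that the listed folders exist, not that their contents are complete.

```python
from pathlib import Path

# Sub-directories shared by the PhenoBench train and val splits.
PHENOBENCH_LABELS = [
    "images", "leaf_instances", "leaf_visibility",
    "plant_instances", "plant_visibility", "semantics",
]

# Expected sub-directories per dataset root, as listed in the trees above.
EXPECTED_LAYOUT = {
    "data/cityscapes": ["camera", "gtFine", "leftImg8bit_sequence"],
    "data/pascal_voc": ["SBD", "VOCdevkit/VOC2012"],
    "data/phenobench": (
        ["test/images"]
        + [f"train/{name}" for name in PHENOBENCH_LABELS]
        + [f"val/{name}" for name in PHENOBENCH_LABELS]
    ),
}


def missing_dirs(dataset_root, base="."):
    """Return the expected sub-directories that are absent for one dataset."""
    root = Path(base) / dataset_root
    return [
        sub for sub in EXPECTED_LAYOUT[dataset_root]
        if not (root / sub).is_dir()
    ]


if __name__ == "__main__":
    for dataset_root in EXPECTED_LAYOUT:
        missing = missing_dirs(dataset_root)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{dataset_root}: {status}")
```

Run it from the repository root; a dataset that is not used can simply be ignored in the output.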

👩‍⚖️ License

For academic usage, the code is released under the GPLv3 license. For any commercial purpose, please contact the authors.

🙏 Acknowledgment

This work was funded by the German Research Foundation (DFG) Emmy Noether Program, grant No. 468878300.
