# 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
Mingqi Gao<sup>1,4,+</sup>, Jingnan Luo<sup>2,+</sup>, Jinyu Yang<sup>1,*</sup>, Jungong Han<sup>3,4</sup>, Feng Zheng<sup>1,2,*</sup>

<sup>1</sup> Tapall.ai, <sup>2</sup> Southern University of Science and Technology, <sup>3</sup> University of Sheffield, <sup>4</sup> University of Warwick

<sup>+</sup> Equal contribution, <sup>*</sup> Corresponding authors
## Setup

We tested the code in the following environment; other versions may also be compatible: Python 3.9, PyTorch 1.10.1, CUDA 11.3.

Install the dependencies:
```bash
pip install -r requirements.txt
pip install 'git+https://github.com/facebookresearch/fvcore'
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
```
Then build and install the custom CUDA operators:

```bash
cd models/ops
python setup.py build install
cd ../..
```
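After the build, you can sanity-check the installation. A minimal sketch, assuming the compiled extension is packaged as `MultiScaleDeformableAttention` (the module name used by Deformable-DETR-style codebases; verify against `models/ops/setup.py`):

```python
import torch

# Confirm the tested versions are in place and a GPU is visible.
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())

# Hypothetical import: Deformable-DETR-style repositories install the compiled
# extension under this name; adjust if this repository names it differently.
import MultiScaleDeformableAttention  # noqa: F401
print("deformable attention ops imported successfully")
```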
## Training

- Download MUTR's checkpoint from HERE (Swin-L, jointly trained on the Ref-COCO series and Ref-YouTube-VOS).
- Run the following command to fine-tune MUTR on MeViS:
```bash
python -m torch.distributed.launch --nproc_per_node 1 --master_port 10010 --use_env \
    train.py --freeze_text_encoder --with_box_refine --binary --dataset_file mevis \
    --epochs 2 --lr_drop 1 --resume [MUTR checkpoint] --output_dir [output path] \
    --mevis_path [MeViS path] --backbone swin_l_p4w7
```
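Since the script is launched through the standard `torch.distributed.launch` entry point, training on multiple GPUs should only require raising `--nproc_per_node`. For example, on 4 GPUs (placeholders as above):

```bash
python -m torch.distributed.launch --nproc_per_node 4 --master_port 10010 --use_env \
    train.py --freeze_text_encoder --with_box_refine --binary --dataset_file mevis \
    --epochs 2 --lr_drop 1 --resume [MUTR checkpoint] --output_dir [output path] \
    --mevis_path [MeViS path] --backbone swin_l_p4w7
```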
## Inference

Our fine-tuned checkpoint is available on Google Drive. Use the following command to run inference on the MeViS `valid` split:
```bash
python inference_mevis.py --with_box_refine --binary --freeze_text_encoder \
    --output_dir [output path] --resume [checkpoint path] --ngpu 1 --batch_size 1 \
    --backbone swin_l_p4w7 --mevis_path [MeViS path] --split valid \
    --sub_video_len 30 --no_sampling
```

The `--no_sampling` flag is optional; it enables the no-sampling mode.
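To verify a downloaded checkpoint before running inference, here is a minimal sketch, assuming the file is a standard PyTorch checkpoint dictionary (the filename is a placeholder, and the exact top-level key, e.g. `model` vs. `state_dict`, depends on the repository's saving convention):

```python
import torch

# Hypothetical filename; replace with the path to the downloaded checkpoint.
ckpt = torch.load("mutr_mevis_finetuned.pth", map_location="cpu")

# Inspect the top-level keys to locate the model weights.
print(list(ckpt.keys()))
```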
## Citation

If you find our solution useful for your research, please consider citing it with the following BibTeX:
```bibtex
@misc{gao20241st,
      title={1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation},
      author={Mingqi Gao and Jingnan Luo and Jinyu Yang and Jungong Han and Feng Zheng},
      year={2024},
      eprint={2406.07043},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
## Acknowledgement

This solution is built on MUTR and MeViS. Thanks to the authors for their efforts.