Run this program with `python main.py`.

Before that, please visit GroundingDINO and SAM2 and prepare the environment according to their instructions. Also download the checkpoints for SAM2 and GroundingDINO separately: place the SAM2 checkpoint at `/checkpoints/sam2.1_hiera_large.pt` and the GroundingDINO checkpoint at `/weights/groundingdino_swinb_cogcoor.pth`.
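As a quick sanity check before launching, the expected checkpoint files can be verified with a short script. This is a sketch only: the relative `checkpoints/` and `weights/` directories and the `missing_checkpoints` helper are assumptions for illustration, not part of this project's code.

```python
from pathlib import Path

# Checkpoint locations from this README, checked relative to the current
# directory (an assumption -- adjust to wherever your project root is).
CHECKPOINTS = [
    Path("checkpoints/sam2.1_hiera_large.pt"),
    Path("weights/groundingdino_swinb_cogcoor.pth"),
]

def missing_checkpoints(paths=CHECKPOINTS):
    """Return the checkpoint paths that are not present on disk."""
    return [p for p in paths if not p.is_file()]

for p in missing_checkpoints():
    print(f"missing checkpoint: {p}")
```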
In addition, the only extra dependency the UI needs is Gradio (`pip install gradio`). Downloading `bert-base-uncased` from Hugging Face into the root directory of this project is also recommended.
Avoid uploading long videos, as they lead to very long inference times: a video of 100 frames takes about 8 minutes. This is primarily a tool for semantic labeling jobs.
Use `keyword,keyword,...` as your text prompt rather than a long sentence.
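For example, a prompt like `person,red car,dog` is naturally handled by splitting on commas so each keyword can be sent to GroundingDINO as its own query. A minimal sketch, where the `split_prompt` helper is illustrative and not this project's actual API:

```python
def split_prompt(prompt: str) -> list[str]:
    """Split a comma-separated text prompt into individual keywords.

    Empty entries and surrounding whitespace are discarded, so
    "person, red car,dog," yields ["person", "red car", "dog"].
    """
    return [kw.strip() for kw in prompt.split(",") if kw.strip()]

print(split_prompt("person, red car,dog,"))
```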
The official Grounded-SAM-2 project provides only a simple implementation of this pipeline. This project optimizes it in the following aspects:
- GroundingDINO runs a single keyword per inference, significantly reducing missed detections
- Grounded objects are searched for in all frames, and instances detected by GroundingDINO are tracked across the entire video
- Better mask post-processing
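The exact mask post-processing used here is not specified in this section. As an illustration of one common cleanup step, small spurious connected components can be removed from a binary mask; the sketch below uses a pure-Python BFS and is an assumption, not the project's actual code (real pipelines often use `scipy.ndimage.label` or OpenCV instead).

```python
import numpy as np
from collections import deque

def remove_small_components(mask: np.ndarray, min_area: int) -> np.ndarray:
    """Remove 4-connected foreground components smaller than min_area.

    mask: 2-D boolean array. Returns a cleaned copy; the input is untouched.
    """
    mask = mask.astype(bool)
    out = np.zeros_like(mask)
    seen = np.zeros_like(mask)
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                # BFS to collect one connected component
                comp, queue = [], deque([(i, j)])
                seen[i, j] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                # Keep the component only if it is large enough
                if len(comp) >= min_area:
                    for y, x in comp:
                        out[y, x] = True
    return out
```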
Experienced developers are welcome to collaborate with me on this project. If you are interested, please send an email to [email protected].