Model Zoo

LLaRA Model Zoo

llava-1.5-7b-D-inBC + Aux(B) trained on VIMA-80k (Hugging Face). Note: The files originally uploaded for this model were incorrect. If you downloaded it before July 13, 2024, please download it again. We apologize for the inconvenience.
llava-1.5-7b-D-inBC + Aux(D) trained on VIMA-80k (Hugging Face)
llava-1.5-7b-D-inBC trained on VIMA-80k (Hugging Face)
llava-1.5-7b-D-RT2-Style trained on VIMA-80k (Hugging Face)
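
For readers who fetch checkpoints programmatically, the following is a minimal sketch, not the authors' tooling. It assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The `repo_id` argument is a placeholder: the actual repository names are behind the Hugging Face links above and are not reproduced here. The helper names `is_stale` and `fetch_checkpoint` are hypothetical. The date check applies only to the Aux(B) model, whose files were replaced on July 13, 2024.

```python
from datetime import date
from typing import Optional

# Cutoff taken from the note on the Aux(B) model: copies downloaded
# before this date contain the incorrect files.
FIX_DATE = date(2024, 7, 13)


def is_stale(downloaded_on: date) -> bool:
    """Return True if a local copy predates the corrected upload."""
    return downloaded_on < FIX_DATE


def fetch_checkpoint(repo_id: str, local_dir: str,
                     downloaded_on: Optional[date] = None) -> str:
    """Download a model repository and return the local path.

    repo_id is a placeholder for the repository name shown on the
    model's Hugging Face page (an assumption of this sketch).
    """
    # Imported lazily so the date helper works without the package installed.
    from huggingface_hub import snapshot_download

    # Force a fresh download only when the existing copy is known to be stale.
    force = downloaded_on is not None and is_stale(downloaded_on)
    return snapshot_download(repo_id=repo_id, local_dir=local_dir,
                             force_download=force)
```

Calling `fetch_checkpoint` with the date of an earlier download forces a fresh copy only when that date falls before the fix; for the other three models the `downloaded_on` argument can be left out.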

We are happy to share any other model evaluated in the paper upon request. If you are interested in a specific one, please contact us.

VIMA-0.8k	VIMA-8k	VIMA-80k
Google Drive	Google Drive	Google Drive

All object detection models are also available on Hugging Face.