Idefics3 Addition #379

Merged 28 commits on Aug 24, 2024

Commits
d2c7606
VILA added
Jul 25, 2024
9a0e0ba
Merge branch 'main' into main
junming-yang Jul 25, 2024
58b2f24
Update README.md
junming-yang Jul 25, 2024
96783ef
fix merge conflict
amitbcp Jul 25, 2024
004614b
resolve config merge conflict
amitbcp Jul 25, 2024
d3149dc
Fix error on Idefics for longer prompt
amitbcp Jul 25, 2024
ee7de8a
Merge branch 'open-compass:main' into main
amitbcp Jul 27, 2024
30523a4
Fix naming convention to make consistent with Idefics2 and better rea…
amitbcp Jul 27, 2024
595eb5f
Merge branch 'main' of https://github.com/amitbcp/VLMEvalKit into main
amitbcp Jul 27, 2024
3695566
Merge branch 'open-compass:main' into main
amitbcp Jul 27, 2024
072f20c
update config for idefics
amitbcp Jul 27, 2024
63561bb
Make LLava consistent as well
amitbcp Jul 27, 2024
2a8486e
Merge branch 'open-compass:main' into main
amitbcp Jul 29, 2024
2ca4d1d
Add VILA 1.5 3B
amitbcp Jul 29, 2024
da9eedd
Add VILA 1.5 3B
amitbcp Jul 29, 2024
28e33dc
fix naming convention to be similar to the HF models
amitbcp Jul 29, 2024
0c0cee5
update main from open-compass
amitbcp Jul 30, 2024
7f18c91
Multi-Turn added for Phi3-Vision and tested with MMDU
amitbcp Jul 30, 2024
e2a15bc
Merge branch 'open-compass:main' into main
amitbcp Aug 3, 2024
cd433af
Merge branch 'open-compass:main' into main
amitbcp Aug 6, 2024
6bc24fc
Merge branch 'open-compass:main' into main
amitbcp Aug 9, 2024
baaea5e
Add multi turn for Intern VL
amitbcp Aug 11, 2024
61f1df0
fix formatting
amitbcp Aug 11, 2024
75c23df
Add Idefics3 Config
amitbcp Aug 12, 2024
2474225
Warning message to build from source
amitbcp Aug 12, 2024
015d7d3
Merge branch 'open-compass:main' into main
amitbcp Aug 20, 2024
07d8788
Remove conflict from Readme.md
amitbcp Aug 20, 2024
377de7f
Merge branch 'main' into idefics3
kennymckormick Aug 24, 2024
6 changes: 3 additions & 3 deletions README.md
@@ -90,9 +90,9 @@ VLMEvalKit will use a **judge LLM** to extract answer from the output if you set

**Supported PyTorch / HF Models**

| [**IDEFICS-[9B/80B/v2-8B]-Instruct**](https://huggingface.co/HuggingFaceM4/idefics-9b-instruct)🎞️🚅 | [**InstructBLIP-[7B/13B]**](https://github.com/salesforce/LAVIS/blob/main/projects/instructblip/README.md) | [**LLaVA-[v1-7B/v1.5-7B/v1.5-13B]**](https://github.com/haotian-liu/LLaVA) | [**MiniGPT-4-[v1-7B/v1-13B/v2-7B]**](https://github.com/Vision-CAIR/MiniGPT-4) |
| [**IDEFICS-[9B/80B/v2-8B/v3-8B]-Instruct**](https://huggingface.co/HuggingFaceM4/idefics-9b-instruct)🚅🎞️ | [**InstructBLIP-[7B/13B]**](https://github.com/salesforce/LAVIS/blob/main/projects/instructblip/README.md) | [**LLaVA-[v1-7B/v1.5-7B/v1.5-13B]**](https://github.com/haotian-liu/LLaVA) | [**MiniGPT-4-[v1-7B/v1-13B/v2-7B]**](https://github.com/Vision-CAIR/MiniGPT-4) |
| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| [**mPLUG-Owl2**](https://github.com/X-PLUG/mPLUG-Owl/tree/main/mPLUG-Owl2)🎞️ | [**OpenFlamingo-v2**](https://github.com/mlfoundations/open_flamingo)🎞️ | [**PandaGPT-13B**](https://github.com/yxuansu/PandaGPT) | [**Qwen-VL**](https://huggingface.co/Qwen/Qwen-VL)🎞️🚅, [**Qwen-VL-Chat**](https://huggingface.co/Qwen/Qwen-VL-Chat)🎞️**🚅** |
| [**mPLUG-Owl2**](https://github.com/X-PLUG/mPLUG-Owl/tree/main/mPLUG-Owl2)🎞️ | [**OpenFlamingo-v2**](https://github.com/mlfoundations/open_flamingo)🎞️ | [**PandaGPT-13B**](https://github.com/yxuansu/PandaGPT) | [**Qwen-VL**](https://huggingface.co/Qwen/Qwen-VL)🚅🎞️ , [**Qwen-VL-Chat**](https://huggingface.co/Qwen/Qwen-VL-Chat)🚅🎞️ |
| [**VisualGLM-6B**](https://huggingface.co/THUDM/visualglm-6b)🚅 | [**InternLM-XComposer-[1/2]**](https://huggingface.co/internlm/internlm-xcomposer-7b)🚅 | [**ShareGPT4V-[7B/13B]**](https://sharegpt4v.github.io)🚅 | [**TransCore-M**](https://github.com/PCIResearch/TransCore-M) |
| [**LLaVA (XTuner)**](https://huggingface.co/xtuner/llava-internlm-7b)🚅 | [**CogVLM-[Chat/Llama3]**](https://huggingface.co/THUDM/cogvlm-chat-hf)🚅 | [**ShareCaptioner**](https://huggingface.co/spaces/Lin-Chen/Share-Captioner)🚅 | [**CogVLM-Grounding-Generalist**](https://huggingface.co/THUDM/cogvlm-grounding-generalist-hf)🚅 |
| [**Monkey**](https://github.com/Yuliang-Liu/Monkey)🚅, [**Monkey-Chat**](https://github.com/Yuliang-Liu/Monkey)🚅 | [**EMU2-Chat**](https://github.com/baaivision/Emu)🚅🎞️ | [**Yi-VL-[6B/34B]**](https://huggingface.co/01-ai/Yi-VL-6B) | [**MMAlaya**](https://huggingface.co/DataCanvas/MMAlaya)🚅 |
@@ -117,7 +117,7 @@ Note that some VLMs may not be able to run under certain transformer versions, w
- **Please use** `transformers==4.33.0` **for**: `Qwen series`, `Monkey series`, `InternLM-XComposer Series`, `mPLUG-Owl2`, `OpenFlamingo v2`, `IDEFICS series`, `VisualGLM`, `MMAlaya`, `ShareCaptioner`, `MiniGPT-4 series`, `InstructBLIP series`, `PandaGPT`, `VXVERSE`, `GLM-4v-9B`.
- **Please use** `transformers==4.37.0` **for**: `LLaVA series`, `ShareGPT4V series`, `TransCore-M`, `LLaVA (XTuner)`, `CogVLM Series`, `EMU2 Series`, `Yi-VL Series`, `MiniCPM-[V1/V2]`, `OmniLMM-12B`, `DeepSeek-VL series`, `InternVL series`, `Cambrian Series`, `VILA Series`, `Llama-3-MixSenseV1_1`, `Parrot-7B`.
- **Please use** `transformers==4.40.0` **for**: `IDEFICS2`, `Bunny-Llama3`, `MiniCPM-Llama3-V2.5`, `360VL-70B`, `Phi-3-Vision`, `WeMM`.
- **Please use** `transformers==latest` **for**: `LLaVA-Next series`, `PaliGemma-3B`, `Chameleon series`, `Video-LLaVA-7B-HF`, `Ovis series`, `Mantis series`, `MiniCPM-V2.6`, `OmChat-v2.0-13B-sinlge-beta`.
- **Please use** `transformers==latest` **for**: `LLaVA-Next series`, `PaliGemma-3B`, `Chameleon series`, `Video-LLaVA-7B-HF`, `Ovis series`, `Mantis series`, `MiniCPM-V2.6`, `OmChat-v2.0-13B-sinlge-beta`, `Idefics-3`.
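
The version pins above can be checked mechanically before an evaluation run. The following is a minimal sketch, not part of VLMEvalKit: the names `RECOMMENDED_TRANSFORMERS`, `parse_version`, and `check_transformers` are hypothetical, and the table copies only a few entries from the list above, with `None` standing in for "latest".

```python
# Hypothetical lookup table: a few entries copied from the README list above.
# None stands for "use the latest transformers release".
RECOMMENDED_TRANSFORMERS = {
    "IDEFICS series": "4.33.0",
    "LLaVA series": "4.37.0",
    "IDEFICS2": "4.40.0",
    "Idefics-3": None,
}


def parse_version(version):
    """Turn '4.40.0' into (4, 40, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_transformers(model, installed):
    """Return a short message comparing the installed version to the README pin."""
    required = RECOMMENDED_TRANSFORMERS[model]
    if required is None:
        return f"{model}: use the latest transformers release (installed: {installed})"
    if parse_version(installed) == parse_version(required):
        return f"{model}: OK"
    return f"{model}: expected transformers=={required}, found {installed}"


print(check_transformers("IDEFICS2", "4.40.0"))
print(check_transformers("IDEFICS series", "4.40.0"))
```

In practice the `installed` argument could come from `importlib.metadata.version("transformers")`; it is passed in explicitly here so the sketch runs without transformers installed.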

```python
# Demo
```
4 changes: 4 additions & 0 deletions vlmeval/config.py
@@ -162,6 +162,10 @@
    'idefics_9b_instruct': partial(IDEFICS, model_path='HuggingFaceM4/idefics-9b-instruct'),
    'idefics_80b_instruct': partial(IDEFICS, model_path='HuggingFaceM4/idefics-80b-instruct'),
    'idefics2_8b': partial(IDEFICS2, model_path='HuggingFaceM4/idefics2-8b'),

    # Idefics3 follows the Idefics2 pattern
    'Idefics3-8B-Llama3': partial(IDEFICS2, model_path='HuggingFaceM4/Idefics3-8B-Llama3'),

}

instructblip_series = {
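
The config entries in this hunk map a model name to a `functools.partial` that pre-binds `model_path`, so the wrapper is only constructed when the entry is called. A minimal sketch of that pattern follows; `DummyVLM`, `registry`, `build`, and the `max_new_tokens` keyword are hypothetical stand-ins, not VLMEvalKit code.

```python
from functools import partial


class DummyVLM:
    """Hypothetical stand-in for a model wrapper such as IDEFICS2."""

    def __init__(self, model_path, **kwargs):
        self.model_path = model_path
        self.kwargs = kwargs


# Each value is a partial: nothing is constructed until the entry is called.
registry = {
    "idefics2_8b": partial(DummyVLM, model_path="HuggingFaceM4/idefics2-8b"),
    "Idefics3-8B-Llama3": partial(DummyVLM, model_path="HuggingFaceM4/Idefics3-8B-Llama3"),
}


def build(name, **overrides):
    """Look up a registry entry and instantiate it with optional extra kwargs."""
    return registry[name](**overrides)


model = build("Idefics3-8B-Llama3", max_new_tokens=64)
print(model.model_path)
```

Because both entries bind the same class with different paths, supporting a new checkpoint that shares a wrapper only needs one extra dictionary line, which is what the hunk does for Idefics3.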
3 changes: 3 additions & 0 deletions vlmeval/vlm/idefics.py
@@ -64,6 +64,9 @@ class IDEFICS2(BaseModel):
    def __init__(self, model_path='HuggingFaceM4/idefics2-8b', **kwargs):
        assert model_path is not None
        self.model_path = model_path
        # Compare against a lowercase literal: the path is lowercased first.
        if 'idefics3' in self.model_path.lower():
            warnings.warn('Install transformers from source: PR https://github.com/open-compass/VLMEvalKit/pull/379')
            warnings.warn('Reference: https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3')
        self.processor = AutoProcessor.from_pretrained(model_path)
        self.model = AutoModelForVision2Seq.from_pretrained(
            model_path,
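
A substring check on a lowercased path only matches when the literal is lowercase too: a mixed-case literal such as `'Idefics3'` can never occur in the result of `.lower()`. The sketch below illustrates a case-insensitive check under that assumption; `is_idefics3` and `warn_if_idefics3` are hypothetical helper names, and the warning text is illustrative only.

```python
import warnings


def is_idefics3(model_path):
    """Case-insensitive test: lowercase both the path and the literal."""
    return "idefics3" in model_path.lower()


def warn_if_idefics3(model_path):
    """Emit a warning for Idefics3 checkpoints and report whether one was emitted."""
    if is_idefics3(model_path):
        warnings.warn("Idefics3 may need transformers installed from source")
        return True
    return False


# A mixed-case literal never matches a lowercased string.
mixed_case_hit = "Idefics3" in "HuggingFaceM4/Idefics3-8B-Llama3".lower()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned = warn_if_idefics3("HuggingFaceM4/Idefics3-8B-Llama3")
```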