
[Model] Refactor MiniCPMV #7020

Merged — DarkLight1337 merged 37 commits into vllm-project:main from jeejeelee:refactor-minicpmv on Aug 4, 2024

Conversation

@jeejeelee (Contributor)

I have completed the following modifications:

  • Separated the different versions of MiniCPMV into their own model classes, to make it easier to support future features like LoRA and BNB (a sketch of the resulting structure follows at the end of this comment).
  • Ported Idefics2VisionTransformer.

ping @ywang96 @DarkLight1337 @HwwwwwwwH
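
For context, the refactor follows a per-version class layout along these lines (a minimal sketch under assumed names; the actual vLLM code differs in detail):

```python
# Sketch: one shared base class plus per-version subclasses, selected
# through a version registry. Names here are illustrative assumptions.
from typing import Dict, Tuple, Type

class MiniCPMVBaseModel:
    """Logic shared by all versions (input mapping, resampler glue, etc.)."""

class MiniCPMV2_0(MiniCPMVBaseModel):
    """MiniCPM-V 2.0 specifics (its own vision encoder wiring)."""

class MiniCPMV2_5(MiniCPMVBaseModel):
    """MiniCPM-V 2.5 specifics; a dedicated class makes it easier to
    add per-version features such as LoRA or BNB quantization later."""

_SUPPORTED_VERSIONS: Dict[Tuple[int, int], Type[MiniCPMVBaseModel]] = {
    (2, 0): MiniCPMV2_0,
    (2, 5): MiniCPMV2_5,
}

def get_minicpmv_class(version: Tuple[int, int]) -> Type[MiniCPMVBaseModel]:
    """Pick the implementation class for a given (major, minor) version."""
    if version not in _SUPPORTED_VERSIONS:
        raise ValueError(f"Unsupported MiniCPM-V version: {version}")
    return _SUPPORTED_VERSIONS[version]
```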

github-actions bot commented Aug 1, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which consists of a small, essential subset of CI tests to catch errors quickly. You can run additional CI tests on top of the default ones by unblocking the steps in your fastcheck build in the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI, as it is required for merging (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add the ready label to the PR
  • Enable auto-merge.

🚀

@ywang96 (Member) commented Aug 1, 2024

Hey @jeejeelee, thanks for the contribution! Were you able to verify that the model works at TP=1, 2, 4, and 8? (I will verify this myself later, but I was curious whether this PR is ready for testing.)

Also, I'm curious whether you have seen any significant speedup from sharding the ViT.

@jeejeelee (Contributor Author) commented Aug 1, 2024

> Hey @jeejeelee, thanks for the contribution! Were you able to verify that the model works at TP=1, 2, 4, and 8? (I will verify this myself later, but I was curious whether this PR is ready for testing.)
>
> Also, I'm curious whether you have seen any significant speedup from sharding the ViT.

Just verified on TP=1, 2, and 4. I currently don't have an 8-GPU machine available, so I haven't verified TP=8 yet.
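
(For anyone reproducing this, a quick TP smoke test looks roughly like the following; the checkpoint name and prompt are assumptions, not part of this PR:)

```python
# Rough smoke test: load the model at a given tensor_parallel_size and
# check that generation is coherent. Repeat with 1, 2, 4 (and 8 if you
# have the GPUs). Checkpoint ID and prompt are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="openbmb/MiniCPM-Llama3-V-2_5",  # assumed checkpoint
    trust_remote_code=True,
    tensor_parallel_size=4,
)
outputs = llm.generate(["Describe what MiniCPM-V is."],
                       SamplingParams(temperature=0.0, max_tokens=64))
print(outputs[0].outputs[0].text)
```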

@DarkLight1337 (Member) commented Aug 1, 2024

This will cause significant merge conflicts with #6995. Would it be better for you to incorporate my changes into your PR instead of releasing them separately?

@ywang96 (Member) commented Aug 1, 2024

> This will cause significant merge conflicts with #6995. Would it be better for you to incorporate my changes into your PR instead of releasing them separately?

I agree with this too.

@jeejeelee (Contributor Author)

> This will cause significant merge conflicts with #6995. Would it be better for you to incorporate my changes into your PR instead of releasing them separately?

OK, I will incorporate your changes ASAP.

@jeejeelee (Contributor Author)

> Also, I'm curious whether you have seen any significant speedup from sharding the ViT.

@ywang96 I have not yet measured the speedup from TP, but I will share test results once they are available.

@jeejeelee (Contributor Author)

@DarkLight1337 It seems #6995 has a bug. After incorporating your changes, the generated results have become poor.

@DarkLight1337 (Member)

Try reverting the lines where get_2d_sincos_pos_embed is called.
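
(Background for readers: get_2d_sincos_pos_embed builds the fixed 2D sine-cosine position embeddings used by the vision encoder. A minimal NumPy sketch of the standard formulation, not the exact snippet under discussion:)

```python
# Standard 2D sine-cosine position embedding (MAE-style); a sketch of
# the formulation only, not the vLLM implementation being debugged.
import numpy as np

def get_1d_sincos_pos_embed(embed_dim: int, pos: np.ndarray) -> np.ndarray:
    # Each position is encoded at embed_dim // 2 frequencies, giving
    # embed_dim channels after the sin/cos concatenation.
    omega = 1.0 / 10000 ** (np.arange(embed_dim // 2) / (embed_dim / 2))
    out = np.outer(pos.reshape(-1), omega)                     # (M, D/2)
    return np.concatenate([np.sin(out), np.cos(out)], axis=1)  # (M, D)

def get_2d_sincos_pos_embed(embed_dim: int, grid_size: int) -> np.ndarray:
    grid_h, grid_w = np.meshgrid(np.arange(grid_size, dtype=np.float32),
                                 np.arange(grid_size, dtype=np.float32),
                                 indexing="ij")
    # Half of the channels encode the row index, half the column index.
    emb_h = get_1d_sincos_pos_embed(embed_dim // 2, grid_h)
    emb_w = get_1d_sincos_pos_embed(embed_dim // 2, grid_w)
    return np.concatenate([emb_h, emb_w], axis=1)  # (grid_size**2, D)
```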

@jeejeelee (Contributor Author)

> Try reverting the lines where get_2d_sincos_pos_embed is called.

I've figured it out: this snippet had a bug, and I have fixed it.

@DarkLight1337 (Member) commented Aug 1, 2024

> > Try reverting the lines where get_2d_sincos_pos_embed is called.
>
> I've figured it out: this snippet had a bug, and I have fixed it.

My bad, I forgot to rename the intermediate variables. Thanks for fixing this!

To avoid confusion, I have closed the other PR.

@jeejeelee (Contributor Author)

> > > Try reverting the lines where get_2d_sincos_pos_embed is called.
> >
> > I've figured it out: this snippet had a bug, and I have fixed it.
>
> My bad, I forgot to rename the intermediate variables. Thanks for fixing this!
>
> To avoid confusion, I have closed the other PR.

Thanks! Could you help review my implementation? I'd like to complete this PR ASAP; my ultimate goal is to make MiniCPM-V 2.5 support LoRA.

@HwwwwwwwH (Contributor)

Sorry for the late reply. I really appreciate your contribution! I'll check these modifications.

@HwwwwwwwH (Contributor)

I think it's truly great to have the MiniCPMV versions separated. I'm pulling the code from this PR and running some evaluations.

DarkLight1337 mentioned this pull request Aug 1, 2024
(Three review threads on vllm/model_executor/models/minicpmv.py — all resolved.)
@jeejeelee (Contributor Author)

@DarkLight1337 After pulling in the latest changes, I'm still encountering errors:

[rank0]:   File "/mypath/vllm/vllm/model_executor/models/minicpmv.py", line 588, in _parse_and_validate_inputs
[rank0]:     raise ValueError(f"Inconsistent flattened lengths, found: {lens}")
[rank0]: ValueError: Inconsistent flattened lengths, found: [0, 16, 16]

@jeejeelee (Contributor Author)

@DarkLight1337 @HwwwwwwwH It's getting late here, so I'll log off for now. Thank you for all your hard work!

@DarkLight1337 (Member)

> @DarkLight1337 @HwwwwwwwH It's getting late here, so I'll log off for now. Thank you for all your hard work!

Sure, sorry for interfering with your own testing...

@DarkLight1337 (Member)

> @DarkLight1337 After pulling in the latest changes, I'm still encountering errors:
>
> [rank0]:   File "/mypath/vllm/vllm/model_executor/models/minicpmv.py", line 588, in _parse_and_validate_inputs
> [rank0]:     raise ValueError(f"Inconsistent flattened lengths, found: {lens}")
> [rank0]: ValueError: Inconsistent flattened lengths, found: [0, 16, 16]

After some offline discussion with @HwwwwwwwH, it turns out the dummy data provides an image without any image tokens. I have updated the validation to allow this for now; we will revisit dummy data generation in a later PR.
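
(For readers following along, the relaxed check amounts to tolerating empty entries when comparing the flattened per-image input lists; a rough sketch with hypothetical names, not the actual vLLM code:)

```python
# Sketch of the relaxed consistency check: all non-empty flattened
# multimodal input lists must share one length, but an empty list (as
# produced by the dummy profiling data, which supplies an image without
# image tokens) is now tolerated. Function name is hypothetical.
from typing import Sequence

def validate_flattened_lengths(*inputs: Sequence) -> None:
    lens = [len(x) for x in inputs]
    if len(set(lens) - {0}) > 1:
        raise ValueError(f"Inconsistent flattened lengths, found: {lens}")

validate_flattened_lengths([], range(16), range(16))  # now passes
```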

@HwwwwwwwH (Contributor)

> It's getting late here, so I'll log off for now. Thank you for all your hard work!

Thank you too. Good night!

@HwwwwwwwH (Contributor)

@jeejeelee Hi there. Are you able to run the model correctly now? I've run into some problems.

@jeejeelee (Contributor Author)

> @jeejeelee Hi there. Are you able to run the model correctly now? I've run into some problems.

Me too.

@HwwwwwwwH (Contributor)

@DarkLight1337 Thank you for your hard work. It works fine for me now.

DarkLight1337 enabled auto-merge (squash) on August 4, 2024 at 07:24
DarkLight1337 merged commit 179a6a3 into vllm-project:main on Aug 4, 2024
66 of 67 checks passed
jeejeelee deleted the refactor-minicpmv branch on August 4, 2024 at 23:35
@ywang96 (Member) commented Aug 6, 2024

Hello @jeejeelee! Just a follow-up question: are you interested in implementing Idefics3 eventually? (It's not available in transformers yet, since the PR for this model is still WIP.)

@jeejeelee (Contributor Author)

> Hello @jeejeelee! Just a follow-up question: are you interested in implementing Idefics3 eventually? (It's not available in transformers yet, since the PR for this model is still WIP.)

I'd be happy to implement this, but I might not be able to start working on it until next week.

@ywang96 (Member) commented Aug 7, 2024

> > Hello @jeejeelee! Just a follow-up question: are you interested in implementing Idefics3 eventually? (It's not available in transformers yet, since the PR for this model is still WIP.)
>
> I'd be happy to implement this, but I might not be able to start working on it until next week.

@jeejeelee No rush at all, and thank you for your interest. Most likely we'll have to wait for the transformers PR to get into better shape so we can verify model correctness anyway.

sfc-gh-mkeralapura pushed a commit to sfc-gh-mkeralapura/vllm that referenced this pull request Aug 12, 2024
kylesayrs pushed a commit to neuralmagic/vllm that referenced this pull request Aug 17, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Alvant <[email protected]>
jeejeelee mentioned this pull request Oct 28, 2024
Labels: ready
4 participants