
Add falcon gguf #33437

Merged
merged 9 commits into huggingface:main from add-falcon-gguf on Oct 2, 2024

Conversation

@g-prz (Contributor) commented Sep 11, 2024

What does this PR do?

This PR adds GGUF for Falcon model

Contributes to #33260
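At its core, GGUF support for a new architecture means mapping GGUF tensor names onto the model's transformers state-dict keys (the work done in modeling_gguf_pytorch_utils.py). A minimal sketch of that idea, with a hypothetical mapping table that is not the exact one used in this PR:

```python
# Sketch of GGUF -> transformers tensor-name remapping.
# The mapping below is illustrative, not the actual table from this PR.
GGUF_TO_HF = {
    "token_embd": "transformer.word_embeddings",
    "blk": "transformer.h",
    "attn_norm": "ln_attn",
    "attn_qkv": "self_attention.query_key_value",
    "ffn_up": "mlp.dense_h_to_4h",
    "output_norm": "transformer.ln_f",
}

def remap_name(gguf_name: str) -> str:
    """Translate one GGUF tensor name into a transformers-style key."""
    parts = gguf_name.split(".")
    return ".".join(GGUF_TO_HF.get(p, p) for p in parts)

print(remap_name("blk.0.attn_norm.weight"))
# -> transformer.h.0.ln_attn.weight
```

Segments without a mapping entry (layer indices, "weight"/"bias" suffixes) pass through unchanged, which keeps the table small.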

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@g-prz g-prz force-pushed the add-falcon-gguf branch 3 times, most recently from 249b730 to aeca29c Compare September 17, 2024 09:24
@g-prz (Contributor, Author) commented Sep 18, 2024

Hey @SunMarc 🙋‍♂️
I am getting more confident that this PR is ready for review.
One last concern: I haven't checked the test for Falcon 40B, as it is way too big to run locally, so I copied the expected result from the 7B model. Is there any way to test the 40B model directly?

The CI also fails, but the failures seem unrelated to my changes (mostly due to accelerate?).

@g-prz g-prz marked this pull request as ready for review September 19, 2024 07:35
@g-prz (Contributor, Author) commented Sep 30, 2024

Hey @SunMarc 😁
I rebased this PR.
I still have a question about the expected output of Falcon 40B in the new test, as I don't have enough RAM to try it out.

@SunMarc (Member) commented Sep 30, 2024

> Hey @SunMarc 😁
> I rebased this PR
> I still have this question mark regarding the output of falcon 40b in the new test as I don't have enough RAM to try it out.

You don't need to add the 40B test; the model would be too big for our CI.

Can you add a test checking that the fp16 GGUF model has the same weights and expected output as the fp16 transformers model? You can use the bloom GGUF tests as a reference.
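The comparison SunMarc asks for amounts to: load both checkpoints and verify each tensor matches elementwise within a small tolerance. A stdlib-only sketch of the comparison step, with plain lists standing in for tensors and illustrative tolerances (not the actual assertions in test_ggml.py):

```python
import math

def state_dicts_match(reference, converted, rel_tol=1e-3, abs_tol=1e-5):
    """Compare two flat {name: list-of-floats} state dicts elementwise.

    In the real test the values would come from the fp16 transformers
    checkpoint and the fp16 GGUF one; plain Python lists stand in for
    tensors here.
    """
    if reference.keys() != converted.keys():
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for name in reference
        for a, b in zip(reference[name], converted[name])
    )

ref = {"ln_attn.weight": [1.0, 0.5], "ln_attn.bias": [0.0, -0.25]}
conv = {"ln_attn.weight": [1.0, 0.5], "ln_attn.bias": [0.0, -0.25]}
print(state_dicts_match(ref, conv))  # -> True
```

Checking the key sets first catches tensors that were dropped or misnamed during conversion before any elementwise work is done.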

@SunMarc (Member) left a comment

Thanks for the addition! Just a few nits.

tests/quantization/ggml/test_ggml.py (resolved)
src/transformers/modeling_gguf_pytorch_utils.py (outdated, resolved)
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@LysandreJik (Member) left a comment

Awesome, thank you!

@LysandreJik LysandreJik merged commit fe48472 into huggingface:main Oct 2, 2024
24 checks passed
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request Oct 21, 2024
* feat(gguf): add falcon q2 k

* fix(gguf): remove useless renaming

* feat(gguf): separate falcon 7b and 40b

* feat(gguf): apply fixup

* fix(test): error rebase

* feat(gguf): add fp16 weight comparison for falcon

* feat(gguf): test weight of all layers

* test(gguf): add falcon 40b under skip decorator

* feat(gguf): quick example for extracting model size
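The last commit's "quick example for extracting model size" boils down to summing element counts over every tensor's shape and multiplying by bytes per element. A sketch of that arithmetic; the dtype byte costs and shapes below are illustrative, not taken from the PR:

```python
import math

# Illustrative bytes-per-element; real GGUF quantization formats
# (e.g. q2_k from the first commit) use fractional effective sizes.
DTYPE_BYTES = {"fp32": 4, "fp16": 2, "q8_0": 1}

def model_size_bytes(tensor_shapes, dtype="fp16"):
    """Rough model size: total element count times bytes per element."""
    total_elems = sum(math.prod(shape) for shape in tensor_shapes.values())
    return total_elems * DTYPE_BYTES[dtype]

# A Falcon-7B-ish embedding plus one norm, purely for illustration.
shapes = {"wte.weight": (65024, 4544), "h.0.ln_attn.weight": (4544,)}
print(model_size_bytes(shapes, "fp16") / 1e6, "MB")
```

For quantized formats the per-element cost is not a whole number of bytes (blocks carry scales alongside the packed values), so a real calculation works block by block rather than from a flat table like this one.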
4 participants