
Add falcon gguf #33437

Merged
merged 9 commits into huggingface:main from add-falcon-gguf on Oct 2, 2024

Conversation

@g-prz (Contributor) commented Sep 11, 2024

What does this PR do?

This PR adds GGUF for Falcon model

Contributes to #33260
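At its core, GGUF support for a new architecture means mapping GGUF tensor names onto the model's transformers state-dict keys (the work done in modeling_gguf_pytorch_utils.py). A minimal sketch of that idea, with a hypothetical mapping table that is not the exact one used in this PR:

```python
# Sketch of GGUF -> transformers tensor-name remapping.
# The mapping below is illustrative, not the actual table from this PR.
GGUF_TO_HF = {
    "token_embd": "transformer.word_embeddings",
    "blk": "transformer.h",
    "attn_norm": "ln_attn",
    "attn_qkv": "self_attention.query_key_value",
    "ffn_up": "mlp.dense_h_to_4h",
    "output_norm": "transformer.ln_f",
}

def remap_name(gguf_name: str) -> str:
    """Translate one GGUF tensor name into a transformers-style key."""
    parts = gguf_name.split(".")
    return ".".join(GGUF_TO_HF.get(p, p) for p in parts)

print(remap_name("blk.0.attn_norm.weight"))
# -> transformer.h.0.ln_attn.weight
```

Segments without a mapping entry (layer indices, "weight"/"bias" suffixes) pass through unchanged, which keeps the table small.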

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@g-prz g-prz force-pushed the add-falcon-gguf branch 3 times, most recently from 249b730 to aeca29c Compare September 17, 2024 09:24
@g-prz (Contributor, Author) commented Sep 18, 2024

Hey @SunMarc 🙋‍♂️
I am getting more confident that this PR is ready for review.
One last concern: I haven't checked the test for Falcon 40B, as it is way too big to run locally, so I copied the expected result from the 7B model. Is there any way to test the 40B model directly?

The CI also fails, but the failures seem unrelated to my changes (mostly due to accelerate?).

@g-prz g-prz marked this pull request as ready for review September 19, 2024 07:35
@g-prz (Contributor, Author) commented Sep 30, 2024

Hey @SunMarc 😁
I rebased this PR.
I still have a question about the expected output of Falcon 40B in the new test, as I don't have enough RAM to try it out.

@SunMarc (Member) commented Sep 30, 2024

> Hey @SunMarc 😁
> I rebased this PR
> I still have this question mark regarding the output of falcon 40b in the new test as I don't have enough RAM to try it out.

You don't need to add the 40B test; the model would be too big for our CI.

Can you add a test checking that the fp16 GGUF model has the same weights and expected output as the fp16 transformers model? You can use the bloom GGUF tests as a reference.
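The comparison SunMarc asks for amounts to: load both checkpoints and verify each tensor matches elementwise within a small tolerance. A stdlib-only sketch of the comparison step, with plain lists standing in for tensors and illustrative tolerances (not the actual assertions in test_ggml.py):

```python
import math

def state_dicts_match(reference, converted, rel_tol=1e-3, abs_tol=1e-5):
    """Compare two flat {name: list-of-floats} state dicts elementwise.

    In the real test the values would come from the fp16 transformers
    checkpoint and the fp16 GGUF one; plain Python lists stand in for
    tensors here.
    """
    if reference.keys() != converted.keys():
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for name in reference
        for a, b in zip(reference[name], converted[name])
    )

ref = {"ln_attn.weight": [1.0, 0.5], "ln_attn.bias": [0.0, -0.25]}
conv = {"ln_attn.weight": [1.0, 0.5], "ln_attn.bias": [0.0, -0.25]}
print(state_dicts_match(ref, conv))  # -> True
```

Checking the key sets first catches tensors that were dropped or misnamed during conversion before any elementwise work is done.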

@SunMarc (Member) left a comment

Thanks for the addition! Just a few nits.

tests/quantization/ggml/test_ggml.py (resolved)
src/transformers/modeling_gguf_pytorch_utils.py (outdated, resolved)
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@LysandreJik (Member) left a comment

Awesome, thank you!

@LysandreJik LysandreJik merged commit fe48472 into huggingface:main Oct 2, 2024
24 checks passed
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request Oct 21, 2024
* feat(gguf): add falcon q2 k

* fix(gguf): remove useless renaming

* feat(gguf): separate falcon 7b and 40b

* feat(gguf): apply fixup

* fix(test): error rebase

* feat(gguf): add fp16 weight comparison for falcon

* feat(gguf): test weight of all layers

* test(gguf): add falcon 40b under skip decorator

* feat(gguf): quick example for extracting model size
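The last commit's "quick example for extracting model size" boils down to summing element counts over every tensor's shape and multiplying by bytes per element. A sketch of that arithmetic; the dtype byte costs and shapes below are illustrative, not taken from the PR:

```python
import math

# Illustrative bytes-per-element; real GGUF quantization formats
# (e.g. q2_k from the first commit) use fractional effective sizes.
DTYPE_BYTES = {"fp32": 4, "fp16": 2, "q8_0": 1}

def model_size_bytes(tensor_shapes, dtype="fp16"):
    """Rough model size: total element count times bytes per element."""
    total_elems = sum(math.prod(shape) for shape in tensor_shapes.values())
    return total_elems * DTYPE_BYTES[dtype]

# A Falcon-7B-ish embedding plus one norm, purely for illustration.
shapes = {"wte.weight": (65024, 4544), "h.0.ln_attn.weight": (4544,)}
print(model_size_bytes(shapes, "fp16") / 1e6, "MB")
```

For quantized formats the per-element cost is not a whole number of bytes (blocks carry scales alongside the packed values), so a real calculation works block by block rather than from a flat table like this one.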
4 participants