[Good First Issue]: Create a GGUF reader #1665

Open
AlexKoff88 opened this issue Feb 3, 2025 · 2 comments
Labels
good first issue Good for newcomers

Comments

@AlexKoff88
Collaborator

The idea is to provide functionality that reads the GGUF format and creates an OpenVINO GenAI compatible representation that can be used to instantiate an LLMPipeline() from it.
This task includes:

  • The initial scope can include support of llama-based LLMs (e.g. Llama-3.2 and SmolLM) and FP16, Q8_0, Q4_0, Q4_1 models.
  • All the code should be written in C++.
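For orientation, every GGUF file starts with a fixed little-endian header: the magic bytes "GGUF", a format version (uint32), a tensor count (uint64), and a metadata key/value count (uint64). Below is a minimal C++ sketch of reading and validating that header; the struct and function names are hypothetical, not part of any existing OpenVINO API:

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>

// Hypothetical struct; field order follows the GGUF spec (little-endian on disk).
struct GGUFHeader {
    uint32_t version;            // GGUF format version (3 at the time of writing)
    uint64_t tensor_count;       // number of tensor infos that follow the metadata
    uint64_t metadata_kv_count;  // number of metadata key/value pairs
};

// Reads and validates the fixed-size GGUF header (assumes a little-endian host).
GGUFHeader read_gguf_header(std::ifstream& file) {
    uint32_t magic = 0;
    file.read(reinterpret_cast<char*>(&magic), sizeof(magic));
    if (magic != 0x46554747u) {  // the bytes "GGUF" read as a little-endian uint32
        throw std::runtime_error("not a GGUF file");
    }
    GGUFHeader h{};
    file.read(reinterpret_cast<char*>(&h.version), sizeof(h.version));
    file.read(reinterpret_cast<char*>(&h.tensor_count), sizeof(h.tensor_count));
    file.read(reinterpret_cast<char*>(&h.metadata_kv_count), sizeof(h.metadata_kv_count));
    return h;
}
```

The metadata key/value pairs (model architecture, hyperparameters, tokenizer vocabulary, etc.) and the tensor infos follow immediately after this header, so a full reader would continue parsing from there.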

@AlexKoff88 AlexKoff88 converted this from a draft issue Feb 3, 2025
@ilya-lavrenov ilya-lavrenov added the good first issue Good for newcomers label Feb 3, 2025
@ilya-lavrenov ilya-lavrenov changed the title from "Create a GGUF reader" to "[Good First Issue]: Create a GGUF reader" Feb 3, 2025
@Geeks-Sid

Can this be broken down into smaller, more concrete tasks? That would let contributors pick them off one by one and build the feature up gradually instead of all at once.

@AlexKoff88
Collaborator Author

AlexKoff88 commented Feb 4, 2025

It certainly can, but the way I see it, these tasks should be executed sequentially. For example:

  • One can start by enabling Llama-3.2-1B in FP16.
  • Next, parse and convert the tokenizer from GGUF format to OpenVINO (tokenizer/detokenizer models). After that, the core functionality will be in place.
  • Then, a few tasks can be executed in parallel:
    • Enable Q8_0 llama (see the dequantization sketch after this list)
    • Enable Q4_0 and Q4_1 llama
    • Enable and verify other llama-based models such as Llama-3.1-8B and SmolLM
    • Enable the most popular quantization schemes such as Q4_K_M
    • Enable the Qwen model family

...
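To make the quantization tasks above concrete: in GGUF/ggml, Q8_0 stores weights in blocks of 32 int8 values with one FP16 scale per block, and Q4_0 packs two 4-bit values per byte with an implicit offset of 8. Here is a minimal C++ dequantization sketch for both; the block layouts follow the ggml source, but the struct and function names are hypothetical, and a little-endian host is assumed:

```cpp
#include <cstdint>
#include <cstring>

// Minimal IEEE 754 half -> float conversion (GGUF stores block scales as FP16).
static float fp16_to_fp32(uint16_t h) {
    uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0x1Fu) {                 // inf / NaN
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {              // normal number
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {             // signed zero
        bits = sign;
    } else {                            // subnormal: renormalize the mantissa
        int e = -1;
        do { ++e; mant <<= 1; } while ((mant & 0x400u) == 0);
        bits = sign | ((112u - e) << 23) | ((mant & 0x3FFu) << 13);
    }
    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}

// Q8_0: blocks of 32 weights, one FP16 scale + 32 int8 quants; x[i] = d * qs[i].
constexpr int QK8_0 = 32;
struct BlockQ8_0 {
    uint16_t d;          // scale (FP16)
    int8_t   qs[QK8_0];  // quantized weights
};

void dequantize_q8_0(const BlockQ8_0& b, float* out) {
    const float d = fp16_to_fp32(b.d);
    for (int i = 0; i < QK8_0; ++i) {
        out[i] = d * static_cast<float>(b.qs[i]);
    }
}

// Q4_0: blocks of 32 weights, one FP16 scale + 16 bytes of packed 4-bit quants
// with an implicit offset of 8; x[i] = d * (q[i] - 8).
constexpr int QK4_0 = 32;
struct BlockQ4_0 {
    uint16_t d;              // scale (FP16)
    uint8_t  qs[QK4_0 / 2];  // two 4-bit quants per byte
};

void dequantize_q4_0(const BlockQ4_0& b, float* out) {
    const float d = fp16_to_fp32(b.d);
    for (int i = 0; i < QK4_0 / 2; ++i) {
        // ggml packs the first 16 weights in the low nibbles, the last 16 in the high ones
        out[i]             = d * static_cast<float>((b.qs[i] & 0x0F) - 8);
        out[i + QK4_0 / 2] = d * static_cast<float>((b.qs[i] >> 4) - 8);
    }
}
```

Q4_1 is structurally similar to Q4_0 but adds a second FP16 value per block (a minimum m), dequantizing as x[i] = d * q[i] + m.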

Projects
Status: Contributors Needed
Development

No branches or pull requests

3 participants