[Feature]: Configuration of CPU+GPU offload #355

Open
Aisuko opened this issue Aug 9, 2024 · 0 comments

Contact Details (optional)

No response

What feature are you requesting?

Currently, we do not compile llama.cpp with CUDA acceleration. To support the offload feature, we need to compile llama.cpp with GPU support enabled.

https://github.com/SkywardAI/llama.cpp/blob/a59f8fdc85e1119d470d8766e29617962549d993/examples/main/README.md?plain=1#L72

The configuration should answer one question: how many layers of the model do you want to run on the GPU?
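To make the request concrete, here is a minimal sketch of how a configured layer count could be turned into llama.cpp command-line arguments. It assumes a GPU-enabled build and uses llama.cpp's `--n-gpu-layers` option; the helper name `build_llama_args`, the `llama-cli` default binary name, and the model path are illustrative assumptions, not part of this project.

```python
from typing import List, Optional


def build_llama_args(
    model_path: str,
    n_gpu_layers: Optional[int] = None,
    binary: str = "llama-cli",  # assumed binary name; adjust to the actual build
) -> List[str]:
    """Build an argument list for a llama.cpp invocation (hypothetical helper).

    n_gpu_layers is the number of model layers to offload to the GPU.
    None leaves the option out, so llama.cpp uses its own default.
    """
    args = [binary, "-m", model_path]
    if n_gpu_layers is not None:
        if n_gpu_layers < 0:
            raise ValueError("n_gpu_layers must be >= 0")
        # --n-gpu-layers only has an effect when llama.cpp is built with GPU support.
        args += ["--n-gpu-layers", str(n_gpu_layers)]
    return args


if __name__ == "__main__":
    # Example: offload 20 layers of a hypothetical model file.
    print(" ".join(build_llama_args("models/example.gguf", n_gpu_layers=20)))
```

Keeping the layer count optional lets a CPU-only deployment omit the setting entirely, while a GPU deployment can tune it to the available VRAM.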
