failed to load model from /mnt/models/model.file when trying to run granite model #691
Comments
That initial Hugging Face link seems to be pointing to a full model directory and not to a GGUF file directly; try pointing at a quantized GGUF or an Ollama model instead.
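For example (a sketch, not verified output; the GGUF repo and file names below are assumptions, and the `huggingface://` / `ollama://` schemes are as documented for recent RamaLama versions):

```shell
# Pointing at the full model repo fails: it is a directory of safetensors
# shards, not a single GGUF file that llama.cpp can load.
ramalama run huggingface://ibm-granite/granite-3.1-8b-instruct

# Pointing at a single quantized GGUF file (hypothetical repo/file names),
# or at an Ollama-packaged model, should work:
ramalama run huggingface://ibm-granite/granite-3.1-8b-instruct-GGUF/granite-3.1-8b-instruct-Q4_K_M.gguf
ramalama run ollama://granite3.1-dense:8b
```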
Ah, this is my own ignorance about running models. I was just clicking around on the IBM Granite docs and found my way to their model page (https://huggingface.co/ibm-granite/granite-3.1-8b-instruct). I thought it was in a format that could be run directly. FWIW, the suggestion of pointing to the GGUF directly worked for me.
I had this same experience today, both with this Granite model and the new neuralmagic models. I could use some sort of explainer on what's expected to work. Perhaps that's out of scope for RamaLama, but it'd help make working with AI more boring ;)
@jasonbrooks Could you please share the neuralmagic models you used so we can test as well?
I think RamaLama should support those models, but I believe llama.cpp cannot handle them, so we would need to switch the runtime to vLLM.
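A sketch of what that switch might look like, assuming the `--runtime` option on a current RamaLama build (values `llama.cpp` and `vllm`; check `ramalama --help` on your version):

```shell
# The default runtime is llama.cpp, which only loads GGUF files:
ramalama run granite

# Switching the runtime to vLLM should allow serving safetensors
# model directories that llama.cpp cannot load:
ramalama --runtime vllm run huggingface://ibm-granite/granite-3.1-8b-instruct
```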
Using a ThinkPad T14s Gen 2i with Fedora 41; installed v0.5.4 via `pip install ramalama`.

When trying to run a Granite model, I got the error `failed to load model from /mnt/models/model.file`. See details below.

Repeated `run` with `--debug`:
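(Presumably something along these lines; the model name is a placeholder and the `--debug` placement assumes it is a global flag:)

```shell
# Re-run with debug logging to surface the underlying podman command:
ramalama --debug run granite
```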
See the details of the raw `podman run` command with debug logging:

It looks like the source of the bind mount is a symbolic link. Could that be tripping things up? Maybe?
If I swap out the symbolic link for the realpath, the `podman run` command is a bit more successful.
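For reference, a sketch of that manual workaround (the store path, image, and command below are illustrative, not RamaLama's exact layout):

```shell
# The file RamaLama bind-mounts may be a symlink in its model store
# (illustrative path; yours may differ):
ls -l ~/.local/share/ramalama/models/huggingface/model.file

# Resolve the link and mount the real target instead, so the path inside
# the container is a regular file rather than a dangling symlink:
MODEL=$(realpath ~/.local/share/ramalama/models/huggingface/model.file)
podman run --rm -v "$MODEL:/mnt/models/model.file:ro" \
  quay.io/ramalama/ramalama llama-run /mnt/models/model.file  # image/command illustrative
```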