LocalAI version:
LocalAI v4.1.3 (fdc9f7b)
Environment, CPU architecture, OS, and Version:
Linux localai 5.15.0-173-generic #183~20.04.1-Ubuntu SMP Fri Mar 13 14:51:03 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux
Describe the bug
Description
I configured an Embedding Model in the Web system settings.
Following the official documentation, I downloaded and configured the following model:
Qwen3-Embedding-8B-Q4_K_M.gguf

My configuration is as follows:
```yaml
backend: llama-cpp
embeddings: true
name: granite-embedding-107m-multilingual
options:
- use_jinja: true
parameters:
  min_p: 0
  model: Qwen3-Embedding-8B-Q4_K_M.gguf
  repeat_penalty: 1
  temperature: 0.6
  top_k: 20
  top_p: 0.95
template:
  use_tokenizer_template: true
```
When using Memory RAG to upload a PDF file, the POST /embedding request fails with the error:
`model granite-embedding-107m-multilingual does not exist`
After checking the source code, it seems that LocalAI requires a default embedding model to be configured at container startup. Otherwise, it always falls back to granite-embedding-107m-multilingual.
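As a point of comparison, a minimal sketch of calling the OpenAI-compatible embeddings endpoint while naming the model explicitly in the request body, rather than relying on the server-side default (the base URL, port, and endpoint path are assumptions based on a typical LocalAI deployment):

```python
import json
from urllib import request

# Assumed LocalAI base URL; adjust to your deployment.
BASE_URL = "http://localhost:8080"

# OpenAI-compatible embeddings payload: naming the model explicitly here
# avoids depending on whatever default the server falls back to.
payload = {
    "model": "granite-embedding-107m-multilingual",  # the `name:` from the YAML config
    "input": "A short test sentence.",
}

req = request.Request(
    f"{BASE_URL}/v1/embeddings",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(json.dumps(payload))
# To actually send it: request.urlopen(req) -- requires a running LocalAI instance.
```

Passing the model name per request works for direct API calls, but the Memory RAG upload path issues the embedding request internally, so there is no obvious place to override the model name there.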
Problem
I couldn't find any clear way to set a custom default embedding model (unlike Open WebUI, where you can easily specify both LLM and embedding model for RAG).
I want to use a custom LLM + custom embedding model for RAG, but it feels unnecessarily difficult in LocalAI.
Expected Behavior
LocalAI should allow flexible specification of the embedding model name — whether through the name field in the YAML config, environment variables, Web UI settings, or startup parameters — without being forced to use or fall back to granite-embedding-107m-multilingual.
Actual Behavior
Even after configuring a custom embedding model (including changing the name field), RAG functionality still tries to call the non-existent granite-embedding-107m-multilingual model, causing failure.
Logs
Additional context
Thanks, great product!