This repository provides detailed instructions and steps to successfully run OpenWebUI and Ollama on Intel platforms.
- CPU: Intel® Core™ Ultra 7 processors
- GPU: Intel® Arc™ graphics
- RAM: 16GB
- DISK: 128GB
Install the latest Ubuntu* 22.04 LTS Desktop. Refer to the Ubuntu Desktop installation tutorial if needed.
Docker and Docker Compose should be set up before running the commands below. Refer to the Docker documentation to set up Docker.
- Refer to the GPU driver installation guide to set up the GPU drivers.
Please ensure that the following ports are available before running the applications.

| Apps | Port |
|---|---|
| Open WebUI | 80 |
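Before starting the stack, you can check that nothing is already listening on port 80. A minimal sketch using `ss`, which is available by default on Ubuntu 22.04:

```shell
#!/bin/sh
# Print any process already bound to TCP port 80; if the grep finds
# nothing, the port is free for Open WebUI.
ss -tlnp | grep ':80 ' || echo "port 80 is free"
```

If the port is busy, either stop the conflicting service or change the published port in docker-compose.yml.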
You can offload model inference to a specific device by modifying the environment variable settings in the docker-compose.yml file.

| Workload | Environment Variable | Supported Device |
|---|---|---|
| LLM | - | GPU |
| STT | STT_DEVICE | CPU, GPU, NPU |
| TTS | TTS_DEVICE | CPU |
Example Configuration:

- To offload the STT encoder workload to NPU, use the following configuration:

```yaml
stt_service:
  ...
  environment:
    ...
    - STT_DEVICE=NPU
    ...
```
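Once the stack is running, you can confirm the variable actually reached the container. A sketch, assuming the compose service is named `stt_service` as in the fragment above:

```shell
# Print the device setting as seen inside the STT container.
docker compose exec stt_service printenv STT_DEVICE   # expect: NPU
```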
```bash
docker compose build
export RENDER_GROUP_ID=$(getent group render | cut -d: -f3)
docker compose up -d
```
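The `RENDER_GROUP_ID` export resolves the numeric GID of the host's `render` group, which governs access to the GPU device nodes (`/dev/dri/renderD*`); how the value is consumed depends on docker-compose.yml. The parsing itself can be checked against any `getent`-style record:

```shell
#!/bin/sh
# `getent group render` prints e.g. "render:x:110:alice"; the GID is
# field 3 of the colon-separated record, extracted with cut.
sample='render:x:110:alice'
gid=$(printf '%s\n' "$sample" | cut -d: -f3)
echo "$gid"   # prints 110 for the sample line above
```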
- Navigate to: http://localhost:80
- Open the Admin Panel from the top left corner.
- Click on **Settings**.
- Replace the OpenAI API link:
  - Click on **Connections**.
  - Replace the OpenAI API link with `http://ollama:11434/v1` and provide any API key.
  - Click on **Verify Connection** to ensure the server connection is verified.
  - Click the **Save** button to save the changes.
- Replace the TTS and STT API links:
  - Click on **Audio**.
  - For **Speech-to-Text Engine**, change from `whisper (local)` to `OpenAI`.
  - Replace the OpenAI API link with `http://stt_service:5996/v1` and provide any API key.
  - For **Text-to-Speech Engine**, change from `Web API` to `OpenAI`.
  - Replace the OpenAI API link with `http://tts_service:5995/v1` and provide any API key.
  - Leave the STT Model, TTS Voice, and TTS Model fields empty (the default TTS voice will be EN-US).
  - Click **Save** to save the changes.
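The STT and TTS URLs above resolve only on the Docker network, so a reachability check has to run from inside a container. A hypothetical smoke test; the service names come from docker-compose.yml, the `/v1/audio/speech` path follows the OpenAI audio API that the services emulate, and it assumes `curl` is available in the `stt_service` image:

```shell
# Ask the TTS service to synthesize a short phrase; a non-empty file
# written to /tmp/speech.wav suggests the endpoint is alive.
payload='{"model":"tts-1","input":"hello","voice":""}'
docker compose exec stt_service curl -s http://tts_service:5995/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer any-key" \
  -d "$payload" \
  -o /tmp/speech.wav
```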
- Click on **New Chat**.
- You may download the model from Ollama.com by entering the model name and selecting **Pull <model_name> from Ollama.com**.
- Click on **Arena Model** and select the target model (e.g., `qwen2.5:latest`).
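As an alternative to pulling through the UI, the model can be pulled with the Ollama CLI inside its container. A sketch, assuming the compose service is named `ollama`:

```shell
# Pull the model, then list what is installed.
docker compose exec ollama ollama pull qwen2.5:latest
docker compose exec ollama ollama list
```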
- LLM Model: Start interacting with the chat using the selected model.
- TTS Pipeline: Click on the **Read Aloud** icon to trigger the TTS API.
- STT Pipeline: Click on the **Record Voice** icon to start recording, and click again to stop. The generated text will appear in the input field.
- Linux: Export the environment variable `OLLAMA_NUM_GPU` before starting the services to offload to the `CPU` device:

```bash
# Default: GPU
export OLLAMA_NUM_GPU=999

# Runs on CPU
export OLLAMA_NUM_GPU=0
```
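Note that `docker compose` forwards a host variable into a container only if the compose file references it. If the offload setting does not take effect, check that docker-compose.yml passes the variable through, along the lines of this sketch (the service name `ollama` and the default of 999 are assumptions):

```yaml
ollama:
  environment:
    # Forward the host's setting; fall back to GPU (999) when unset.
    - OLLAMA_NUM_GPU=${OLLAMA_NUM_GPU:-999}
```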
Automatic speech recognition functionality is not supported in Firefox. Please use Chrome for validated performance.