diff --git a/demos/audio/README.md b/demos/audio/README.md
index 963ced134a..0329c14007 100644
--- a/demos/audio/README.md
+++ b/demos/audio/README.md
@@ -11,7 +11,7 @@ Check supported [Speech Recognition Models](https://openvinotoolkit.github.io/op
 
 **Model preparation**: Python 3.10 or higher with pip
 
-**Model Server deployment**: Installed Docker Engine or OVMS binary package according to the [baremetal deployment guide](../../../docs/deploying_server_baremetal.md)
+**Model Server deployment**: Installed Docker Engine or OVMS binary package according to the [baremetal deployment guide](../../docs/deploying_server_baremetal.md)
 
 **Client**: curl or Python for using OpenAI client package
 
diff --git a/demos/integration_with_OpenWebUI/RAG-enabled_model_demo.png b/demos/integration_with_OpenWebUI/RAG-enabled_model_demo.png
index b4f4a9aa32..8b85bfa8af 100644
Binary files a/demos/integration_with_OpenWebUI/RAG-enabled_model_demo.png and b/demos/integration_with_OpenWebUI/RAG-enabled_model_demo.png differ
diff --git a/demos/integration_with_OpenWebUI/README.md b/demos/integration_with_OpenWebUI/README.md
index 2ca04f5057..5bbad16a95 100644
--- a/demos/integration_with_OpenWebUI/README.md
+++ b/demos/integration_with_OpenWebUI/README.md
@@ -16,7 +16,7 @@ In this demo, OpenVINO Model Server is deployed on Linux with CPU using Docker a
 
 * [Docker Engine](https://docs.docker.com/engine/) installed
 * Host with x86_64 architecture
-* Linux, macOS, or Windows via [WSL](https://learn.microsoft.com/en-us/windows/wsl/)
+* Linux, macOS, or Windows
 * Python 3.11 with pip
 * HuggingFace account to download models
 
@@ -24,39 +24,35 @@ There are other options to fulfill the prerequisites like [OpenVINO Model Server
 
 This demo was tested on CPU but most of the models could be also run on Intel accelerators like GPU and NPU.
 
-### Step 1: Preparation
-
-Download the export script, install its dependencies and create the directory for models:
-
-```bash
-curl https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/main/demos/common/export_models/export_model.py -o export_model.py
-pip install -r https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/main/demos/common/export_models/requirements.txt
+## Step 1: Pull the model and start the OVMS server
+
+::::{tab-set}
+:::{tab-item} Windows
+:sync: Windows
+```bat
 mkdir models
+ovms.exe --pull --source_model Godreign/llama-3.2-3b-instruct-openvino-int4-model --model_repository_path models --task text_generation
+ovms.exe --add_to_config --config_path models\config.json --model_path Godreign\llama-3.2-3b-instruct-openvino-int4-model --model_name Godreign/llama-3.2-3b-instruct-openvino-int4-model
+ovms.exe --rest_port 8000 --config_path models\config.json
 ```
-
-### Step 2: Export Model
-
-The text generation model used in this demo is [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct). If the model is not downloaded before, access must be requested. Run the export script to download and quantize the model:
-
-```bash
-python export_model.py text_generation --source_model meta-llama/Llama-3.2-1B-Instruct --weight-format int8 --kv_cache_precision u8 --config_file_path models/config.json
-```
-
-### Step 3: Server Deployment
-
-Deploy with docker:
-
+:::
+:::{tab-item} Linux (using Docker)
+:sync: Linux
 ```bash
-docker run -d -p 8000:8000 -v $(pwd)/models:/workspace:ro openvino/model_server --rest_port 8000 --config_path /workspace/config.json
+mkdir models
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model Godreign/llama-3.2-3b-instruct-openvino-int4-model --model_repository_path /models --task text_generation
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path Godreign/llama-3.2-3b-instruct-openvino-int4-model --model_name Godreign/llama-3.2-3b-instruct-openvino-int4-model
+docker run -d -u $(id -u):$(id -g) -v $PWD/models:/models -p 8000:8000 openvino/model_server:weekly --rest_port 8000 --config_path /models/config.json
 ```
+:::
+::::
 
 Here is the basic call to check if it works:
 
-```bash
-curl http://localhost:8000/v3/chat/completions -H "Content-Type: application/json" -d "{\"model\":\"meta-llama/Llama-3.2-1B-Instruct\",\"messages\":[{\"role\":\"system\",\"content\":\"You are a helpful assistant.\"},{\"role\":\"user\",\"content\":\"Say this is a test\"}]}"
+```console
+curl http://localhost:8000/v3/chat/completions -H "Content-Type: application/json" -d "{\"model\":\"Godreign/llama-3.2-3b-instruct-openvino-int4-model\",\"messages\":[{\"role\":\"system\",\"content\":\"You are a helpful assistant.\"},{\"role\":\"user\",\"content\":\"Say this is a test\"}]}"
 ```
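 
+You can also list the models the server currently exposes — a minimal check assuming the OpenAI-compatible `/v3/models` endpoint, which is also what Open WebUI queries when the Model IDs field is left empty in Step 2:
+
+```console
+curl http://localhost:8000/v3/models
+```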
 
-### Step 4: Start Open WebUI
+## Step 2: Install and start Open WebUI
 
 Install Open WebUI:
 
@@ -66,7 +62,7 @@ pip install --no-cache-dir open-webui
 
 Running Open WebUI:
 
-```bash
+```console
 open-webui serve
 ```
 
@@ -88,7 +84,7 @@ Go to [http://localhost:8080](http://localhost:8080) and create admin account to
 
 1. Go to **Admin Panel** → **Settings** → **Connections** ([http://localhost:8080/admin/settings/connections](http://localhost:8080/admin/settings/connections))
 2. Click **+Add Connection** under **OpenAI API**
    * URL: `http://localhost:8000/v3`
-   * Model IDs: put `meta-llama/Llama-3.2-1B-Instruct` and click **+** to add the model, or leave empty to include all models
+   * Model IDs: put `Godreign/llama-3.2-3b-instruct-openvino-int4-model` and click **+** to add the model, or leave empty to include all models
 3. Click **Save**
 
 ![connection setting](./connection_setting.png)
 
@@ -98,6 +94,18 @@ Click **New Chat** and select the model to start chatting
 
 ![chat demo](./chat_demo.png)
 
+### (Optional) Step 3: Set request parameters
+
+OVMS supports many configurable request parameters; those available on the `/v3/chat/completions` endpoint are listed in the [chat API documentation](https://github.com/openvinotoolkit/model_server/blob/main/docs/model_server_rest_api_chat.md#request).
+
+To set one in *Open WebUI*, using disabling reasoning as an example:
+1. Go to **Admin Panel** → **Settings** → **Models** ([http://localhost:8080/admin/settings/models](http://localhost:8080/admin/settings/models))
+2. Click the desired model and unfold **Advanced Params**.
+3. Click **+ Add Custom Parameter**.
+4. Change the parameter name to `chat_template_kwargs` and its content to `{"enable_thinking": false}`. The same setting can also be sent directly in the request body, as shown below.
+
+![parameter set](./set_chat_template_parameter.png)
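+
+For reference, this is the equivalent direct request — a minimal sketch using the `chat_template_kwargs` field from the chat API documentation linked above:
+
+```console
+curl http://localhost:8000/v3/chat/completions -H "Content-Type: application/json" -d "{\"model\":\"Godreign/llama-3.2-3b-instruct-openvino-int4-model\",\"chat_template_kwargs\":{\"enable_thinking\":false},\"messages\":[{\"role\":\"user\",\"content\":\"Say this is a test\"}]}"
+```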
 
 ### Reference
 [https://docs.openwebui.com/getting-started/quick-start/starting-with-openai-compatible](https://docs.openwebui.com/getting-started/quick-start/starting-with-openai-compatible/#step-2-connect-your-server-to-open-webui)
 
@@ -107,16 +115,33 @@ Click **New Chat** and select the model to start chatting
 
 ### Step 1: Model Preparation
 
-In addition to text generation, endpoints for embedding and reranking in Retrieval Augmented Generation can also be deployed with OpenVINO Model Server. In this demo, the embedding model is [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) and the the reranking model is [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base). Run the export script to download and quantize the models:
+In addition to text generation, endpoints for embedding and reranking in Retrieval Augmented Generation can also be deployed with OpenVINO Model Server. In this demo, the embedding model is [OpenVINO/Qwen3-Embedding-0.6B-fp16-ov](https://huggingface.co/OpenVINO/Qwen3-Embedding-0.6B-fp16-ov) and the reranking model is [OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov](https://huggingface.co/OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov). Pull the models and add them to the server configuration:
+
+::::{tab-set}
+:::{tab-item} Windows
+:sync: Windows
+```bat
+ovms.exe --pull --source_model OpenVINO/Qwen3-Embedding-0.6B-fp16-ov --model_repository_path models --task embeddings
+ovms.exe --add_to_config --config_path models\config.json --model_path OpenVINO\Qwen3-Embedding-0.6B-fp16-ov --model_name OpenVINO/Qwen3-Embedding-0.6B-fp16-ov
+ovms.exe --pull --source_model OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov --model_repository_path models --task rerank
+ovms.exe --add_to_config --config_path models\config.json --model_path OpenVINO\Qwen3-Reranker-0.6B-seq-cls-fp16-ov --model_name OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov
+```
+:::
+:::{tab-item} Linux (using Docker)
+:sync: Linux
 
 ```bash
-python export_model.py embeddings_ov --source_model sentence-transformers/all-MiniLM-L6-v2 --weight-format int8 --config_file_path models/config.json
-python export_model.py rerank_ov --source_model BAAI/bge-reranker-base --weight-format int8 --config_file_path models/config.json
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/Qwen3-Embedding-0.6B-fp16-ov --model_repository_path /models --task embeddings
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path OpenVINO/Qwen3-Embedding-0.6B-fp16-ov --model_name OpenVINO/Qwen3-Embedding-0.6B-fp16-ov
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov --model_repository_path /models --task rerank
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov --model_name OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov
 ```
+:::
+::::
 
+Keep the model server running or restart it; OVMS monitors `config.json` and reloads it automatically, so the newly added models are picked up at runtime. If you do need to restart, use the serve command shown below.
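+
+If the server was stopped, start it again with the same serve command as in Step 1 of the text generation section — the updated `config.json` now also lists the embedding and reranking models:
+
+::::{tab-set}
+:::{tab-item} Windows
+:sync: Windows
+```bat
+ovms.exe --rest_port 8000 --config_path models\config.json
+```
+:::
+:::{tab-item} Linux (using Docker)
+:sync: Linux
+```bash
+docker run -d -u $(id -u):$(id -g) -v $PWD/models:/models -p 8000:8000 openvino/model_server:weekly --rest_port 8000 --config_path /models/config.json
+```
+:::
+::::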
Here are the basic calls to check if they work: -```bash -curl http://localhost:8000/v3/embeddings -H "Content-Type: application/json" -d "{\"model\":\"sentence-transformers/all-MiniLM-L6-v2\",\"input\":\"hello world\"}" -curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" -d "{\"model\":\"BAAI/bge-reranker-base\",\"query\":\"welcome\",\"documents\":[\"good morning\",\"farewell\"]}" +```console +curl http://localhost:8000/v3/embeddings -H "Content-Type: application/json" -d "{\"model\":\"OpenVINO/Qwen3-Embedding-0.6B-fp16-ov\",\"input\":\"hello world\"}" +curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" -d "{\"model\":\"OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov\",\"query\":\"welcome\",\"documents\":[\"good morning\",\"farewell\"]}" ``` ### Step 2: Documents Setting @@ -124,12 +149,14 @@ curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" -d "{\" 1. Go to **Admin Panel** → **Settings** → **Documents** ([http://localhost:8080/admin/settings/documents](http://localhost:8080/admin/settings/documents)) 2. Select **OpenAI** for **Embedding Model Engine** * URL: `http://localhost:8000/v3` - * Embedding Model: `sentence-transformers/all-MiniLM-L6-v2` + * Set Engine type to `OpenAI` + * Embedding Model: `OpenVINO/Qwen3-Embedding-0.6B-fp16-ov` * Put anything in API key 3. Enable **Hybrid Search** 4. Select **External** for **Reranking Engine** * URL: `http://localhost:8000/v3/rerank` - * Reranking Model: `BAAI/bge-reranker-base` + * Set Engine type to `External` + * Reranking Model: `OpenVINO/Qwen3-Reranker-0.6B-seq-cls-fp16-ov` 5. Click **Save** ![embedding and retrieval setting](./embedding_and_retrieval_setting.png) @@ -140,7 +167,7 @@ curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" -d "{\" The documentation used in this demo is [https://github.com/open-webui/docs/archive/refs/heads/main.zip](https://github.com/open-webui/docs/archive/refs/heads/main.zip). Download and extract it to get the folder. -2. Go to **Workspace** → **Knowledge** → **+Create a Knowledge Base** ([http://localhost:8080/workspace/knowledge/create](http://localhost:8080/workspace/knowledge/create)) +2. Go to **Workspace** → **Knowledge** → **+ New Knowledge** ([http://localhost:8080/workspace/knowledge/create](http://localhost:8080/workspace/knowledge/create)) 3. Name and describe the knowledge base 4. Click **Create Knowledge** 5. Click **+Add Content** → **Upload directory**, then select the extracted folder. This will upload all files with suitable extensions. @@ -160,7 +187,7 @@ curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" -d "{\" ### Step 5: RAG-enabled Model -1. Go to **Workspace** → **Models** → **+Add New Model** ([http://localhost:8080/workspace/models/create](http://localhost:8080/workspace/models/create)) +1. Go to **Workspace** → **Models** → **+ New Model** ([http://localhost:8080/workspace/models/create](http://localhost:8080/workspace/models/create)) 2. Configure the Model: * Name the model * Select a base model from the list @@ -185,16 +212,29 @@ curl http://localhost:8000/v3/rerank -H "Content-Type: application/json" -d "{\" ### Step 1: Model Preparation -The image generation model used in this demo is [dreamlike-art/dreamlike-anime-1.0](https://huggingface.co/dreamlike-art/dreamlike-anime-1.0). Run the export script to download and quantize the model: +The image generation model used in this demo is [OpenVINO/FLUX.1-schnell-int4-ov](https://huggingface.co/OpenVINO/FLUX.1-schnell-int4-ov). 
Run ovms with the `--pull` parameter to download the model and add it to the configuration:
+
+::::{tab-set}
+:::{tab-item} Windows
+:sync: Windows
+```bat
+ovms.exe --pull --source_model OpenVINO/FLUX.1-schnell-int4-ov --model_repository_path models --model_name OpenVINO/FLUX.1-schnell-int4-ov --task image_generation --default_num_inference_steps 3
+ovms.exe --add_to_config --config_path models\config.json --model_path OpenVINO\FLUX.1-schnell-int4-ov --model_name OpenVINO/FLUX.1-schnell-int4-ov
+```
+:::
+:::{tab-item} Linux (using Docker)
+:sync: Linux
 
 ```bash
-python export_model.py image_generation --source_model dreamlike-art/dreamlike-anime-1.0 --weight-format int8 --config_file_path models/config.json
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/FLUX.1-schnell-int4-ov --model_repository_path /models --model_name OpenVINO/FLUX.1-schnell-int4-ov --task image_generation --default_num_inference_steps 3
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path OpenVINO/FLUX.1-schnell-int4-ov --model_name OpenVINO/FLUX.1-schnell-int4-ov
 ```
+:::
+::::
 
 Keep the model server running or restart it. Here is the basic call to check if it works:
 
-```bash
-curl http://localhost:8000/v3/images/generations -H "Content-Type: application/json" -d "{\"model\":\"dreamlike-art/dreamlike-anime-1.0\",\"prompt\":\"anime\",\"num_inference_steps\":1,\"size\":\"256x256\",\"response_format\":\"b64_json\"}"
+```console
+curl http://localhost:8000/v3/images/generations -H "Content-Type: application/json" -d "{\"model\":\"OpenVINO/FLUX.1-schnell-int4-ov\",\"prompt\":\"anime\",\"num_inference_steps\":1,\"size\":\"256x256\",\"response_format\":\"b64_json\"}"
 ```
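 
+The image comes back base64-encoded in the `b64_json` field. To view it, you can decode the response to a file — a minimal sketch assuming Python is available and the OpenAI-style `data[0].b64_json` response layout:
+
+```console
+curl http://localhost:8000/v3/images/generations -H "Content-Type: application/json" -d "{\"model\":\"OpenVINO/FLUX.1-schnell-int4-ov\",\"prompt\":\"anime\",\"num_inference_steps\":1,\"size\":\"256x256\",\"response_format\":\"b64_json\"}" -o response.json
+python -c "import base64, json; open('image.png', 'wb').write(base64.b64decode(json.load(open('response.json'))['data'][0]['b64_json']))"
+```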
 
 ### Step 2: Image Generation Setting
 
@@ -204,7 +244,7 @@ curl http://localhost:8000/v3/images/generations -H "Content-Type: application/j
    * URL: `http://localhost:8000/v3`
    * Put anything in API key
 3. Enable **Image Generation (Experimental)**
-   * Set Default Model: `dreamlike-art/dreamlike-anime-1.0`
+   * Set Default Model: `OpenVINO/FLUX.1-schnell-int4-ov`
    * Set Image Size. Must be in WxH format, example: `256x256`
 4. Click **Save**
 
@@ -213,8 +253,9 @@ curl http://localhost:8000/v3/images/generations -H "Content-Type: application/j
 ### Step 3: Generate Image
 
 Method 1:
-1. Toggle the **Image** switch to on
-2. Enter a query and send
+1. Expand the **Integrations** menu
+2. Toggle the **Image** switch to on
+3. Enter a query and send
 
 ![image generation method 1 demo](./image_generation_method_1_demo.png)
 
 Method 2:
@@ -228,28 +269,41 @@ Method 2:
 
 ### Reference
 [https://docs.openvino.ai/2025/model-server/ovms_demos_image_generation.html](https://docs.openvino.ai/2025/model-server/ovms_demos_image_generation.html#export-model-for-cpu)
-[https://docs.openwebui.com/tutorials/images](https://docs.openwebui.com/tutorials/images/#using-image-generation)
+[https://docs.openwebui.com/features/image-generation-and-editing](https://docs.openwebui.com/features/image-generation-and-editing/openai)
 
 ---
 
 ## VLM
 
 ### Step 1: Model Preparation
 
-The vision language model used in this demo is [OpenGVLab/InternVL2-2B](https://huggingface.co/OpenGVLab/InternVL2-2B). Run the export script to download and quantize the model:
+The vision language model used in this demo is [OpenVINO/InternVL2-2B-int4-ov](https://huggingface.co/OpenVINO/InternVL2-2B-int4-ov). Run ovms with the `--pull` parameter to download the model and add it to the configuration:
+
+::::{tab-set}
+:::{tab-item} Windows
+:sync: Windows
+```bat
+ovms.exe --pull --source_model OpenVINO/InternVL2-2B-int4-ov --model_repository_path models --model_name OpenVINO/InternVL2-2B-int4-ov --task text_generation
+ovms.exe --add_to_config --config_path models\config.json --model_path OpenVINO\InternVL2-2B-int4-ov --model_name OpenVINO/InternVL2-2B-int4-ov
+```
+:::
+:::{tab-item} Linux (using Docker)
+:sync: Linux
 
 ```bash
-python export_model.py text_generation --source_model OpenGVLab/InternVL2-2B --weight-format int4 --pipeline_type VLM --model_name OpenGVLab/InternVL2-2B --config_file_path models/config.json
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/InternVL2-2B-int4-ov --model_repository_path /models --model_name OpenVINO/InternVL2-2B-int4-ov --task text_generation
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path OpenVINO/InternVL2-2B-int4-ov --model_name OpenVINO/InternVL2-2B-int4-ov
 ```
+:::
+::::
 
 Keep the model server running or restart it. Here is the basic call to check if it works:
 
-```bash
-curl http://localhost:8000/v3/chat/completions -H "Content-Type: application/json" -d "{ \"model\": \"OpenGVLab/InternVL2-2B\", \"messages\":[{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"what is in the picture?\"},{\"type\": \"image_url\", \"image_url\": {\"url\": \"http://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/3/demos/common/static/images/zebra.jpeg\"}}]}], \"max_completion_tokens\": 100}"
+```console
+curl http://localhost:8000/v3/chat/completions -H "Content-Type: application/json" -d "{ \"model\": \"OpenVINO/InternVL2-2B-int4-ov\", \"messages\":[{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"what is in the picture?\"},{\"type\": \"image_url\", \"image_url\": {\"url\": \"http://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/3/demos/common/static/images/zebra.jpeg\"}}]}], \"max_completion_tokens\": 100}"
 ```
 
 ### Step 2: Chat with VLM
 
-1. Start a **New Chat** with model set to `OpenGVLab/InternVL2-2B`
+1. Start a **New Chat** with model set to `OpenVINO/InternVL2-2B-int4-ov`
 2. Click **+More** to upload images, by capturing the screen or uploading files. The image used in this demo is [http://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/3/demos/common/static/images/zebra.jpeg](http://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/3/demos/common/static/images/zebra.jpeg).
 
 ![upload images](./upload_images.png)
 
@@ -266,20 +320,19 @@
 
 ### Step 1: Start Tool Server
 
-Start a OpenAPI tool server available in the [openapi-servers repo](https://github.com/open-webui/openapi-servers). The server used in this demo is [https://github.com/open-webui/openapi-servers/tree/main/servers/time](https://github.com/open-webui/openapi-servers/tree/main/servers/time). Run it locally at `http://localhost:18000`:
+Start an OpenAPI tool server available in the [openapi-servers repo](https://github.com/open-webui/openapi-servers). The server used in this demo is [https://github.com/open-webui/openapi-servers/tree/main/servers/weather](https://github.com/open-webui/openapi-servers/tree/main/servers/weather); the MCP weather server is exposed as an OpenAPI tool server through `mcpo`. Run it locally at `http://localhost:9000`:
 
-```bash
-git clone https://github.com/open-webui/openapi-servers
-cd openapi-servers/servers/time
-pip install -r requirements.txt
-uvicorn main:app --host 0.0.0.0 --port 18000 --reload
+```console
+pip install mcpo
+pip install mcp_weather_server
+mcpo --port 9000 -- python -m mcp_weather_server
 ```
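 
+Before connecting it to Open WebUI, you can verify the tool server is up — a minimal check assuming mcpo's default FastAPI routes, which serve the generated schema at `/openapi.json`:
+
+```console
+curl http://localhost:9000/openapi.json
+```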
 
 ### Step 2: Tools Setting
 
-1. Go to **Admin Panel** → **Settings** → **Tools** ([http://localhost:8080/admin/settings/tools](http://localhost:8080/admin/settings/tools))
+1. Go to **Admin Panel** → **Settings** → **External Tools**
 2. Click **+Add Connection**
-   * URL: `http://localhost:18000`
+   * URL: `http://localhost:9000`
    * Name the tool
 3. Click **Save**
 
@@ -296,4 +349,60 @@ uvicorn main:app --host 0.0.0.0 --port 18000 --reload
 
 ![chat with AI Agent demo](./chat_with_AI_Agent_demo.png)
 
 ### Reference
-[https://docs.openwebui.com/openapi-servers/open-webui](https://docs.openwebui.com/openapi-servers/open-webui/#step-2-connect-tool-server-in-open-webui)
+[https://docs.openwebui.com/features/plugin/tools/openapi-servers/open-webui](https://docs.openwebui.com/features/plugin/tools/openapi-servers/open-webui#step-2-connect-tool-server-in-open-webui)
+
+
+## Audio
+
+> **Note:** To ensure audio features work correctly, download [FFmpeg](https://ffmpeg.org/download.html) and add its executable directory to your system's `PATH` environment variable.
+
+### Step 1: Models Preparation
+
+Start by downloading the `export_model.py` script and running it to download and quantize the speech generation model:
+
+```console
+curl https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/main/demos/common/export_models/export_model.py -o export_model.py
+pip3 install -r https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/main/demos/common/export_models/requirements.txt
+python export_model.py text2speech --source_model microsoft/speecht5_tts --weight-format fp32 --model_name microsoft/speecht5_tts --config_file_path models/config.json --model_repository_path models --vocoder microsoft/speecht5_hifigan
+```
+
+Next, download the transcription model and add it to the config:
+
+::::{tab-set}
+:::{tab-item} Windows
+:sync: Windows
+```bat
+ovms.exe --pull --source_model OpenVINO/whisper-base-fp16-ov --model_repository_path models --task speech2text
+ovms.exe --add_to_config --config_path models\config.json --model_path OpenVINO\whisper-base-fp16-ov --model_name OpenVINO/whisper-base-fp16-ov
+```
+:::
+:::{tab-item} Linux (using Docker)
+:sync: Linux
+```bash
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/whisper-base-fp16-ov --model_repository_path /models --task speech2text
+docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path OpenVINO/whisper-base-fp16-ov --model_name OpenVINO/whisper-base-fp16-ov
+```
+:::
+::::
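+
+Here are basic calls to check if both audio models work — a sketch assuming the `/v3/audio/speech` and `/v3/audio/transcriptions` endpoints described in the [audio demo](../audio/README.md). The first call synthesizes a short WAV file and the second transcribes it back:
+
+```console
+curl http://localhost:8000/v3/audio/speech -H "Content-Type: application/json" -d "{\"model\":\"microsoft/speecht5_tts\",\"input\":\"This is a test\"}" -o speech.wav
+curl http://localhost:8000/v3/audio/transcriptions -F "model=OpenVINO/whisper-base-fp16-ov" -F "file=@speech.wav"
+```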
+
+### Step 2: Audio Settings
+
+1. Go to **Admin Panel** → **Settings** → **Audio**
+2. Select **OpenAI** for both engines
+   * URL: `http://localhost:8000/v3`
+   * Set Engine type to `OpenAI`
+   * STT Model: `OpenVINO/whisper-base-fp16-ov`
+   * TTS Model: `microsoft/speecht5_tts`
+   * Put anything in API key
+3. Click **Save**
+
+![audio settings](./audio_configuration.png)
+
+### Step 3: Chat Using Voice Mode
+
+1. Click the **Voice mode** icon.
+2. Start talking.
+
+![voice mode](./voice_mode.png)
+
+### Reference
+[https://docs.openwebui.com/features/#%EF%B8%8F-audio-voice--accessibility](https://docs.openwebui.com/features/#%EF%B8%8F-audio-voice--accessibility)
\ No newline at end of file
diff --git a/demos/integration_with_OpenWebUI/activate_the_tool.png b/demos/integration_with_OpenWebUI/activate_the_tool.png
index 34eb55f9b3..a9f0ddaf3a 100644
Binary files a/demos/integration_with_OpenWebUI/activate_the_tool.png and b/demos/integration_with_OpenWebUI/activate_the_tool.png differ
diff --git a/demos/integration_with_OpenWebUI/audio_configuration.png b/demos/integration_with_OpenWebUI/audio_configuration.png
new file mode 100644
index 0000000000..6a9b18b670
Binary files /dev/null and b/demos/integration_with_OpenWebUI/audio_configuration.png differ
diff --git a/demos/integration_with_OpenWebUI/chat_demo.png b/demos/integration_with_OpenWebUI/chat_demo.png
index 77daac229a..5a2278d839 100644
Binary files a/demos/integration_with_OpenWebUI/chat_demo.png and b/demos/integration_with_OpenWebUI/chat_demo.png differ
diff --git a/demos/integration_with_OpenWebUI/chat_with_AI_Agent_demo.png b/demos/integration_with_OpenWebUI/chat_with_AI_Agent_demo.png
index 4022916ac5..ca8215be7c 100644
Binary files a/demos/integration_with_OpenWebUI/chat_with_AI_Agent_demo.png and b/demos/integration_with_OpenWebUI/chat_with_AI_Agent_demo.png differ
diff --git a/demos/integration_with_OpenWebUI/chat_with_RAG_demo.png b/demos/integration_with_OpenWebUI/chat_with_RAG_demo.png
index 83e09fa8e7..302b866f6e 100644
Binary files a/demos/integration_with_OpenWebUI/chat_with_RAG_demo.png and b/demos/integration_with_OpenWebUI/chat_with_RAG_demo.png differ
diff --git a/demos/integration_with_OpenWebUI/chat_with_VLM_demo.png b/demos/integration_with_OpenWebUI/chat_with_VLM_demo.png
index 2d8cc0f9c9..267f716d6c 100644
Binary files a/demos/integration_with_OpenWebUI/chat_with_VLM_demo.png and b/demos/integration_with_OpenWebUI/chat_with_VLM_demo.png differ
diff --git a/demos/integration_with_OpenWebUI/connection_setting.png b/demos/integration_with_OpenWebUI/connection_setting.png
index f05114510a..11715abeb3 100644
Binary files a/demos/integration_with_OpenWebUI/connection_setting.png and b/demos/integration_with_OpenWebUI/connection_setting.png differ
diff --git a/demos/integration_with_OpenWebUI/create_and_configure_the_RAG-enabled_model.png b/demos/integration_with_OpenWebUI/create_and_configure_the_RAG-enabled_model.png
index 3514c9f6bc..f1efd6ce6e 100644
Binary files a/demos/integration_with_OpenWebUI/create_and_configure_the_RAG-enabled_model.png and b/demos/integration_with_OpenWebUI/create_and_configure_the_RAG-enabled_model.png differ
diff --git a/demos/integration_with_OpenWebUI/embedding_and_retrieval_setting.png b/demos/integration_with_OpenWebUI/embedding_and_retrieval_setting.png
index 126d46bbdf..936ec6fb41 100644
Binary files a/demos/integration_with_OpenWebUI/embedding_and_retrieval_setting.png and b/demos/integration_with_OpenWebUI/embedding_and_retrieval_setting.png differ
diff --git a/demos/integration_with_OpenWebUI/image_generation_method_1_demo.png b/demos/integration_with_OpenWebUI/image_generation_method_1_demo.png
index 4d085e6658..93271fc347 100644
Binary files a/demos/integration_with_OpenWebUI/image_generation_method_1_demo.png and b/demos/integration_with_OpenWebUI/image_generation_method_1_demo.png differ
diff --git a/demos/integration_with_OpenWebUI/image_generation_method_2_demo.png b/demos/integration_with_OpenWebUI/image_generation_method_2_demo.png
index b000b5c60e..fbefbd4ea6 100644
Binary files a/demos/integration_with_OpenWebUI/image_generation_method_2_demo.png and b/demos/integration_with_OpenWebUI/image_generation_method_2_demo.png differ
diff --git a/demos/integration_with_OpenWebUI/image_generation_setting.png b/demos/integration_with_OpenWebUI/image_generation_setting.png
index 1e8a6204bf..1fa734e3cd 100644
Binary files a/demos/integration_with_OpenWebUI/image_generation_setting.png and b/demos/integration_with_OpenWebUI/image_generation_setting.png differ
diff --git a/demos/integration_with_OpenWebUI/select_documents.png b/demos/integration_with_OpenWebUI/select_documents.png
index b597de310b..060ff1a0a1 100644
Binary files a/demos/integration_with_OpenWebUI/select_documents.png and b/demos/integration_with_OpenWebUI/select_documents.png differ
diff --git a/demos/integration_with_OpenWebUI/set_chat_template_parameter.png b/demos/integration_with_OpenWebUI/set_chat_template_parameter.png
new file mode 100644
index 0000000000..021947957b
Binary files /dev/null and b/demos/integration_with_OpenWebUI/set_chat_template_parameter.png differ
diff --git a/demos/integration_with_OpenWebUI/tools_setting.png b/demos/integration_with_OpenWebUI/tools_setting.png
index 17a28db47f..3ca1853951 100644
Binary files a/demos/integration_with_OpenWebUI/tools_setting.png and b/demos/integration_with_OpenWebUI/tools_setting.png differ
diff --git a/demos/integration_with_OpenWebUI/upload_images.png b/demos/integration_with_OpenWebUI/upload_images.png
index 0c6958feef..2b55c16add 100644
Binary files a/demos/integration_with_OpenWebUI/upload_images.png and b/demos/integration_with_OpenWebUI/upload_images.png differ
diff --git a/demos/integration_with_OpenWebUI/voice_mode.png b/demos/integration_with_OpenWebUI/voice_mode.png
new file mode 100644
index 0000000000..878e52ab8c
Binary files /dev/null and b/demos/integration_with_OpenWebUI/voice_mode.png differ