docs: add examples of vLLM and Xinference models (#522)
Closes #205, #392, and #448.

- The LLMs and embedding models served by vLLM and Xinference have been tested and work properly through the `openai_like` option. (#205, #448)
- `bge-m3` served by Ollama also works properly. (#392)
jrj5423 authored Dec 20, 2024
1 parent 262037d commit f70db97
Showing 2 changed files with 35 additions and 2 deletions.
23 changes: 21 additions & 2 deletions frontend/app/src/pages/docs/embedding-model.mdx
@@ -31,10 +31,13 @@ OpenAI provides a variety of Embedding Models, we recommend using the OpenAI `te


For more information, see the [OpenAI Embedding Models documentation](https://platform.openai.com/docs/guides/embeddings#embedding-models).

### OpenAI-Like

Autoflow also supports embedding model providers (such as [ZhipuAI](#zhipuai)) that conform to the OpenAI API specification.

Models deployed on local AI platforms that conform to the OpenAI API specification (such as [vLLM](#vllm) and [Xinference](https://inference.readthedocs.io/en/latest/index.html)) can also be used in Autoflow.

To use OpenAI-Like embedding model providers, you need to provide the **base URL** of the embedding API in the following JSON format in **Advanced Settings**:

```json
@@ -53,7 +56,7 @@ You need to set up the base URL in the **Advanced Settings** as follows:

```json
{
"api_base": "https://open.bigmodel.cn/api/paas/v4"
"api_base": "https://open.bigmodel.cn/api/paas/v4/"
}
```

@@ -65,6 +68,22 @@ You need to set up the base URL in the **Advanced Settings** as follows:

For more information, see the [ZhipuAI embedding models documentation](https://open.bigmodel.cn/dev/api/vector/embedding-3).
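
As a quick sanity check of this setup, here is a minimal sketch using the `openai` Python client against the same base URL (the OpenAI-client compatibility, the placeholder key, and the `embedding-3` model name from the docs linked above are assumptions):

```python
from openai import OpenAI

# Assumption: ZhipuAI's /api/paas/v4/ endpoint accepts OpenAI-style requests.
client = OpenAI(
    api_key="your-zhipuai-api-key",  # replace with a real key
    base_url="https://open.bigmodel.cn/api/paas/v4/",
)
resp = client.embeddings.create(
    model="embedding-3",
    input="TiDB is a distributed SQL database.",
)
print(len(resp.data[0].embedding))  # vector dimension
```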

#### vLLM

When serving locally, the default embedding API endpoint for vLLM is:

`http://localhost:8000/v1/embeddings`

You need to set up the base URL in the **Advanced Settings** as follows:

```json
{
"api_base": "http://localhost:8000/v1/"
}
```

For more information, see the [vLLM documentation](https://docs.vllm.ai/en/stable/).
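
Before wiring the endpoint into Autoflow, you can sanity-check it with the `openai` Python client; a minimal sketch, assuming a default local vLLM server launched with an embedding model (the model name and the `EMPTY` placeholder key are assumptions):

```python
from openai import OpenAI

# A default local vLLM server needs no real API key; "EMPTY" is a placeholder.
client = OpenAI(base_url="http://localhost:8000/v1/", api_key="EMPTY")
resp = client.embeddings.create(
    model="BAAI/bge-m3",  # assumed; use whatever model the server is serving
    input="TiDB is a distributed SQL database.",
)
print(len(resp.data[0].embedding))  # e.g. 1024 for bge-m3
```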

### JinaAI

JinaAI provides multimodal multilingual long-context Embedding Models for RAG applications.
@@ -99,6 +118,7 @@ Ollama is a lightweight framework for building and running large language models
| Embedding Model | Vector Dimensions | Max Tokens |
| ------------------ | ----------------- | ---------- |
| `nomic-embed-text` | 768 | 8192 |
| `bge-m3` | 1024 | 8192 |
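
To confirm that `bge-m3` responds before configuring Autoflow, you can hit Ollama's native embeddings endpoint directly; a minimal sketch, assuming Ollama runs on its default port and the model has been pulled (e.g. via `ollama pull bge-m3`):

```python
import requests

# Ollama's native (non-OpenAI) embeddings route.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "bge-m3", "prompt": "TiDB is a distributed SQL database."},
)
resp.raise_for_status()
print(len(resp.json()["embedding"]))  # 1024 dimensions for bge-m3
```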

To use Ollama, you'll need to configure the API base URL in the **Advanced Settings**:

@@ -127,4 +147,3 @@ To configure the Local Embedding Service, set the API URL in the **Advanced Sett
"api_url": "http://local-embedding-reranker:5001/api/v1/embedding"
}
```

14 changes: 14 additions & 0 deletions frontend/app/src/pages/docs/llm.mdx
@@ -53,3 +53,17 @@ Currently Autoflow supports the following LLM providers:
"api_base": "http://localhost:11434"
}
```
- [vLLM](https://docs.vllm.ai/en/stable/) (a smoke-test sketch for vLLM and Xinference follows this list)
- Default config:
```json
{
"api_base": "http://localhost:8000/v1/"
}
```
- [Xinference](https://inference.readthedocs.io/en/latest/index.html)
- Default config:
```json
{
"api_base": "http://localhost:9997/v1/"
}
```
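
Since both servers expose OpenAI-compatible chat endpoints, the same smoke test works for either; a minimal sketch, assuming the `openai` Python client, a placeholder key, and whichever model the server actually has loaded:

```python
from openai import OpenAI

# Point base_url at the server under test:
#   vLLM:       http://localhost:8000/v1/
#   Xinference: http://localhost:9997/v1/
client = OpenAI(base_url="http://localhost:8000/v1/", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # assumed; use the model your server serves
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```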
