An API for chatting with multiple LLMs.
LLMEngine currently supports:
- Ollama
- GroqAPI
Using Ollama, you can run multiple LLMs locally on your device. Some of the models supported through Ollama are:
- Llama3
- phi3
- gemma2b
Currently, mixtral8x7b is supported through GroqAPI.
To run this engine, use this command:

```bash
python main.py
```
You can chat with the supported LLMs through this API endpoint:

http://localhost:8000/api/v1/prompt
In the request body, pass a `prompt` and a `model_type`. For example:

```json
{
  "prompt": "Fastest planet in the world",
  "model_type": "mixtral-8x7b-32768"
}
```
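As a minimal sketch, here is one way to call this endpoint from Python. It assumes the endpoint accepts a JSON POST body and that the third-party `requests` package is installed; neither is confirmed by this README.

```python
# Sketch only: assumes the server is running locally and the
# /api/v1/prompt endpoint accepts a JSON POST body.
import requests

response = requests.post(
    "http://localhost:8000/api/v1/prompt",
    json={
        "prompt": "Fastest planet in the world",
        "model_type": "mixtral-8x7b-32768",
    },
)
print(response.json())
```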
Retrieval Augmented Generation (RAG) is also supported in LLMEngine.
Hit this endpoint:

http://localhost:8000/api/v1/rag

In the request body, pass `model_type`, `query`, and `file`.
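Since a file is uploaded alongside the other fields, a plausible way to call this endpoint is a multipart/form-data POST. The sketch below assumes that, reuses the `model_type` value from the earlier example, and uses a hypothetical `document.pdf` as the uploaded file; all three are assumptions, not confirmed by this README.

```python
# Sketch only: assumes /api/v1/rag accepts a multipart/form-data POST
# with model_type and query as form fields and the document under "file".
import requests

with open("document.pdf", "rb") as f:  # hypothetical input file
    response = requests.post(
        "http://localhost:8000/api/v1/rag",
        data={
            "model_type": "mixtral-8x7b-32768",
            "query": "Summarize the key points of this document.",
        },
        files={"file": f},
    )
print(response.json())
```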
- Set up support for more models.
- Shift llama3 support to GroqAPI, as it requires a lot of compute.
- Add SelfCorrectiveRAG support in the API.
- Add support for different file types.
- Add WebsiteLoader support.