Add support for new models #1393

Closed
wants to merge 33 commits into from
33 commits
b8cbea3
Add support for meta-llama-3.1 8B & 8B Instruct
Smartappli Jul 27, 2024
61cfac2
Add support for Meta-Llama-3.1 70B & 70B Instruct
Smartappli Jul 27, 2024
95c22a3
Update README.md
Smartappli Jul 27, 2024
cb2c787
Update llama-cpp-python to version 0.2.84
Smartappli Jul 28, 2024
373efda
Add support for Mistral-Nemo-Instruct
Smartappli Jul 28, 2024
d7bc34a
Add support for Gemma-2 9B Instruct
Smartappli Jul 28, 2024
29761d0
Update README.md
Smartappli Jul 28, 2024
df8bf63
Add support for Codestral 22B
Smartappli Jul 28, 2024
f9597f5
Update README.md
Smartappli Jul 28, 2024
db33c3e
revert
Smartappli Jul 28, 2024
d306f1c
revert
Smartappli Jul 28, 2024
b2301a2
Update models.json
Smartappli Jul 28, 2024
7d2dc45
Add support for Gemma-2-27B-Instruct
Smartappli Jul 28, 2024
c678341
Update README.md
Smartappli Jul 28, 2024
0a1faf7
Add support for phi 3.1 mini 4k and phi 3.1 mini 128k
Smartappli Jul 28, 2024
4fbf919
Update README.md
Smartappli Jul 28, 2024
c508ac6
Add support for Mathstral 7B
Smartappli Jul 28, 2024
2a7d2ab
Update README.md
Smartappli Jul 28, 2024
0d13235
Add support for Falcon 11B
Smartappli Jul 28, 2024
2801b56
Update README.md
Smartappli Jul 28, 2024
6d969b9
Add support for Open Chat 3.6 - 8B
Smartappli Jul 28, 2024
f5499b3
Update README.md
Smartappli Jul 28, 2024
4f82b72
Add support for Med42-v2-8B
Smartappli Jul 28, 2024
d90e931
Update README.md
Smartappli Jul 28, 2024
5ebd862
Update medicine LLM 13B
Smartappli Jul 28, 2024
19b77af
Add support for Gemma-2 9B and Gemma-2 27B
Smartappli Jul 28, 2024
b9a623e
Update README.md
Smartappli Jul 28, 2024
573c4ff
bugfix
Smartappli Jul 28, 2024
0ebb75b
Update mistral 7B to v0.3
Smartappli Jul 28, 2024
61d77f2
Update README.md
Smartappli Jul 28, 2024
147946f
Merge branch 'main' into july24
Smartappli Jul 28, 2024
cf5fdfc
bugfix
Smartappli Jul 30, 2024
151c5d7
Update serge.env
gaby Aug 7, 2024
20 changes: 10 additions & 10 deletions README.md
@@ -52,9 +52,9 @@ The following Environment Variables are available:
| Variable Name | Description | Default Value |
|-----------------------|---------------------------------------------------------|--------------------------------------|
| `SERGE_DATABASE_URL` | Database connection string | `sqlite:////data/db/sql_app.db` |
| `SERGE_JWT_SECRET` | Key for auth token encryption. Use a random string | `uF7FGN5uzfGdFiPzR` |
| `SERGE_JWT_SECRET` | Key for auth token encryption. Use a random string | `uF7FGN5uzfGdFiPzR` |
| `SERGE_SESSION_EXPIRY`| Duration in minutes before a user must reauthenticate | `60` |
| `NODE_ENV` | Node.js running environment | `production` |
| `NODE_ENV` | Node.js running environment | `production` |
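
The `SERGE_JWT_SECRET` row above asks for a random string, and the default shown in the table is public. A minimal sketch of one way to produce a replacement, using Python's standard `secrets` module; the helper name `make_jwt_secret` and the 32-byte length are illustrative assumptions, not part of Serge:

```python
import secrets


def make_jwt_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS entropy.

    Hypothetical helper: Serge only requires that SERGE_JWT_SECRET be a
    random string, it does not prescribe how the string is generated.
    """
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted into an env file or a compose file.
    print(f"SERGE_JWT_SECRET={make_jwt_secret()}")
```

The printed value would then be passed to the container as the `SERGE_JWT_SECRET` environment variable in place of the default.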

## 🖥️ Windows

@@ -73,30 +73,30 @@ Instructions for setting up Serge on Kubernetes can be found in the [wiki](https
| **Code** | 13B, 33B |
| **CodeLLaMA** | 7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python |
| **Codestral** | 22B v0.1 |
| **Gemma** | 2B, 1.1-2B-Instruct, 7B, 1.1-7B-Instruct |
| **Gemma** | 2B, 1.1-2B-Instruct, 7B, 1.1-7B-Instruct, 2-9B, 2-9B-Instruct, 2-27B, 2-27B-Instruct |
| **Gorilla** | Falcon-7B-HF-v0, 7B-HF-v1, Openfunctions-v1, Openfunctions-v2 |
| **Falcon** | 7B, 7B-Instruct, 40B, 40B-Instruct |
| **Falcon** | 7B, 7B-Instruct, 11B, 40B, 40B-Instruct |
| **LLaMA 2** | 7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST |
| **LLaMA 3** | 11B-Instruct, 13B-Instruct, 16B-Instruct |
| **LLaMA Pro** | 8B, 8B-Instruct |
| **Med42** | 70B |
| **Mathstral** | 7B |
| **Med42** | 70B, v2-8B |
| **Medalpaca** | 13B |
| **Medicine** | Chat, LLM |
| **Meditron** | 7B, 7B-Chat, 70B |
| **Meta-LlaMA-3** | 8B, 8B-Instruct, 70B, 70B-Instruct |
| **Mistral** | 7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca |
| **Meta-LlaMA-3** | 3-8B, 3.1-8B, 3-8B-Instruct, 3.1-8B-Instruct, 3-70B, 3.1-70B, 3-70B-Instruct, 3.1-70B-Instruct |
| **Mistral** | 7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca, Nemo-Instruct |
| **MistralLite** | 7B |
| **Mixtral** | 8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1 |
| **Neural-Chat** | 7B-v3.3 |
| **Notus** | 7B-v1 |
| **Notux** | 8x7b-v1 |
| **Nous-Hermes 2** | Mistral-7B-DPO, Mixtral-8x7B-DPO, Mixtral-8x7B-SFT |
| **OpenChat** | 7B-v3.5-1210 |
| **OpenChat** | 7B-v3.5-1210, 8B-v3.6-20240522 |
| **OpenCodeInterpreter** | DS-6.7B, DS-33B, CL-7B, CL-13B, CL-70B |
| **OpenLLaMA** | 3B-v2, 7B-v2, 13B-v2 |
| **Orca 2** | 7B, 13B |
| **Phi 2** | 2.7B |
| **Phi 3** | mini-4k-instruct, medium-4k-instruct, medium-128k-instruct |
| **Phi** | 2-2.7B, 3-mini-4k-instruct, 3.1-mini-4k-instruct, 3.1-mini-128k-instruct, 3-medium-4k-instruct, 3-medium-128k-instruct |
| **Python Code** | 13B, 33B |
| **PsyMedRP** | 13B-v1, 20B-v1 |
| **Starling LM** | 7B-Alpha |
196 changes: 183 additions & 13 deletions api/src/serge/data/models.json
@@ -20,7 +20,7 @@
"models": [
{
"name": "BioMistral-7B",
"repo": "BioMistral/BioMistral-7B-GGUF",
"repo": "BioMistral/biomistral-7B-GGUF",
"files": [
{
"name": "q4_K_M",
@@ -30,7 +30,7 @@
]
}
]
},
},
{
"name": "Code",
"models": [
@@ -202,7 +202,18 @@
"disk_space": 4975385792.0
}
]
},
},
{
"name": "Falcon-11B",
"repo": "bartowski/falcon-11B-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "falcon-11B-Q4_K_M.gguf",
"disk_space": 6849675168.0
}
]
},
{
"name": "Falcon-40B",
"repo": "maddes8cht/tiiuae-falcon-40b-gguf",
@@ -273,6 +284,50 @@
"disk_space": 5329759200.0
}
]
},
{
"name": "Gemma-2-9B",
"repo": "mradermacher/gemma-2-9b-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "gemma-2-9b.Q4_K_M.gguf",
"disk_space": 5761058240.0
}
]
},
{
"name": "Gemma-2-9B-Instruct",
"repo": "bartowski/gemma-2-9b-it-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "gemma-2-9b-it-Q4_K_M.gguf",
"disk_space": 5761057728.0
}
]
},
{
"name": "Gemma-2-27B",
"repo": "mradermacher/gemma-2-27b-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "gemma-2-27b.Q4_K_M.gguf",
"disk_space": 16645382176.0
}
]
},
{
"name": "Gemma-2-27B-Instruct",
"repo": "bartowski/gemma-2-27b-it-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "gemma-2-27b-it-Q4_K_M.gguf",
"disk_space": 16645381632.0
}
]
}
]
},
@@ -482,7 +537,23 @@
]
}
]
},
},
{
"name": "Mathstral",
"models": [
{
"name": "Mathstral-7B",
"repo": "MaziyarPanahi/mathstral-7B-v0.1-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "mathstral-7B-v0.1.Q4_K_M.gguf",
"disk_space": 4372811584.0
}
]
}
]
},
{
"name": "Med42",
"models": [
@@ -496,7 +567,18 @@
"disk_space": 41422910368.0
}
]
}
},
{
"name": "Med42-v2-8B",
"repo": "mradermacher/Llama3-Med42-8B-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Llama3-Med42-8B.Q4_K_M.gguf",
"disk_space": 4920734464.0
}
]
}
]
},
{
@@ -542,12 +624,12 @@
},
{
"name": "Medicine-LLM-13B",
"repo": "TheBloke/medicine-LLM-13B-GGUF",
"repo": "mradermacher/medicine-LLM-13B-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "medicine-llm-13b.Q4_K_M.gguf",
"disk_space": 7865963456.0
"filename": "medicine-LLM-13B.Q4_K_M.gguf",
"disk_space": 7865963936.0
}
]
}
@@ -605,6 +687,17 @@
}
]
},
{
"name": "Meta-Llama-3_1-8B",
"repo": "QuantFactory/Meta-Llama-3.1-8B-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Meta-Llama-3.1-8B.Q4_K_M.gguf",
"disk_space": 4920733856.0
}
]
},
{
"name": "Meta-Llama-3-8B-Instruct",
"repo": "QuantFactory/Meta-Llama-3-8B-Instruct-GGUF",
@@ -616,6 +709,17 @@
}
]
},
{
"name": "Meta-Llama-3_1-8B-Instruct",
"repo": "QuantFactory/Meta-Llama-3.1-8B-Instruct-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf",
"disk_space": 4920734240.0
}
]
},
{
"name": "Meta-Llama-3-70B",
"repo": "NousResearch/Meta-Llama-3-70B-GGUF",
@@ -627,6 +731,17 @@
}
]
},
{
"name": "Meta-Llama-3_1-70B",
"repo": "mradermacher/Meta-Llama-3.1-70B-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Meta-Llama-3.1-70B.Q4_K_M.gguf",
"disk_space": 42520393600.0
}
]
},
{
"name": "Meta-Llama-3-70B-Instruct",
"repo": "QuantFactory/Meta-Llama-3-70B-Instruct-GGUF",
@@ -637,6 +752,17 @@
"disk_space": 42520906208.0
}
]
},
{
"name": "Meta-Llama-3_1-70B-Instruct",
"repo": "mradermacher/Meta-Llama-3.1-70B-Instruct-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf",
"disk_space": 42520394080.0
}
]
}
]
},
@@ -655,13 +781,13 @@
]
},
{
"name": "Mistral-7B-Instruct-v0_2",
"repo": "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
"name": "Mistral-7B-Instruct-v0_3",
"repo": "rubra-ai/Mistral-7B-Instruct-v0.3-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "mistral-7b-instruct-v0.2.Q4_K_M.gguf",
"disk_space": 4368439584.0
"filename": "rubra-mistral-7b-instruct-v0.3.Q4_K_M.gguf",
"disk_space": 4896118816.0
}
]
},
@@ -675,6 +801,17 @@
"disk_space": 4368450304.0
}
]
},
{
"name": "Mistral-Nemo-Instruct",
"repo": "bartowski/Mistral-Nemo-Instruct-2407-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Mistral-Nemo-Instruct-2407-Q4_K_M.gguf",
"disk_space": 7477204960.0
}
]
}
]
},
@@ -831,7 +968,18 @@
"disk_space": 4368450688.0
}
]
}
},
{
"name": "OpenChat-3_6-8B-20240522",
"repo": "bartowski/openchat-3.6-8b-20240522-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "openchat-3.6-8b-20240522-Q4_K_M.gguf",
"disk_space": 4920734496.0
}
]
}
]
},
{
@@ -984,6 +1132,28 @@
}
]
},
{
"name": "Phi-3_1-mini-4k-instruct",
"repo": "bartowski/Phi-3.1-mini-4k-instruct-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Phi-3.1-mini-4k-instruct-Q4_K_M.gguf",
"disk_space": 2393232096.0
}
]
},
{
"name": "Phi-3_1-mini-128k-instruct",
"repo": "bartowski/Phi-3.1-mini-128k-instruct-GGUF",
"files": [
{
"name": "q4_K_M",
"filename": "Phi-3.1-mini-128k-instruct-Q4_K_M.gguf",
"disk_space": 2393232640.0
}
]
},
{
"name": "Phi-3-medium-4k-instruct",
"repo": "bartowski/Phi-3-medium-4k-instruct-GGUF",
Expand Down
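
Every entry added to `models.json` in this diff follows the same shape: a family with a `models` list, each model with a `name`, a `repo`, and a `files` list whose items carry `name`, `filename`, and `disk_space` in bytes. A minimal sketch of a pre-merge check for that shape follows. It assumes the top level of the file is a JSON array of family objects, which the hunks suggest but do not show, and the names `validate_catalog` and `gib` are hypothetical helpers, not part of Serge.

```python
import json

# Keys every "files" item carries in the hunks above.
REQUIRED_FILE_KEYS = {"name", "filename", "disk_space"}

# One family copied from the diff, wrapped in the assumed top-level array.
SAMPLE = """
[
  {
    "name": "Mathstral",
    "models": [
      {
        "name": "Mathstral-7B",
        "repo": "MaziyarPanahi/mathstral-7B-v0.1-GGUF",
        "files": [
          {
            "name": "q4_K_M",
            "filename": "mathstral-7B-v0.1.Q4_K_M.gguf",
            "disk_space": 4372811584.0
          }
        ]
      }
    ]
  }
]
"""


def validate_catalog(text: str) -> list[str]:
    """Return a list of problems found in a models.json-style document."""
    try:
        families = json.loads(text)
    except json.JSONDecodeError as exc:
        # Strict JSON rejects trailing commas, so this branch catches them.
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]

    problems = []
    for family in families:
        for model in family.get("models", []):
            label = model.get("name", "<unnamed>")
            if "/" not in model.get("repo", ""):
                problems.append(f"{label}: repo should look like 'owner/name'")
            for entry in model.get("files", []):
                missing = REQUIRED_FILE_KEYS - set(entry)
                if missing:
                    problems.append(f"{label}: missing keys {sorted(missing)}")
                elif not entry["filename"].endswith(".gguf"):
                    problems.append(f"{label}: filename is not a .gguf file")
    return problems


def gib(num_bytes: float) -> float:
    """Convert a disk_space value in bytes to GiB, rounded to 2 decimals."""
    return round(num_bytes / 2**30, 2)


if __name__ == "__main__":
    print(validate_catalog(SAMPLE))
    print(gib(4372811584.0), "GiB")
```

Running a check like this in CI would flag a trailing comma after the last element of an array, since `json.loads` refuses such input.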
2 changes: 1 addition & 1 deletion scripts/serge.env
@@ -1,3 +1,3 @@
LLAMA_PYTHON_VERSION=0.2.82
LLAMA_PYTHON_VERSION=0.2.86
SERGE_ENABLE_IPV4=true
SERGE_ENABLE_IPV6=false
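
`scripts/serge.env` is a plain `KEY=VALUE` file. As a rough illustration of how such a file can be read, the sketch below parses the post-change contents into a dictionary. How Serge itself consumes the file is not shown in this diff, so `parse_env` and `as_bool` are hypothetical helpers and the comment handling is an assumption.

```python
# Contents of scripts/serge.env after this change, as shown in the hunk above.
SAMPLE_ENV = """\
LLAMA_PYTHON_VERSION=0.2.86
SERGE_ENABLE_IPV4=true
SERGE_ENABLE_IPV6=false
"""


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '=' at all
            values[key.strip()] = value.strip()
    return values


def as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    env = parse_env(SAMPLE_ENV)
    print(env["LLAMA_PYTHON_VERSION"])
    print(as_bool(env["SERGE_ENABLE_IPV4"]), as_bool(env["SERGE_ENABLE_IPV6"]))
```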