Commit

Merge remote-tracking branch 'origin/main' into rag-task-batching
hummerichsander committed Jul 18, 2024
2 parents 915c132 + dc296f2 commit 52efbb6
Showing 4 changed files with 167 additions and 151 deletions.
7 changes: 7 additions & 0 deletions KNOWLEDGE.md
@@ -17,6 +17,13 @@
- Medium quality, good performance, medium resources
- <https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf>

### Llama-3-SauerkrautLM-70b-Instruct

- Good quality, medium performance, high resources
- <https://huggingface.co/redponike/Llama-3-SauerkrautLM-70b-Instruct-GGUF>

### Still to test

- <https://huggingface.co/LoneStriker/OpenBioLLM-Llama3-8B-GGUF>
- <https://huggingface.co/LoneStriker/OpenBioLLM-Llama3-70B-GGUF>
- <https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-gguf/resolve/main/ggml-model-Q8_0.gguf>
2 changes: 1 addition & 1 deletion compose/docker-compose.prod.yml
@@ -94,7 +94,7 @@ services:
- 9610:8080
volumes:
- models_data:/models
- entrypoint: "/bin/bash -c '/llama-server -mu $${LLM_MODEL_URL} -ngl 50 -cb -c 4096 --host 0.0.0.0 --port 8080'"
+ entrypoint: "/bin/bash -c '/llama-server -mu $${LLM_MODEL_URL} -ngl 99 -cb -c 4096 --host 0.0.0.0 --port 8080'"
deploy:
# <<: *deploy
resources:
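
For readers unfamiliar with the `llama-server` flags in the entrypoint above, the following is a minimal sketch of how the same command line could be assembled, with each short flag annotated. The helper name `build_llama_server_command` and the example model URL are hypothetical and not part of this repository. The note on `-ngl 99` reflects a common llama.cpp convention (a value above the model's layer count offloads every layer to the GPU) and is an assumption about the intent of this change.

```python
import shlex


def build_llama_server_command(model_url, gpu_layers=99, ctx_size=4096,
                               host="0.0.0.0", port=8080):
    """Assemble an argument list mirroring the entrypoint in the diff above.

    Hypothetical helper for illustration only.
    """
    return [
        "/llama-server",
        "-mu", model_url,          # --model-url: fetch the GGUF model from this URL
        "-ngl", str(gpu_layers),   # --n-gpu-layers: number of layers offloaded to the GPU
        "-cb",                     # --cont-batching: enable continuous batching
        "-c", str(ctx_size),       # --ctx-size: context window in tokens
        "--host", host,
        "--port", str(port),
    ]


# Placeholder URL; the compose file reads the real one from LLM_MODEL_URL.
command = build_llama_server_command("https://example.com/model.gguf")
print(shlex.join(command))
```

Raising `gpu_layers` from 50 to 99 only helps if the GPU has enough memory for the whole model; otherwise a lower value keeps part of the model on the CPU.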