Commit

remove
lvhan028 committed Aug 14, 2023
1 parent 92b326e commit 59fd0a3
Showing 1 changed file with 0 additions and 13 deletions.
13 changes: 0 additions & 13 deletions docs/en/serving.md
@@ -34,19 +34,6 @@ bash workspace/service_docker_up.sh

</details>

<details open>
<summary><b>7B with INT4 weight only quantization</b></summary>

```shell
python3 -m lmdeploy.serve.turbomind.deploy llama2 /path/to/llama-2-7b-chat-hf \
--model_format awq \
--group_size 128 \
--quant_path /path/to/awq-quant-weight.pt
bash workspace/service_docker_up.sh
```

</details>

## Serving [LLaMA](https://github.com/facebookresearch/llama)

Weights for the LLaMA models can be obtained by filling out [this form](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform)