diff --git a/docs/inference_on_multiple_gpus.md b/docs/inference_on_multiple_gpus.md
index acd0dbf..cce7581 100644
--- a/docs/inference_on_multiple_gpus.md
+++ b/docs/inference_on_multiple_gpus.md
@@ -29,8 +29,13 @@ from accelerate import init_empty_weights, infer_auto_device_map, load_checkpoin
 
 2. Download model weights.
 
+```bash
+# Make sure you have git-lfs installed (https://git-lfs.com)
+git lfs install
+git clone https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
+```
 ```python
-MODEL_PATH = '/local/path/to/MiniCPM-Llama3-V-2_5' # you can download in advance or use `openbmb/MiniCPM-Llama3-V-2_5`
+MODEL_PATH = '/local/path/to/MiniCPM-Llama3-V-2_5'
 ```
 
 3. Determine the distribution of layers on multiple GPUs.