From 1ca4f79fbfa8ba3355fa33c560336ff7d8f2b3fe Mon Sep 17 00:00:00 2001
From: Glenn Fernandes <19950242+glenn124f@users.noreply.github.com>
Date: Sat, 22 Jun 2024 12:15:03 -0500
Subject: [PATCH] Update inference_on_multiple_gpus.md

The previous instruction for downloading weights was: "you can download in
advance or use `openbmb/MiniCPM-Llama3-V-2_5`". Using
`openbmb/MiniCPM-Llama3-V-2_5` directly does not provide all the files
necessary to run this code, e.g. resampler.py and the index JSON.
---
 docs/inference_on_multiple_gpus.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/inference_on_multiple_gpus.md b/docs/inference_on_multiple_gpus.md
index acd0dbf..cce7581 100644
--- a/docs/inference_on_multiple_gpus.md
+++ b/docs/inference_on_multiple_gpus.md
@@ -29,8 +29,13 @@ from accelerate import init_empty_weights, infer_auto_device_map, load_checkpoin
 
 2. Download model weights.
 
+```bash
+# Make sure you have git-lfs installed (https://git-lfs.com)
+git lfs install
+git clone https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
+```
 ```python
-MODEL_PATH = '/local/path/to/MiniCPM-Llama3-V-2_5' # you can download in advance or use `openbmb/MiniCPM-Llama3-V-2_5`
+MODEL_PATH = '/local/path/to/MiniCPM-Llama3-V-2_5'
 ```
 
 3. Determine the distribution of layers on multiple GPUs.