* Update OpenAI compatibility script for Azure integration
- Set environment variables for Azure OpenAI API configuration.
- Modify model arguments to use the new GPT-4o model version and enable Azure OpenAI support.
- Clean up commented installation instructions for clarity.
* Refactor imports and clean up code in various utility files
- Consolidated import statements in `plm.py` and `utils.py` for better organization.
- Removed redundant blank lines in `eval_utils.py`, `fgqa_utils.py`, `rcap_utils.py`, `rdcap_utils.py`, `rtloc_utils.py`, and `sgqa_utils.py` to enhance readability.
- Ensured consistent import structure across utility files for improved maintainability.
* Update README and add example scripts for model evaluations
- Revised installation instructions to facilitate direct package installation from Git.
- Added detailed usage examples for various models including Aria, LLaVA, and Qwen2-VL.
- Introduced new example scripts for model evaluations, enhancing user guidance for running specific tasks.
- Improved clarity in environmental variable setup and common issues troubleshooting sections.
* Update README to reflect new example script locations and remove outdated evaluation instructions
- Changed paths for model evaluation scripts to point to the new `examples/models` directory.
- Added a note directing users to find more examples in the updated location.
- Removed outdated evaluation instructions for LLaVA on multiple datasets to streamline the documentation.
* Update README to reflect new script locations and enhance evaluation instructions
- Replaced outdated evaluation commands with new script paths in the `examples/models` directory.
- Updated sections for evaluating larger models, including the introduction of new scripts for tensor parallel and SGLang evaluations.
- Streamlined instructions for model evaluation to improve clarity and usability.
Our development will continue on the main branch. We encourage you to give us feedback on which features are desired and how to improve the library further, or to ask questions, in issues or PRs on GitHub.
## Usages

> More examples can be found in [examples/models](examples/models)
**For other LLaVA variants, please change the `conv_template` in `model_args`.**

> `conv_template` is an argument of LLaVA's init function in `lmms_eval/models/llava.py`. You can find the corresponding value in LLaVA's code, most likely in the `conv_templates` dict in `llava/conversations.py`.
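As a minimal sketch, an invocation along these lines passes `conv_template` through `--model_args` (the checkpoint, template, and task names below are placeholders; substitute the ones matching your LLaVA variant):

```bash
# Sketch only: override conv_template via --model_args (values are illustrative)
python3 -m lmms_eval \
    --model llava \
    --model_args pretrained=liuhaotian/llava-v1.5-7b,conv_template=vicuna_v1 \
    --tasks mme \
    --batch_size 1 \
    --log_samples \
    --output_path ./logs/
```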
**Evaluation with vLLM for bigger models (llava-next-72b)**

```bash
bash examples/models/vllm_qwen2vl.sh
```
**More Parameters**

```bash
python3 -m lmms_eval --help
```
**Environmental Variables**

Before running experiments and evaluations, we recommend you export the following environment variables. Some are necessary for certain tasks to run.

```bash
export OPENAI_API_KEY="<YOUR_API_KEY>"
export HF_HOME="<Path to HF cache>"
export HF_TOKEN="<YOUR_API_KEY>"
export HF_HUB_ENABLE_HF_TRANSFER="1"
export REKA_API_KEY="<YOUR_API_KEY>"
# Other possible environment variables include
# ANTHROPIC_API_KEY, DASHSCOPE_API_KEY, etc.
```
**Common Environment Issues**

Sometimes you might encounter common issues, for example errors related to httpx or protobuf. To solve them, you can first try

```bash
python3 -m pip install httpx==0.23.3;
python3 -m pip install protobuf==3.20;
# numpy==2.x can sometimes cause errors
python3 -m pip install numpy==1.26;
# sentencepiece is sometimes required for the tokenizer to work
python3 -m pip install sentencepiece;
```
## Add Customized Model and Dataset
Below are the changes we made to the original API:
- Build context now only passes in `idx` and processes the image and doc during the model response phase. This is because the dataset now contains lots of images and we can't store them in the doc like the original lm-eval-harness, otherwise CPU memory would explode.
- `Instance.args` (`lmms_eval/api/instance.py`) now contains a list of images to be input to the lmms.
- lm-eval-harness supports all HF language models as a single model class. Currently this is not possible for lmms because the input/output formats of lmms in HF are not yet unified. Therefore, we have to create a new class for each lmms model. This is not ideal and we will try to unify them in the future.
---
During the initial stage of our project, we thank:
- [Xiang Yue](https://xiangyue9607.github.io/), [Jingkang Yang](https://jingkang50.github.io/), [Dong Guo](https://www.linkedin.com/in/dongguoset/), and [Sheng Shen](https://sincerass.github.io/) for early discussion and testing.