From 14947a93aa04b660fbc8d4031c69bc5d060b17d9 Mon Sep 17 00:00:00 2001
From: Walter van Heuven
Date: Wed, 15 May 2024 20:37:38 +0100
Subject: [PATCH] Updated

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index ad86d3a..72f2f7f 100644
--- a/README.md
+++ b/README.md
@@ -8,10 +8,10 @@ The script currently supports the following models:
 - [CogVLM](https://github.com/THUDM/CogVLM)
 - [Kosmos-2](https://github.com/microsoft/unilm/tree/master/kosmos-2)
 - [OpenCLIP](https://github.com/mlfoundations/open_clip)
-- [GPT-4o](https://openai.com/index/hello-gpt-4o/) and GPT-4 Turbo
+- OpenAI's [GPT-4o](https://openai.com/index/hello-gpt-4o/) and GPT-4 Turbo
 - Multimodal models, such as [LLaVA](https://llava-vl.github.io) are supported through [Ollama](https://ollama.com)
 
-All models, except GPT-4V, run locally. GPT-4V requires API access. By default, images are resized so that width and height are maximum 500 pixels before inference. The [Qwen-VL](https://github.com/QwenLM/Qwen-VL) model requires an NVIDIA RTX A4000 (or better), or an M1-Max or better. For inference hardware requirements of Cog-VLM, check the [github page](https://github.com/THUDM/CogVLM).
+All models, except OpenAI's models (e.g., GPT-4o), run locally. OpenAI's models require API access. By default, images are resized so that width and height are at most 500 pixels before inference. The [Qwen-VL](https://github.com/QwenLM/Qwen-VL) model requires an NVIDIA RTX A4000 (or better), or an M1-Max or better. For inference hardware requirements of CogVLM, check the [GitHub page](https://github.com/THUDM/CogVLM).
 
 ## Setup
 