From 3aa462e00bef6da3a5445a4588dda2e267f3eac1 Mon Sep 17 00:00:00 2001
From: Haotian Tang
Date: Fri, 3 May 2024 10:49:39 -0400
Subject: [PATCH 1/3] Update README.md

---
 README.md              | 9 +++++----
 demo_trt_llm/README.md | 6 +++---
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index a55efa8b..7d30fb60 100644
--- a/README.md
+++ b/README.md
@@ -16,8 +16,9 @@ VILA is a visual language model (VLM) pretrained with interleaved image-text dat
 
 ## 💡 News
 
-- [2024/05] We release [AWQ](https://arxiv.org/pdf/2306.00978.pdf)-quantized 4bit VILA-1.5 models supported by [TinyChat](https://github.com/mit-han-lab/llm-awq/tree/main/tinychat) and [TensorRT-LLM](demo_trt_llm) backends.
-- [2024/05] We release VILA-1.5, which comes with four model sizes (3B/8B/13B/40B) and offers native support for multi-image and video understanding.
+- [2024/05] We release VILA-1.5, which offers **video understanding support** and comes with four model sizes (3B/8B/13B/40B).
+- [2024/05] We release [AWQ](https://arxiv.org/pdf/2306.00978.pdf)-quantized 4bit VILA-1.5 models supported by [TinyChat](https://github.com/mit-han-lab/llm-awq/tree/main/tinychat) and [TensorRT-LLM](demo_trt_llm) backends.
+- [2024/03] VILA has been accepted by CVPR 2024!
 - [2024/02] We release [AWQ](https://arxiv.org/pdf/2306.00978.pdf)-quantized 4bit VILA models, deployable on Jetson Orin and laptops through [TinyChat](https://github.com/mit-han-lab/llm-awq/tree/main/tinychat) and [TinyChatEngine](https://github.com/mit-han-lab/TinyChatEngine).
 - [2024/02] VILA is released. We propose interleaved image-text pretraining that enables multi-image VLM. VILA comes with impressive in-context learning capabilities. We open source everything: including training code, evaluation code, datasets, model ckpts.
 - [2023/12] [Paper](https://arxiv.org/abs/2312.07533) is on Arxiv!
@@ -224,7 +225,7 @@ python -W ignore llava/eval/run_vila.py \
     --model-path Efficient-Large-Model/VILA1.5-3b \
     --conv-mode vicuna_v1 \
     --query "