ComfyUI-Ovis2

A ComfyUI custom node set for integrating Ovis2, a powerful multimodal large language model designed to analyze images and videos.

Features

Image Captioning: Generate detailed descriptions of images
Multi-Image Analysis: Compare and analyze up to 4 images simultaneously
Video Description: Process video frames for scene understanding
Auto-Download: Automatically download models from Hugging Face
Multiple Models: Support for all Ovis2 model sizes (1B to 34B parameters)

Installation

Option 1: Using ComfyUI Manager (Recommended)

Install ComfyUI Manager if you haven't already
Open ComfyUI Manager
Go to the "Install Custom Nodes" tab
Click "Install from Git URL"
Enter the GitHub repository URL: https://github.com/Andro-Meta/ComfyUI-Ovis2.git
Click Install

Option 2: Manual Installation

Navigate to your ComfyUI installation folder
Go to the custom_nodes directory (create it if it doesn't exist)

Clone this repository:

git clone https://github.com/Andro-Meta/ComfyUI-Ovis2.git

Install the required dependencies (run this from the ComfyUI installation folder, because the path below is relative to it):

pip install -r custom_nodes/ComfyUI-Ovis2/requirements.txt

Restart ComfyUI

Dependencies

transformers>=4.46.2
huggingface-hub>=0.23.0
torch>=2.4.0
pillow>=10.3.0
flash-attn>=2.7.0
numpy>=1.25.0
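To see at a glance whether an environment meets these minimums, a small check along the following lines may help. It is only a sketch and not part of this repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are made up for illustration, and the comparison looks only at the leading numeric components of a version, so local suffixes such as `+cu121` and pre-release tags are ignored.

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "transformers": "4.46.2",
    "huggingface-hub": "0.23.0",
    "torch": "2.4.0",
    "pillow": "10.3.0",
    "flash-attn": "2.7.0",
    "numpy": "1.25.0",
}


def parse_version(text):
    """Return the leading dotted-number part of a version as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum` (numeric components only)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "2.5" and "2.5.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Run it with the same Python interpreter that ComfyUI uses; otherwise the report describes a different environment.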

Usage

After installation, you'll find four new nodes in the "Ovis2" category:

Load Ovis2 Model

Loads the Ovis2 model with configurable settings:

model_name: Choose which Ovis2 model to load
precision: Set numerical precision
max_token_length: Maximum context length
device: Choose CPU or CUDA for inference
auto_download: Enable or disable automatic model downloading
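ComfyUI nodes declare their inputs through an `INPUT_TYPES` classmethod, so settings like the ones above are typically exposed roughly as follows. This is a sketch of that convention, not this repository's code: the class name `Ovis2ModelLoaderSketch`, the option lists, the defaults and ranges, and the `OVIS2_MODEL` type name are all assumptions made for illustration.

```python
class Ovis2ModelLoaderSketch:
    """Hypothetical loader node; shows the input declaration pattern only."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # A list as the first tuple element is rendered as a dropdown.
                "model_name": (["Ovis2-1B", "Ovis2-2B"],),  # assumed subset of sizes
                "precision": (["bfloat16", "float16", "float32"],),  # assumed options
                "max_token_length": (
                    "INT",
                    {"default": 8192, "min": 1024, "max": 32768},  # assumed range
                ),
                "device": (["cuda", "cpu"],),
                "auto_download": ("BOOLEAN", {"default": True}),
            }
        }

    RETURN_TYPES = ("OVIS2_MODEL",)  # assumed custom type name
    FUNCTION = "load"  # name of the method ComfyUI calls
    CATEGORY = "Ovis2"

    def load(self, model_name, precision, max_token_length, device, auto_download):
        # Placeholder: a real node would load the model here. This sketch
        # only passes the chosen settings through as a dictionary.
        settings = {
            "model_name": model_name,
            "precision": precision,
            "max_token_length": max_token_length,
            "device": device,
            "auto_download": auto_download,
        }
        return (settings,)
```

ComfyUI expects node methods to return a tuple matching `RETURN_TYPES`, which is why `load` returns `(settings,)` rather than the dictionary itself.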

Ovis2 Image Caption

Generates detailed descriptions of images:

model: Connect to the Ovis2 model
image: Connect to an image input
prompt: Instructions for the model
max_new_tokens: Maximum length of generated text
temperature: Controls randomness
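As a rough illustration of what `temperature` usually does during text generation: the model's raw scores are divided by the temperature before being turned into probabilities, so low values concentrate probability on the top candidate and high values flatten the distribution. The snippet below shows that general technique with made-up scores; the function name is hypothetical and this is not code from this repository.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

For captions that should stay close to what is visible in the image, a lower temperature is generally the safer choice.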

Ovis2 Multi-Image Analysis

Analyzes multiple images together:

Supports up to 4 images simultaneously
Great for comparison or sequence analysis

Ovis2 Video Frames Description

Processes video frames:

Works with ComfyUI's standard video frame output format
Controls for frame_skip and max_frames to handle longer videos
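One plausible reading of these two controls is "keep every `frame_skip`-th frame, then stop at `max_frames`". The sketch below implements that reading; the exact semantics used by the node, the function name `select_frames`, and the default values are assumptions.

```python
def select_frames(frames, frame_skip=1, max_frames=8):
    """Keep every `frame_skip`-th frame, then cap the result at `max_frames`."""
    if frame_skip < 1 or max_frames < 1:
        raise ValueError("frame_skip and max_frames must be at least 1")
    return frames[::frame_skip][:max_frames]


frames = list(range(100))  # stand-in for 100 video frames
picked = select_frames(frames, frame_skip=10, max_frames=5)
print(picked)  # [0, 10, 20, 30, 40]
```

Raising `frame_skip` spreads the selected frames over more of the video, while lowering `max_frames` reduces how many images the model has to process.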

Example Workflows

Basic Image Captioning

Add a "Load Image" node and select an image
Add a "Load Ovis2 Model" node and choose your preferred model size
Add an "Ovis2 Image Caption" node
Connect:
- The image output to the "image" input on the caption node
- The model output to the "model" input on the caption node
Run the workflow to generate a detailed caption

Multi-Image Comparison

Load two or more images using "Load Image" nodes
Add a "Load Ovis2 Model" node
Add an "Ovis2 Multi-Image Analysis" node
Connect:
- The model output to the "model" input
- Each image to the corresponding image input
Set a prompt such as "Compare these images and describe their similarities and differences"
Run the workflow to get a comparative analysis

Model Storage

Models are stored in the models/ovis directory inside your ComfyUI installation. The nodes will automatically create this directory if it doesn't exist.
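The directory handling described above can be sketched with `pathlib`. The helper names and the one-subfolder-per-model layout are assumptions for illustration, not the repository's actual implementation.

```python
import tempfile
from pathlib import Path


def ensure_model_dir(comfyui_root):
    """Create <comfyui_root>/models/ovis if needed and return its path."""
    model_dir = Path(comfyui_root) / "models" / "ovis"
    model_dir.mkdir(parents=True, exist_ok=True)
    return model_dir


def is_downloaded(model_dir, model_name):
    """Treat a non-empty subfolder named after the model as already downloaded.

    The per-model subfolder layout is an assumption.
    """
    target = Path(model_dir) / model_name
    return target.is_dir() and any(target.iterdir())


# Demonstration in a throwaway folder instead of a real ComfyUI installation.
with tempfile.TemporaryDirectory() as root:
    model_dir = ensure_model_dir(root)
    print(model_dir.relative_to(root))
    print(is_downloaded(model_dir, "Ovis2-1B"))
```

A check like `is_downloaded` is also a quick way to confirm where files ended up when a model fails to load.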

Troubleshooting

Memory Issues

If you encounter CUDA out of memory errors, try:

Using a smaller model (Ovis2-1B or Ovis2-2B)
Reducing the image size before processing
Switching to "float16" precision
Reducing max_token_length
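To judge whether a model is likely to fit on a given GPU, a back-of-the-envelope estimate of the memory taken by the weights alone is parameter count times bytes per parameter. The sketch below uses nominal parameter counts taken from the model names, which are approximations, and it ignores activations, image tokens, and the key-value cache, so real usage will be higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gib(num_params, precision):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


# Nominal sizes implied by the model names (approximate, for illustration).
for label, params in [("Ovis2-1B", 1e9), ("Ovis2-2B", 2e9), ("Ovis2-34B", 34e9)]:
    fp16 = weight_memory_gib(params, "float16")
    fp32 = weight_memory_gib(params, "float32")
    print(f"{label}: about {fp16:.1f} GiB in float16, {fp32:.1f} GiB in float32")
```

Halving the precision from float32 to float16 halves this figure, which is why switching precision is one of the first things to try.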

Model Loading Errors

Check if auto_download is enabled
Ensure you have a working internet connection during the first run, when the model is downloaded
Check if the model files are already downloaded to the correct location

Import Errors

Verify that all dependencies are correctly installed
Check the ComfyUI console for specific error messages

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

AIDC-AI for creating the Ovis2 models
ComfyUI for the amazing Stable Diffusion interface
