
AGAMPANDEYY/Visual_QA_Bot


Yolov_LLM

Video Object Detection (YOLOv5) and Multimodal Vision-Language Model (LLaVA-13B)

Integrating YOLOv5 (You Only Look Once) object detection with Large Language Models (LLMs) for enhanced object detection and contextual understanding. This project combines state-of-the-art object detection with advanced language processing to improve accuracy and provide detailed context for detected objects. It is suited to applications in autonomous systems, surveillance, and AI-driven analytics.
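The detector-to-LLM hand-off described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the `detections_to_prompt` helper is a hypothetical name, and the detection format assumed here follows the standard `ultralytics/yolov5` `torch.hub` interface (records with `name` and `confidence` fields).

```python
# Hedged sketch: formatting YOLOv5 detections into context for an LLM prompt.
# The helper name and detection schema are assumptions, not the repo's code.

def detections_to_prompt(detections, question):
    """Format a list of YOLOv5-style detections into an LLM context prompt."""
    if not detections:
        context = "no objects detected"
    else:
        context = ", ".join(
            f"{d['name']} ({d['confidence']:.2f})" for d in detections
        )
    return f"Detected objects: {context}. Question: {question}"

# In the full pipeline the detections would come from YOLOv5, e.g.:
#   import torch
#   model = torch.hub.load("ultralytics/yolov5", "yolov5s")
#   detections = model(frame).pandas().xyxy[0].to_dict("records")
# Here we use a hard-coded example so the hand-off is visible:
detections = [
    {"name": "person", "confidence": 0.91},
    {"name": "bicycle", "confidence": 0.78},
]
prompt = detections_to_prompt(detections, "What is the person doing?")
print(prompt)
# Detected objects: person (0.91), bicycle (0.78). Question: What is the person doing?
```

The resulting prompt would then be passed to the multimodal model (LLaVA-13B) along with the frame itself, so the language model can ground its answer in both the image and the detector's labels.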

Fine-tuned model checkpoints are uploaded to the Hugging Face Hub: https://huggingface.co/AgamP/LLM_Custom_1/tree/main
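To pull those checkpoints locally, the standard Hugging Face Hub API can be used. A minimal sketch, assuming only the repo URL above; the `repo_id_from_url` helper is hypothetical, and the file layout inside `AgamP/LLM_Custom_1` is not shown here:

```python
# Hedged sketch: deriving the Hub repo id from the checkpoint URL above.
from urllib.parse import urlparse

def repo_id_from_url(url):
    """Extract a Hub repo id ('user/name') from a huggingface.co URL."""
    parts = urlparse(url).path.strip("/").split("/")
    return "/".join(parts[:2])

url = "https://huggingface.co/AgamP/LLM_Custom_1/tree/main"
repo_id = repo_id_from_url(url)
print(repo_id)  # AgamP/LLM_Custom_1

# Downloading the checkpoint files requires network access, e.g.:
#   from huggingface_hub import snapshot_download
#   local_dir = snapshot_download(repo_id=repo_id)
```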

The proposal, project report, and summary are available in this repository; read them to understand the context.
