Awesome Hallucination Papers in MLLMs

A curated list of papers on hallucination in multi-modal large language models (MLLMs).

Survey Papers

This section collects survey papers on hallucination in MLLMs.

  • A Survey on Hallucination in Large Vision-Language Models [paper]

    Arxiv 2024/02

Benchmark Papers

This section collects benchmark papers for evaluating hallucination in MLLMs.

  • Evaluating Object Hallucination in Large Vision-Language Models [paper] [code]

    EMNLP 2023

  • HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models [paper] [code]

    CVPR 2024

  • Aligning Large Multimodal Models with Factually Augmented RLHF [paper] [code]

    Arxiv 2023/09

  • An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation [paper] [code]

    Arxiv 2023/11

  • Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges [paper] [code]

    Arxiv 2023/11

  • Hallucination Benchmark in Medical Visual Question Answering [paper]

    Arxiv 2024/01

  • The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs [paper] [code]

    Arxiv 2024/02

  • Unified Hallucination Detection for Multimodal Large Language Models [paper] [code]

    Arxiv 2024/02

  • Visual Hallucinations of Multi-modal Large Language Models [paper] [code]

    Arxiv 2024/02

  • Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models [paper]

    Arxiv 2024/02

  • PhD: A Prompted Visual Hallucination Evaluation Dataset [paper] [code]

    Arxiv 2024/03

  • Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models [paper] [code]

    Arxiv 2024/04

  • THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models [paper]

    Arxiv 2024/05

Hallucination Mitigation

This section collects papers on mitigating hallucination in MLLMs.

  • Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning [paper] [code]

    ICLR 2024

  • Analyzing and Mitigating Object Hallucination in Large Vision-Language Models [paper] [code]

    ICLR 2024

  • VIGC: Visual Instruction Generation and Correction [paper] [code]

    AAAI 2024

  • OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation [paper] [code]

    CVPR 2024

  • Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding [paper] [code]

    CVPR 2024

  • Hallucination Augmented Contrastive Learning for Multimodal Large Language Model [paper]

    CVPR 2024

  • RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback [paper] [code]

    CVPR 2024

  • Detecting and Preventing Hallucinations in Large Vision Language Models [paper]

    Arxiv 2023/08

  • Evaluation and Analysis of Hallucination in Large Vision-Language Models [paper] [code]

    Arxiv 2023/08

  • CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning [paper]

    Arxiv 2023/09

  • Evaluation and Mitigation of Agnosia in Multimodal Large Language Models [paper]

    Arxiv 2023/09

  • Aligning Large Multimodal Models with Factually Augmented RLHF [paper] [code]

    Arxiv 2023/09

  • HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption [paper]

    Arxiv 2023/10

  • Woodpecker: Hallucination Correction for Multimodal Large Language Models [paper] [code]

    Arxiv 2023/10

  • HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [paper] [code]

    Arxiv 2023/11

  • VOLCANO: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision [paper] [code]

    Arxiv 2023/11

  • Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization [paper]

    Arxiv 2023/11

  • Mitigating Hallucination in Visual Language Models with Visual Supervision [paper]

    Arxiv 2023/11

  • Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites [paper] [code]

    Arxiv 2023/12

  • MOCHa: Multi-Objective Reinforcement Mitigating Caption Hallucinations [paper] [code]

    Arxiv 2023/12

  • Temporal Insight Enhancement: Mitigating Temporal Hallucination in Multimodal Large Language Models [paper]

    Arxiv 2024/01

  • On the Audio Hallucinations in Large Audio-Video Language Models [paper]

    Arxiv 2024/01

  • Skip \n: A simple method to reduce hallucination in Large Vision-Language Models [paper]

    Arxiv 2024/02

  • Unified Hallucination Detection for Multimodal Large Language Models [paper] [code]

    Arxiv 2024/02

  • Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance [paper]

    Arxiv 2024/02

  • EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models [paper]

    Arxiv 2024/02

  • Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models [paper] [code]

    Arxiv 2024/02

  • Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective [paper] [code]

    Arxiv 2024/02

  • Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding [paper]

    Arxiv 2024/02

  • IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding [paper]

    Arxiv 2024/02

  • HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding [paper] [code]

    Arxiv 2024/03

  • Evaluating and Mitigating Number Hallucinations in Large Vision-Language Models: A Consistency Perspective [paper]

    Arxiv 2024/03

  • Debiasing Large Visual Language Models [paper]

    Arxiv 2024/03

  • AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models [paper]

    Arxiv 2024/03

  • What if...?: Counterfactual Inception to Mitigate Hallucination Effects in Large Multimodal Models [paper]

    Arxiv 2024/03

  • Multi-Modal Hallucination Control by Visual Information Grounding [paper]

    Arxiv 2024/03

  • Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination [paper] [code]

    Arxiv 2024/03

  • Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art [paper]

    Arxiv 2024/03

  • Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning [paper]

    Arxiv 2024/03

  • Visual Hallucination: Definition, Quantification, and Prescriptive Remediations [paper]

    Arxiv 2024/03

  • Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models [paper]

    Arxiv 2024/03

  • Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding [paper]

    Arxiv 2024/03

  • Automated Multi-level Preference for MLLMs [paper]

    Arxiv 2024/05

  • CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models [paper]

    Arxiv 2024/05

  • VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap [paper]

    Arxiv 2024/05

  • Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization [paper]

    Arxiv 2024/05

  • Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning [paper]

    Arxiv 2024/05

  • RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in LVLMs [paper]

    Arxiv 2024/05

  • MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification [paper]

    Arxiv 2024/05

  • Mitigating Object Hallucination via Data Augmented Contrastive Tuning [paper]

    Arxiv 2024/05

  • NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models [paper] [code]

    Arxiv 2024/06

  • CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models [paper] [code]

    Arxiv 2024/06

  • Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models [paper]

    Arxiv 2024/06

  • Detecting and Evaluating Medical Hallucinations in Large Vision Language Models [paper]

    Arxiv 2024/06

  • AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models [paper]

    Arxiv 2024/06

  • Hallucination Mitigation Prompts Long-term Video Understanding [paper] [code]

    Arxiv 2024/06

  • Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning? [paper]

    Arxiv 2024/06

  • Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? [paper]

    Arxiv 2024/06

  • VGA: Vision GUI Assistant - Minimizing Hallucinations through Image-Centric Fine-Tuning [paper]

    Arxiv 2024/06

  • AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention [paper] [code]

    Arxiv 2024/06

  • Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models [paper] [code]

    Arxiv 2024/06

  • VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models [paper] [code]

    Arxiv 2024/06

  • Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models [paper] [code]

    Arxiv 2024/06
