E2E evaluation pipeline dedicated to RAG. A successor benchmark to BERGEN from NAVER Labs (a.k.a. BERGEN UP✨)

BERGEN-UP

A new version of BERGEN (a.k.a. BERGEN UP✨)

BERGEN (BEnchmarking Retrieval-augmented GENeration) is a library by NAVER Labs designed to benchmark RAG systems, with a focus on question answering (QA). It addresses the challenge of inconsistent benchmarking when comparing approaches and understanding the impact of each component in a RAG pipeline. Unlike BERGEN, BERGEN-UP is an end-to-end evaluation pipeline that focuses on the diversity of RAG pipelines and the functionality of each module.

🍒 Key Points

  • E2E Evaluation Pipeline for RAG
    • Chunking
      • token level
        • recall
        • precision
        • iou
    • Pre-Retrieval
    • Retrieval
    • Post-Retrieval
    • Generation
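To make the token-level chunking metrics concrete, here is an illustrative sketch (not BERGEN-UP's actual implementation) of how recall, precision, and IoU can be computed between gold and predicted chunks, where each chunk is an assumed `(start, end)` token-index range with an exclusive end:

```python
# Illustrative token-level chunking metrics. Chunks are (start, end) token
# index ranges, end-exclusive. Function name and representation are
# assumptions for this sketch, not BERGEN-UP's API.

def token_level_metrics(gold_spans, pred_spans):
    # Expand each span list into the set of token indices it covers.
    gold = {t for s, e in gold_spans for t in range(s, e)}
    pred = {t for s, e in pred_spans for t in range(s, e)}

    overlap = len(gold & pred)          # tokens covered by both
    union = len(gold | pred)            # tokens covered by either

    return {
        "recall": overlap / len(gold) if gold else 0.0,      # overlap / gold
        "precision": overlap / len(pred) if pred else 0.0,   # overlap / pred
        "iou": overlap / union if union else 0.0,            # overlap / union
    }

# Example: gold chunk covers tokens 0-9, predicted chunk covers tokens 5-14.
# Overlap is 5 tokens, so recall = precision = 0.5 and IoU = 5/15 ≈ 0.333.
metrics = token_level_metrics([(0, 10)], [(5, 15)])
```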

🥑 How to run the pipeline?

1. Write your evaluation settings in conf/config.yaml
2. Run the script below:
$ uv run pipeline.py label='__experiments_name__'
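The `label='...'` override suggests a Hydra-style config. As a purely illustrative sketch, a conf/config.yaml might look like the fragment below; every key name here is an assumption, so check the repository's own conf/ directory for the actual schema:

```yaml
# Hypothetical sketch only — field names are illustrative assumptions,
# not BERGEN-UP's actual config schema.
label: my_rag_eval            # overridden on the CLI via label='...'
chunking:
  metrics: [recall, precision, iou]   # token-level metrics listed above
```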
