
Conversation


@ofermend ofermend commented Nov 8, 2025

Summary

  • Added configurable question type distribution control for query generation
  • Users can now customize the percentage of each question type (directly answerable, reasoning required, unanswerable, partially answerable)
  • Weights are automatically normalized and validated
  • Setting a weight to 0 disables that question type entirely
  • Updated documentation and configuration examples

Changes

Core Implementation (open_rag_eval/query_generation/llm_generator.py:27-376)

  • Added question_type_weights parameter to LLMQueryGenerator.__init__() with configurable distribution
  • Implemented weight validation to ensure non-negative values and at least one active type
  • Implemented weight normalization to convert raw weights to percentages (sum to 100%)
  • Added dynamic prompt building based on enabled question types and their percentages
  • Improved questions-per-doc calculation with 1.5x buffer for deduplication/filtering
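The validation and normalization behavior described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's actual implementation; the four question-type keys are taken from the PR description, and the helper name `normalize_weights` is hypothetical.

```python
# Illustrative sketch only -- not the code merged in this PR.
VALID_TYPES = (
    "directly_answerable",
    "reasoning_required",
    "unanswerable",
    "partially_answerable",
)


def normalize_weights(weights: dict) -> dict:
    """Validate raw weights and convert them to percentages summing to 100."""
    unknown = set(weights) - set(VALID_TYPES)
    if unknown:
        raise ValueError(f"Invalid question type keys: {sorted(unknown)}")
    if any(v < 0 for v in weights.values()):
        raise ValueError("Question type weights must be non-negative")
    # Missing types default to 0, i.e. they are disabled.
    full = {key: weights.get(key, 0) for key in VALID_TYPES}
    total = sum(full.values())
    if total == 0:
        raise ValueError("At least one question type weight must be greater than 0")
    return {key: (value / total) * 100 for key, value in full.items()}
```

For example, passing `{"directly_answerable": 1, "reasoning_required": 1}` would yield 50% for each of those two types and 0% for the rest.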

Configuration & Documentation

  • Updated README.md with comprehensive examples and explanation of the feature
  • Added question_types section to all config examples (CSV, local, Vectara)
  • Documented how to disable question types by setting weight to 0
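A config fragment along these lines illustrates the feature. The section name `question_types` comes from the PR description; the specific type weights shown here are hypothetical values, and the exact layout may differ from the merged config examples.

```yaml
# Hypothetical example; see the merged config_examples/ files for the real layout.
question_types:
  directly_answerable: 40
  reasoning_required: 30
  unanswerable: 20
  partially_answerable: 10
  # Weights are auto-normalized, so they need not sum to 100.
  # Setting a weight to 0 disables that question type entirely.
```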

Integration (open_rag_eval/run_query_generation.py:201-237)

  • Integrated question type weights from config into generator initialization
  • Added OmegaConf to dict conversion for proper weight handling
  • Simplified document loading parameter passing

Testing (tests/query_generation/test_llm_generator.py)

  • Added 10 comprehensive test cases covering:
    • Default weight behavior (equal distribution)
    • Custom weight normalization
    • Auto-normalization of arbitrary weights
    • Disabling question types with zero weights
    • Error handling for invalid weights (all zeros, negative values, invalid keys)
    • Prompt generation with enabled/disabled types
    • Partial weight specification

@ofermend ofermend requested a review from Copilot November 8, 2025 23:55

Copilot AI left a comment


Pull Request Overview

This PR adds configurable question type weighting to the LLM query generator, allowing users to control the distribution of different question types (directly answerable, reasoning required, unanswerable, and partially answerable) in the generated queries.

Key changes:

  • Added question_type_weights parameter to LLMQueryGenerator with weight validation and normalization
  • Replaced hardcoded question type prompts with dynamically generated instructions based on configured weights
  • Updated configuration examples and documentation to demonstrate the new feature

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Summary per file:

| File | Description |
| --- | --- |
| open_rag_eval/query_generation/llm_generator.py | Implements question type weighting with validation, normalization, and dynamic prompt generation |
| open_rag_eval/run_query_generation.py | Integrates question type weights from config, converts OmegaConf to dict, and refactors to inline parameter passing |
| tests/query_generation/test_llm_generator.py | Adds comprehensive test coverage for weight validation, normalization, partial specification, and prompt generation |
| config_examples/query_generation_vectara.yaml | Adds example configuration for question type weights |
| config_examples/query_generation_local.yaml | Adds example configuration with an additional commented example showing how to disable types |
| config_examples/query_generation_csv.yaml | Adds example configuration for question type weights |
| README.md | Documents the new question type customization feature with examples |
Comments suppressed due to low confidence (1)

open_rag_eval/run_query_generation.py:1

  • The magic number 1.5 is unexplained. Consider extracting it as a named constant (e.g., DEDUPLICATION_BUFFER_MULTIPLIER = 1.5) to improve code clarity and make it easier to adjust if needed.

```python
"""Main orchestration script for query generation."""
```


@ofermend ofermend requested review from Copilot and vish119 November 11, 2025 02:22
Copilot finished reviewing on behalf of ofermend November 11, 2025 02:24

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.



```python
total = sum(full_weights.values())
if total == 0:
    raise ValueError("Cannot normalize weights: all weights are zero")
return {key: (value / total) * 100 for key, value in full_weights.items()}
```

Copilot AI Nov 11, 2025


Missing blank line between method definitions. According to PEP 8, there should be two blank lines between method definitions in a class.

Suggested change:

```diff
 return {key: (value / total) * 100 for key, value in full_weights.items()}
+
```
Comment on lines +103 to +108:

```python
# Check that at least one weight is positive
if all(v == 0 for v in weights.values()):
    raise ValueError(
        "At least one question type weight must be greater than 0"
    )
```

Copilot AI Nov 11, 2025


Inconsistent validation behavior for partial weight specifications. The _validate_weights method at line 104 checks if all values in the provided weights dict are zero, but _normalize_weights fills missing keys with 0. This means if a user provides {'directly_answerable': 0} (only one key), it will pass validation (since it's checking the provided dict), but when normalized, all four types will have 0 weight, causing the normalization to fail with "Cannot normalize weights: all weights are zero" at line 124. The validation should check the filled weights dict to provide a consistent error message.

Suggested change:

```diff
-# Check that at least one weight is positive
-if all(v == 0 for v in weights.values()):
-    raise ValueError(
-        "At least one question type weight must be greater than 0"
-    )
+# Check that at least one weight is positive (after filling missing keys with 0)
+full_weights = {key: weights.get(key, 0) for key in valid_keys}
+if all(v == 0 for v in full_weights.values()):
+    raise ValueError(
+        "At least one question type weight must be greater than 0"
+    )
```

Comment on lines +365 to +366:

```python
self.assertEqual(percentages.get('unanswerable', 0), 0)
self.assertEqual(percentages.get('partially_answerable', 0), 0)
```

Copilot AI Nov 11, 2025


[nitpick] The .get() method with default value is unnecessary here. The _normalize_weights method always returns all four question type keys (see lines 119-121 in llm_generator.py where it fills in missing keys). These assertions can be simplified to self.assertEqual(percentages['unanswerable'], 0) and self.assertEqual(percentages['partially_answerable'], 0).

Suggested change:

```diff
-self.assertEqual(percentages.get('unanswerable', 0), 0)
-self.assertEqual(percentages.get('partially_answerable', 0), 0)
+self.assertEqual(percentages['unanswerable'], 0)
+self.assertEqual(percentages['partially_answerable'], 0)
```

