Replacing Pandas in EvaluationRunResult with CustomDataFrame #8793 #8838

Open
wants to merge 18 commits into base: main

Conversation

@blue-gitty blue-gitty commented Feb 10, 2025

Related Issues

  • #8793: Make EvaluationRunResult work without pandas

Proposed Changes:

The aim was to refactor the EvaluationRunResult class so that it no longer relies on pandas. While pandas.DataFrame is a powerful tool for handling tabular data, it is not a strict requirement for this use case. Instead, I implemented a solution using plain Python dictionaries, which are lightweight and sufficient for the task.

However, since the existing unit tests for EvaluationRunResult were designed around pandas.DataFrame behavior, I could not simply swap in plain dictionaries. To bridge this gap, I created a custom class called CustomDataFrame. It mimics the essential functionality of pandas.DataFrame but is built entirely on native Python data structures such as dictionaries and lists.

The CustomDataFrame class supports multiple output formats to stay compatible with the existing tests and use cases. When calling methods such as score_report, to_pandas, or comparative_individual_scores_report, the desired output format must be specified explicitly; otherwise the default, a dict-backed CustomDataFrame, is returned. For example:
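A minimal usage sketch for illustration (the run name and the shapes of inputs and results below are placeholders, not taken from the PR):

```python
inputs = {"question": ["What is Haystack?"]}                        # placeholder evaluation inputs
results = {"accuracy": {"score": 1.0, "individual_scores": [1.0]}}  # placeholder metric results

result = EvaluationRunResult("demo_run", inputs, results)

default_report = result.score_report()                        # default: dict-backed CustomDataFrame
csv_report = result.score_report(output_format="csv")         # CSV string (no file written)
pandas_report = result.score_report(output_format="pandas")   # pandas.DataFrame; requires pandas
```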

How It Works

The CustomDataFrame internally uses dictionaries to store data, with keys representing column names and values representing column data as lists.
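As a small illustration of that layout (hypothetical values, not taken from the PR):

```python
# Column-oriented storage: each key is a column name,
# each value is the list of cells in that column.
data = {
    "metrics": ["accuracy", "exact_match"],
    "score": [0.92, 0.85],
}
```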

Output Methods (a minimal sketch follows this list):

to_dict: Returns the data as a dictionary.

to_csv_file_with_name: Saves the data to a CSV file; if no filename is given, it returns the CSV content as a string.

to_pandas_dataframe: Converts the data to a pandas.DataFrame (optional; requires pandas).
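A minimal sketch of how such a class could look (an illustrative approximation based on the description above, not the exact code in the PR):

```python
import csv
import io
from typing import Any, Dict, List, Optional


class CustomDataFrame:
    # Lightweight, dict-backed stand-in for pandas.DataFrame:
    # keys are column names, values are equal-length lists of column data.

    def __init__(self, data: Dict[str, List[Any]]):
        self.data = data

    def to_dict(self) -> Dict[str, List[Any]]:
        return self.data

    def to_csv_file_with_name(self, filename: Optional[str] = None) -> Optional[str]:
        # Build the CSV in memory; write to disk only if a filename is given.
        buffer = io.StringIO()
        writer = csv.writer(buffer)
        writer.writerow(self.data.keys())
        writer.writerows(zip(*self.data.values()))
        if filename is None:
            return buffer.getvalue()
        with open(filename, "w", newline="", encoding="utf-8") as f:
            f.write(buffer.getvalue())
        return None

    def to_pandas_dataframe(self):
        # Optional dependency: pandas is only imported when explicitly requested.
        import pandas as pd

        return pd.DataFrame(self.data)
```

In this sketch, calling to_csv_file_with_name() without an argument returns the CSV text, while passing a filename writes it to disk, matching the behavior described above.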

How did you test it?

All functionality of CustomDataFrame has been tested to ensure it passes the existing unit tests designed around pandas.DataFrame behavior.

Notes for the reviewer

# Create an EvaluationRunResult instance
result = EvaluationRunResult(run_name, inputs, results)

# Get results as a dictionary (the default CustomDataFrame output)
dict_output = result.score_report(output_format="customdataframe")

# Get results as a CSV file or string; if csv_file is not specified, a CSV string is returned
csv_output = result.score_report(output_format="csv", csv_file="score_report.csv")

# Get results as a pandas DataFrame (if pandas is installed)
pandas_output = result.score_report(output_format="pandas")

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

mathislucka and others added 11 commits February 9, 2025 15:46
…eepset-ai#8812)

* add component checks

* pipeline should run deterministically

* add FIFOQueue

* add agent tests

* add order dependent tests

* run new tests

* remove code that is not needed

* test: intermediate from cycle outputs are available outside cycle

* add tests for component checks (Claude)

* adapt tests for component checks (o1 review)

* chore: format

* remove tests that aren't needed anymore

* add _calculate_priority tests

* revert accidental change in pyproject.toml

* test format conversion

* adapt to naming convention

* chore: proper docstrings and type hints for PQ

* format

* add more unit tests

* rm unneeded comments

* test input consumption

* lint

* fix: docstrings

* lint

* format

* format

* fix license header

* fix license header

* add component run tests

* fix: pass correct input format to tracing

* fix types

* format

* format

* types

* add defaults from Socket instead of signature

- otherwise components with dynamic inputs would fail

* fix test names

* still wait for optional inputs on greedy variadic sockets

- mirrors previous behavior

* fix format

* wip: warn for ambiguous running order

* wip: alternative warning

* fix license header

* make code more readable

Co-authored-by: Amna Mubashar <[email protected]>

* Introduce content tracing to a behavioral test

* Fixing linting

* Remove debug print statements

* Fix tracer tests

* remove print

* test: test for component inputs

* test: remove testing for run order

* chore: update component checks from experimental

* chore: update pipeline and base from experimental

* refactor: remove unused method

* refactor: remove unused method

* refactor: outdated comment

* refactor: inputs state is updated as side effect

- to prepare for AsyncPipeline implementation

* format

* test: add file conversion test

* format

* fix: original implementation deepcopies outputs

* lint

* fix: from_dict was updated

* fix: format

* fix: test

* test: add test for thread safety

* remove unused imports

* format

* test: FIFOPriorityQueue

* chore: add release note

* feat: add AsyncPipeline

* chore: Add release notes

* fix: format

* debug: switch run order to debug ubuntu and windows tests

* fix: consider priorities of other components while waiting for DEFER

* refactor: simplify code

* fix: resolve merge conflict with mermaid changes

* fix: format

* fix: remove unused import

* refactor: rename to avoid accidental conflicts

* fix: track pipeline type

* fix: and extend test

* fix: format

* style: sort alphabetically

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <[email protected]>

* Update test/core/pipeline/features/conftest.py

Co-authored-by: Amna Mubashar <[email protected]>

* Update releasenotes/notes/feat-async-pipeline-338856a142e1318c.yaml

* fix: indentation, do not close loop

* fix: use asyncio.run

* fix: format

---------

Co-authored-by: Amna Mubashar <[email protected]>
Co-authored-by: David S. Batista <[email protected]>
@blue-gitty blue-gitty requested review from a team as code owners February 10, 2025 22:39
@blue-gitty blue-gitty requested review from dfokina and mpangrazzi and removed request for a team February 10, 2025 22:39
@CLAassistant

CLAassistant commented Feb 10, 2025

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the type:documentation Improvements on the docs label Feb 10, 2025
@davidsbatista davidsbatista self-requested a review February 12, 2025 14:05
@davidsbatista
Contributor

@blue-gitty thanks a lot for this - I'm starting a review

@davidsbatista
Contributor

@blue-gitty thanks again for this - there are a few issues with the type checker, and some aspects of the code could be cleaned up and improved a bit - do you mind if I take it from here?

@blue-gitty
Author

@davidsbatista Sure, please go ahead and take it from here. I realize the lint issues now, let me know if I could be of any help. Thanks!

@coveralls
Collaborator

Pull Request Test Coverage Report for Build 13287209985

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 48 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.4%) to 91.126%

Files with Coverage Reduction  | New Missed Lines | %
evaluation/base.py             | 10               | 0.0%
evaluation/eval_run_result.py  | 38               | 66.67%
Totals Coverage Status
Change from base Build 13284465135: -0.4%
Covered Lines: 9416
Relevant Lines: 10333

💛 - Coveralls

Labels: topic:tests, type:documentation

Successfully merging this pull request may close these issues.

Make EvaluationRunResult work without pandas
5 participants