
Is there any plan to support reason in LLMMetrics and EvaluationResult? #1813

Open

parkerzf opened this issue Jan 6, 2025 · 4 comments

Labels: question (Further information is requested)

parkerzf commented Jan 6, 2025

[x] I checked the documentation and related resources and couldn't find an answer to my question.

Your Question
For evaluation explainability, it would be very valuable to also include the reason in the result, so that we can better understand how LLM metrics make their decisions and improve the metrics accordingly.

Is there any plan to support it? Thanks!

Code Examples
NA

Additional context
Anything else you want to share with us?

parkerzf added the question label Jan 6, 2025
jjmachan (Member) commented Jan 7, 2025

@parkerzf you are right, and we currently do have something like https://docs.ragas.io/en/stable/howtos/applications/_metrics_llm_calls to help with this. Can you check and see if it works for your use case?
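
(As a rough sketch of the same idea, not taken from that how-to: if your ragas version's evaluate() accepts LangChain callbacks, which the docs use for LangSmith tracing, a small handler can capture the prompts and raw outputs the metrics send to the judge LLM, and those outputs include the reason behind each verdict. The handler below and the assumption that callbacks are forwarded to every metric call should be verified against your ragas version:

from langchain_core.callbacks import BaseCallbackHandler


class CaptureLLMCalls(BaseCallbackHandler):
    """Record the prompts and raw completions the metrics send to the judge LLM."""

    def __init__(self):
        self.prompts = []
        self.outputs = []

    # completion-style models fire this
    def on_llm_start(self, serialized, prompts, **kwargs):
        self.prompts.append(prompts)

    # chat models fire this instead of on_llm_start
    def on_chat_model_start(self, serialized, messages, **kwargs):
        self.prompts.append(messages)

    def on_llm_end(self, response, **kwargs):
        # response is an LLMResult; keep the generated text, which for
        # aspect-critic-style metrics contains the verdict and the reason
        self.outputs.append([g.text for gen in response.generations for g in gen])


handler = CaptureLLMCalls()
# results = evaluate(your_dataset, metrics=[harmfulness], callbacks=[handler])
# then inspect handler.prompts / handler.outputs

)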

parkerzf (Author) commented Jan 7, 2025

Hey @jjmachan, thanks for the reply! I think it is very close to what I am looking for.

Ideally, I would like to use it as follows:

from datasets import load_dataset
from ragas import EvaluationDataset
from ragas import evaluate
from ragas.metrics._aspect_critic import harmfulness

dataset = load_dataset("explodinggradients/amnesty_qa", "english_v3")


eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

results = evaluate(eval_dataset[:5], metrics=[harmfulness])
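# `including_trace=True` is the proposed option (it does not exist yet) for
# returning each metric's reason alongside its score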
results.to_pandas(including_trace=True)

The output dataframe would then contain the following columns:
["user_input", "response", "harmfulness", "harmfulness_reason"]

What is the ETA of this new feature? I would like to try it out.

sahusiddharth added the waiting 🤖 label Jan 22, 2025
github-actions bot commented Jan 22, 2025

Closing after 8 days of waiting for the additional info requested.

github-actions bot removed the waiting 🤖 label Jan 22, 2025
jjmachan reopened this Jan 22, 2025
jjmachan (Member) commented
Hey @parkerzf, I don't think we will add it to the pandas dataframe: it would be redundant, and I don't think we can show multiple steps there, for example for faithfulness, or for metrics where the LLM is just one part of the process, like context recall.

What we can do this for is aspect critic, since the verdict and the reason are the only fields it shows. But in order to do that you can write a simple function; I can write it up for you if you want.
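
(As a rough illustration of what such a simple function could look like, here is a standalone sketch of the aspect-critic pattern using the OpenAI client directly, not ragas internals; the prompt wording, model name, and helper names are assumptions:

import json

import pandas as pd
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CRITIC_PROMPT = """You are judging a response against the following aspect:
{definition}

Question: {user_input}
Response: {response}

Reply in JSON with keys "verdict" (0 or 1) and "reason" (one sentence)."""


def critique_with_reason(user_input, response, definition):
    # ask the judge LLM for a verdict plus the reasoning behind it
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # example model
        messages=[{
            "role": "user",
            "content": CRITIC_PROMPT.format(
                definition=definition, user_input=user_input, response=response
            ),
        }],
        response_format={"type": "json_object"},
    )
    return json.loads(completion.choices[0].message.content)


def critique_dataset(samples, definition, name="harmfulness"):
    # build the dataframe layout proposed above: <name> and <name>_reason columns
    rows = []
    for s in samples:
        result = critique_with_reason(s["user_input"], s["response"], definition)
        rows.append({
            "user_input": s["user_input"],
            "response": s["response"],
            name: result["verdict"],
            f"{name}_reason": result["reason"],
        })
    return pd.DataFrame(rows)

Using ragas' own AspectCritic prompt and LLM would keep the verdicts consistent with the reported scores; this only shows the general shape of the post-processing function.)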
