Currently, metric.measure() only returns the score, while evaluate() returns both score and reason. Since measure() internally calls evaluate() and stores the reason in self.reason, it would be more convenient if measure() returned both values directly.
This would be particularly useful when evaluating in batches, as it avoids the need to access metric.reason separately for each instance. Instead of handling post-processing in multiple steps, it would be cleaner and more efficient to get both the score and the reason directly as return values from measure().
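To make the current two-step pattern concrete, here is a minimal, runnable sketch. `ToyGEval` is a stand-in class that mimics the behavior described above (score returned, reason stored as an attribute); it is not the real deepeval metric.

```python
# Stand-in metric mimicking current behavior: measure() returns only the
# score, while the reason is stored on the instance as self.reason.
class ToyGEval:
    def evaluate(self, test_case):
        # The real metric would call an LLM here; stubbed for illustration.
        return 0.8, f"Output matched expectations for {test_case!r}"

    def measure(self, test_case):
        self.score, self.reason = self.evaluate(test_case)
        return self.score  # the reason is NOT returned

# Batch evaluation today requires a second step per instance:
results = []
for case in ["case-1", "case-2"]:
    metric = ToyGEval()
    score = metric.measure(case)
    results.append((score, metric.reason))  # extra attribute access
```

The extra `metric.reason` lookup inside the loop is exactly the post-processing step the request aims to eliminate.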
Proposed Solution
Modify metric.measure() to return a tuple (score, reason), similar to what evaluate() provides, rather than requiring users to retrieve metric.reason separately.
This would make the API more intuitive and reduce extra steps in evaluation pipelines.
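The proposed change could look like the following sketch, again using a hypothetical `ToyGEval` stand-in rather than the actual deepeval implementation:

```python
class ToyGEval:
    def evaluate(self, test_case):
        # Stubbed LLM call for illustration.
        return 0.8, f"Output matched expectations for {test_case!r}"

    # Proposed change: measure() surfaces both values, like evaluate().
    def measure(self, test_case):
        self.score, self.reason = self.evaluate(test_case)
        return self.score, self.reason

# Batch evaluation then collapses to a single expression:
scores_and_reasons = [ToyGEval().measure(c) for c in ["case-1", "case-2"]]
```

The attributes `metric.score` and `metric.reason` would still be set, so existing code that reads them would keep working; only the return value changes.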
Would love to hear your thoughts on this.
Mujae changed the title to "Feature Request: Return both score and reason from G-eval metric.measure()" on Feb 8, 2025, then reverted it to "Feature Request: Return both score and reason from metric.measure()" the same day.
Hey @Mujae, thanks for the suggestion! The thing with reason is that it is not required when include_reason is False. Those who need both the score and the reason can simply access metric.score and metric.reason.
Thank you for your response!
Yes, that's correct. However, G-eval always returns a reason, so it has no include_reason option. Given that, if measure() itself returned the reason, there would be no need to access metric.reason separately, which is why I made the suggestion.