Hello, I'm attempting to apply the xmoverscore metric to a novel dataset.
I ran main.py, and it generated the following files, which I organized into results directories.
Are the Pearson correlation coefficients reported in these files? I can't seem to find them.
Also, are the sample-level xmoverscore values reported in the HUMAN column of the DA-seglevel.csv file?
I calculated sample-level xmoverscores on the novel dataset I'm working with, and the scores seem to be roughly in the range [-0.2, -0.1].
Does that seem like a valid range for the scores, or is it likely there is an error in the way I am calculating the scores?
Thank you.
Are the Pearson correlation coefficients reported in these files?
No, these are printed to the console.
Are the sample-level xmoverscore values reported in the HUMAN column of the DA-seglevel.csv file?
No, the HUMAN column contains human judgments of translation quality, normalized as z-scores, as provided by the WMT workshops.
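For a novel dataset, the correlation between metric scores and human judgments can also be computed by hand. The following is a minimal sketch in plain Python, not the code in main.py; the `pearson` helper and the two lists of values are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: z-normalized human judgments and per-sample metric scores
human = [-1.2, -0.3, 0.1, 0.8, 1.5]
metric = [-0.20, -0.17, -0.15, -0.12, -0.10]

r = pearson(metric, human)
print(f"Pearson r = {r:.3f}")
```

Pearson correlation is unaffected by shifting or positively rescaling either list, so it does not matter for this computation whether the metric scores are negative or normalized.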
Does that seem like a valid range for the scores?
The xmoverscore metric produces scores less than or equal to 1. Details below:
xmoverscore = 1 - EMD (Earth Mover's Distance). EMD is non-negative, so the raw score is at most 1 and becomes negative when EMD exceeds 1. I have normalized the metric scores into the interval [0, 1], with 1 as the perfect score. See 2026e18
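To make the arithmetic concrete, here is a small sketch of how 1 - EMD yields negative values and how a set of scores could be mapped into [0, 1]. The function names and EMD values are hypothetical, and the min-max rescaling is only an assumed stand-in: the normalization actually used in the referenced commit may differ.

```python
def score_from_emd(emd):
    """Raw score as described above: 1 - EMD, where EMD >= 0."""
    if emd < 0:
        raise ValueError("EMD must be non-negative")
    return 1.0 - emd


def min_max_normalize(scores):
    """Assumed normalization (not necessarily the repository's):
    rescale scores linearly so the lowest maps to 0 and the highest to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("min-max normalization is undefined for constant scores")
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical EMD values slightly above 1 give scores between -0.2 and -0.1
emds = [1.10, 1.15, 1.20]
raw = [score_from_emd(e) for e in emds]
normalized = min_max_normalize(raw)

print(raw)
print(normalized)
```

Under this assumed scheme the normalized values depend on the other scores in the batch, so they are only comparable within one dataset.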