
Replicating Your Results #2

Open
rajicon opened this issue Feb 21, 2020 · 2 comments

Comments

rajicon commented Feb 21, 2020

Hi,

I am currently trying to replicate the results in the paper, but my numbers seem to be off. Is there any hyperparameter difference from the defaults in code?

Here is the line I trained the model with:
python3 HiCE-master/train.py --cuda 0 --use_morph --adapt --w2v_dir "herbelot_and_baroni_w2v/wiki_all.sent.split.model" --corpus_dir "HiCE-master/data/wikitext-103/" --save_dir "/my_directory/" --chimera_dir "HiCE-master/data/chimeras/"

While the Upper Bound seems to be correct, both the Baseline: Additive and HiCE results seem to be off (HiCE less so, but it still is). Do you have any insight as to why? Here are my results:

Baseline: Additive
0.2871964621624245
0.30641486173408883
0.30354669461624095

Upper Bound: Ground Truth Embedding
0.41732506349077386
0.4366845012259888
0.4409890833288333

Test with 2 shot: Cosine: 0.4728; Spearman: 0.3509

Test with 4 shot: Cosine: 0.5198; Spearman: 0.3842

Test with 6 shot: Cosine: 0.5483; Spearman: 0.4007

acbull (Owner) commented Feb 27, 2020

Hi rajicon:

For the additive method, the numbers reported in our paper are taken directly from the original paper. So yes, the experimental results from this code are lower.

For HiCE, the results in Table 1 are obtained by fixing the K-shot number during training. For example, to get the 2-shot results, always sample K=2 contexts. I thought that enumerating different values of K would make the model more general, but the performance is actually lower. You can try fixing K and see the results.
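The fixed-K versus variable-K training setups described above can be sketched roughly as below. This is a minimal illustration, not the actual HiCE data loader: `sample_contexts`, `make_episode`, the sentence pool, and the K choices are all hypothetical names for this example.

```python
import random

def sample_contexts(sentence_pool, k):
    """Draw k context sentences for one training episode."""
    return random.sample(sentence_pool, k)

def make_episode(sentence_pool, fixed_k=None, k_choices=(2, 4, 6)):
    """Build one K-shot training episode.

    fixed_k=None -> sample K from k_choices each episode (the
                    "enumerate different K" setup, which reportedly
                    generalizes but scores lower at a given K).
    fixed_k=2    -> always train with the same K used at test time,
                    matching how the Table 1 numbers were obtained.
    """
    k = fixed_k if fixed_k is not None else random.choice(k_choices)
    return sample_contexts(sentence_pool, k)

pool = [f"sentence_{i}" for i in range(20)]
episode = make_episode(pool, fixed_k=2)  # matches a 2-shot evaluation
assert len(episode) == 2
```

Under this reading, the mismatch rajicon observed would come from training with variable K but evaluating at a fixed K, so aligning the two (e.g. `fixed_k=2` for the 2-shot test) should reproduce the paper's setting.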

Thanks

rajicon (Author) commented Sep 11, 2020

Hi,

Is there any intuition as to why fixing the K-shot number has an impact? Are there any strategies to mitigate this and make the model more robust to different context sizes?
