Hi,

I am currently trying to replicate the results in the paper, but my numbers seem to be off. Are there any hyperparameter differences from the defaults in the code?

Here is the command I trained the model with:

python3 HiCE-master/train.py --cuda 0 --use_morph --adapt --w2v_dir "herbelot_and_baroni_w2v/wiki_all.sent.split.model" --corpus_dir "HiCE-master/data/wikitext-103/" --save_dir "/my_directory/" --chimera_dir "HiCE-master/data/chimeras/"

While the Upper Bound seems to be correct, both the Baseline: Additive and HiCE numbers seem to be off (HiCE less so, but it still is). Do you have any insight as to why? Here are my results:

Baseline: Additive
0.2871964621624245
0.30641486173408883
0.30354669461624095

Upper Bound: Ground Truth Embedding
0.41732506349077386
0.4366845012259888
0.4409890833288333

Test with 2 shot: Cosine: 0.4728; Spearman: 0.3509
Test with 4 shot: Cosine: 0.5198; Spearman: 0.3842
Test with 6 shot: Cosine: 0.5483; Spearman: 0.4007

For the additive method, the results shown in the paper are taken directly from the original paper. But yes, the experimental results are lower.

For HiCE, the results in Table 1 are obtained by fixing the K-shot number: for example, always choosing K=2 to get the results for the 2-shot setting. I thought that enumerating different values of K would make the model more general, but the performance is actually lower. You can try fixing K and see the results.

Is there any intuition as to why fixing the K-shot number has an impact? Are there any strategies to mitigate this and make the model more robust to different context sizes?
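To make the fixed-K protocol concrete, here is a minimal, hypothetical sketch of the two episode-building strategies being discussed. This is not HiCE's actual API; `make_episodes` and its parameters are illustrative names, and the toy `contexts` list stands in for a chimera word's real context sentences.

```python
import random

def make_episodes(contexts, k, fixed_k=True, n_episodes=100, seed=0):
    """Build K-shot evaluation episodes from a word's context sentences.

    fixed_k=True  -> every episode uses exactly k contexts
                     (the protocol behind Table 1's K-shot numbers)
    fixed_k=False -> each episode samples a size uniformly from 2..k,
                     mixing context sizes, which reportedly lowers
                     the measured scores
    """
    rng = random.Random(seed)
    episodes = []
    for _ in range(n_episodes):
        size = k if fixed_k else rng.randint(2, k)
        episodes.append(rng.sample(contexts, size))
    return episodes

# Toy usage: 10 dummy context sentences for one chimera word.
contexts = [f"sentence_{i}" for i in range(10)]

fixed = make_episodes(contexts, k=4, fixed_k=True)
mixed = make_episodes(contexts, k=4, fixed_k=False)

# Fixed protocol: every episode is exactly 4-shot.
assert all(len(e) == 4 for e in fixed)
# Mixed protocol: episode sizes vary between 2 and 4.
assert all(2 <= len(e) <= 4 for e in mixed)
```

One plausible design choice this illustrates: if the model's attention aggregation is sensitive to the number of contexts it sees, evaluating at a single fixed K matches the training distribution more closely than mixing sizes, which may explain the gap described above.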