How to restore a model? #4
Note that the model_reader also loads the word2index mapping, which is essential for applying the model.
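For reference, a minimal sketch of applying a saved model through the model reader, assuming the ModelReader interface used elsewhere in the repository (the params file path is a hypothetical placeholder):

```python
# Minimal sketch: load a trained context2vec model via the model reader.
# The params file path below is a hypothetical placeholder.
from context2vec.common.model_reader import ModelReader

model_param_file = 'models/ukwac.model.params'  # hypothetical path

model_reader = ModelReader(model_param_file)
w = model_reader.w                    # target word embeddings
word2index = model_reader.word2index  # word -> index mapping needed to apply the model
index2word = model_reader.index2word  # reverse mapping
model = model_reader.model            # the loaded BiLstmContext model
```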
This is where I made my modification.
It seems that the …
This seems ok. The only thing is that the model_reader doesn't bother to initialize the loss_func with the correct values, because it's currently not supported in train mode. If your purpose is to further train a model that you load, then you should make sure you initialize the model's loss_func correctly with the true cs values.
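To illustrate, a rough sketch of initializing the loss function with the true cs values when resuming, assuming cs is the per-word count list passed to Chainer's negative sampling and that the variable names (reader, target_word_units) match those in train_context2vec.py:

```python
# Sketch (assumptions noted in comments): rebuild the negative-sampling loss
# with the true per-word counts before continuing training.
import chainer.links as L

NEGATIVE_SAMPLING_NUM = 10  # hypothetical; use the value you trained with

# reader and target_word_units come from the surrounding training script;
# cs is assumed to be the corpus word counts indexed by word id.
cs = [reader.trimmed_word2count[w] for w in range(len(reader.trimmed_word2count))]

# Chainer's NegativeSampling takes (in_size, counts, sample_size).
loss_func = L.NegativeSampling(target_word_units, cs, NEGATIVE_SAMPLING_NUM)
```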
I am a little confused. Do the cs values change after each epoch? What does cs stand for?
I can use …
Just to make sure, could you please describe what your end-goal here is? Are you trying to load one of our existing models and continue training it for more epochs? Using which corpus?
My goal is to train a ukwac model like yours, with different parameters. I ran for one epoch and then, for some reason, had to stop. Now I want to load the model and continue training.
Ok. So as long as you are using the exact same corpus that you used in the first epoch, then your code should work fine (since reader.word2index would be identical to the one used in the first epoch). And yes, there's no need to load the word embedding targets.
With more epochs, the loss began to increase; did this happen to you? The WSD accuracy also became lower.
As you can see from the code, I never continued training an existing model. In the case of UkWac, I trained for one epoch, then later I trained for 3 epochs from scratch, and the performance of the latter model was better. I wouldn't expect the train loss to increase in your case, but maybe there's something I'm missing. One thing that does come to mind is that to do this properly you should also save (and later restore) the Adam optimizer state along with the model.
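For example, a sketch of checkpointing both the model and the Adam optimizer state with Chainer's serializers (file names are hypothetical, and the model/optimizer setup is assumed to mirror train_context2vec.py):

```python
# Sketch: save and later restore both the model and the optimizer state,
# so Adam's moment estimates survive the restart. File names are hypothetical.
import chainer.serializers as S
from chainer import optimizers

optimizer = optimizers.Adam()
optimizer.setup(model)  # model built as in train_context2vec.py

# ...after finishing an epoch:
S.save_npz('context2vec.model', model)
S.save_npz('context2vec.optimizer', optimizer)

# ...when resuming: rebuild the model and optimizer the same way, then:
S.load_npz('context2vec.model', model)
S.load_npz('context2vec.optimizer', optimizer)
```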
Is it ok if I add
S.load_npz(model_file, model)
after
model = BiLstmContext(args.deep, args.gpu, reader.word2index, context_word_units, lstm_hidden_units, target_word_units, loss_func, True, args.dropout)
in train_context2vec.py, without using common.model_reader? Thank you very much.
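Putting the proposed change together, a sketch of what the resume logic in train_context2vec.py could look like (the --resume flag is a hypothetical addition; the constructor call follows the line quoted above):

```python
# Sketch of the proposed modification; the --resume flag is hypothetical.
import chainer.serializers as S

model = BiLstmContext(args.deep, args.gpu, reader.word2index, context_word_units,
                      lstm_hidden_units, target_word_units, loss_func, True,
                      args.dropout)

if args.resume:  # hypothetical flag holding the path of the saved .npz file
    S.load_npz(args.resume, model)
    # As discussed above, restoring the Adam optimizer state as well would be
    # needed to properly continue training.
```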