Tested with Python 2.7.3 and TensorFlow 1.1.
It trains a global attentional NMT model following Effective Approaches to Attention-based Neural Machine Translation, and decodes with beam search using the length normalization from Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation.
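For reference, the length normalization in that paper ranks hypotheses by their log-probability divided by a length penalty. A minimal sketch of that scoring (the exact value of `alpha` used here, and whether it is configurable, is an assumption):

```python
def length_penalty(length, alpha=0.6):
    """GNMT-style length penalty: lp(Y) = ((5 + |Y|)^alpha) / ((5 + 1)^alpha)."""
    return ((5.0 + length) ** alpha) / ((5.0 + 1.0) ** alpha)


def normalized_score(log_prob, length, alpha=0.6):
    """Score used to compare beam-search hypotheses of different lengths."""
    return log_prob / length_penalty(length, alpha)
```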
It trains with a negative log-likelihood objective and periodically evaluates performance (BLEU score) on the dev set. It can save a number of best models, as specified in the configuration function; I always save one best model (highest BLEU score on the dev set).
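As a rough sketch of that objective (not this repo's exact code; tensor names and shapes are illustrative), the loss is the masked cross-entropy of the reference tokens under the model:

```python
import tensorflow as tf

def nll_loss(logits, targets, mask):
    """Negative log-likelihood per token.

    logits:  [batch, time, vocab] float scores from the decoder.
    targets: [batch, time] int reference token ids.
    mask:    [batch, time] float, 1.0 for real tokens and 0.0 for padding.
    """
    xent = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=targets, logits=logits)
    return tf.reduce_sum(xent * mask) / tf.reduce_sum(mask)
```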
- Put your data, including `{train,dev,test}.{source,target}`, into a folder, let's call it `s2t`, in the `nmt/data` directory
- Write a corresponding configuration function in `configurations.py`, let's call it `s2t_config`
- To train, run `python -m nmt --proto s2t_config`
- To translate (with unk replacement) a file with a saved model, run `python -m nmt --mode translate --unk-repl --proto s2t_config --model-file nmt/saved_models/your_saved_model_dir_name/your_model_name-best-bleu-score.cpkt --input-file path_to_file_to_translate`
An example configuration function `test_en2vi` is provided.
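For a new language pair, you would add a function along these lines. The keys below are purely illustrative assumptions; check `test_en2vi` in `configurations.py` for the actual options this codebase expects:

```python
def s2t_config():
    config = {}
    # Data: expects nmt/data/s2t/{train,dev,test}.{source,target}
    config['model_name'] = 's2t'
    config['data_dir'] = 'nmt/data/s2t'
    config['src_lang'] = 'source'
    config['trg_lang'] = 'target'
    # Hypothetical hyperparameters -- the real keys/values live in test_en2vi.
    config['beam_size'] = 12
    config['n_best_models'] = 1  # number of best-BLEU checkpoints to keep
    return config
```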
There's a `reload` option. If it is set to `True` and the code sees a corresponding checkpoint, it will automatically start training from that checkpoint.
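In TensorFlow 1.x terms, that behavior corresponds roughly to the following sketch (not this repo's exact code; the checkpoint directory and the `reload_model` flag are illustrative):

```python
import tensorflow as tf

reload_model = True  # stands in for the `reload` option
saver = tf.train.Saver(max_to_keep=1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    ckpt = tf.train.latest_checkpoint('nmt/saved_models/s2t')  # path is illustrative
    if reload_model and ckpt is not None:
        # Resume training from the saved checkpoint instead of starting fresh.
        saver.restore(sess, ckpt)
```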
For example, if we don't want to train the output embedding, which we can access from our model as `model.softmax.logit_layer.W`, we fix it this way:
`python -m nmt --proto s2t --fixed-var-list softmax.logit_layer.W`
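Under the hood, fixing a variable typically means excluding it from the variable list handed to the optimizer. A hedged sketch of that idea (not necessarily how this codebase implements it; the optimizer choice is illustrative):

```python
import tensorflow as tf

def build_train_op(loss, fixed_names, learning_rate=1e-4):
    """Build a train op that skips any variable whose name matches an entry in fixed_names."""
    train_vars = [v for v in tf.trainable_variables()
                  if not any(name in v.name for name in fixed_names)]
    optimizer = tf.train.AdamOptimizer(learning_rate)
    return optimizer.minimize(loss, var_list=train_vars)

# e.g. build_train_op(model_loss, ['softmax.logit_layer.W'])
```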
I was inspired by code from many examples out there that I can't begin to list. A lot of code is adapted from the Blocks examples and DL4MT, and `multi-bleu.perl` is taken from Moses.
We use this code for the paper Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation, which will appear at IJCNLP'17. We use the code at subword-nmt for BPE processing, as explained in the paper.