Implementation of the paper "Neural Machine Translation by Jointly Learning to Align and Translate" (https://arxiv.org/abs/1409.0473v7).
Try installing using install.bat; otherwise, follow the general procedure for conda below (the same steps work for pyenv, venv, etc.).
conda create -n mlaproject python=3.10 -y
conda activate mlaproject
pip install --upgrade -r requirements.txt
pip install --upgrade -e ./src
pytest
python
import torch
torch.cuda.is_available()
The last command should return True if a CUDA-enabled GPU is available.
You can test the code using the dummy training and testing example provided in main.ipynb.
To sync your local copy with main while keeping uncommitted changes:
git stash
git pull origin main
git stash pop
To work on a branch and push your changes:
git checkout <branch_name>
git add .
git commit -m "message"
git push origin <branch_name>
To update your branch with the latest main before opening a pull request:
git checkout main
git pull origin main
git checkout <branch_name>
git merge main
git push origin <branch_name>
Then go to GitHub and create a pull request.
The translation model is a sequence-to-sequence (Seq2Seq) model with an encoder-decoder architecture. This type of model is commonly used for tasks that involve sequence prediction, such as language translation.
- Encoder: The encoder's job is to understand the input sequence and compress that understanding into a context vector, a fixed-length vector representation of the input sequence (a minimal sketch follows this list).
- Decoder: The decoder takes the context vector from the encoder and generates the output sequence (covered in the same sketch below).
- idx_to_word: This method takes a tensor of indices and a vocabulary list and converts the indices back into words. It uses the MosesDetokenizer to convert the list of tokens back into a sentence; the language parameter specifies the language for the detokenizer (see the sketch after this list).
- beam_search: This method performs beam search on the input tensor. Beam search is a search algorithm used in Seq2Seq models to improve the quality of output sequences: it maintains a set (or "beam") of the most promising sequences at each step and extends these sequences at the next step, which helps avoid "search errors", where the model selects a suboptimal sequence early on and is unable to recover (see the sketch after this list).
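For orientation, here is a minimal PyTorch sketch of how a Bahdanau-style encoder, additive attention, and one decoder step fit together. This is not the repository's exact code: the class names, dimensions, use of a GRU, and batch-first layout are illustrative assumptions.

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU: each position gets an "annotation" covering both directions.
        self.rnn = nn.GRU(emb_dim, hidden_dim, bidirectional=True, batch_first=True)

    def forward(self, src):                             # src: (batch, src_len)
        annotations, _ = self.rnn(self.embedding(src))  # (batch, src_len, 2*hidden_dim)
        return annotations

class Attention(nn.Module):
    # Additive (Bahdanau) attention: scores every annotation against the decoder state.
    def __init__(self, hidden_dim):
        super().__init__()
        self.attn = nn.Linear(hidden_dim * 3, hidden_dim)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, dec_hidden, annotations):
        # dec_hidden: (batch, hidden_dim); annotations: (batch, src_len, 2*hidden_dim)
        expanded = dec_hidden.unsqueeze(1).expand(-1, annotations.size(1), -1)
        energy = torch.tanh(self.attn(torch.cat((expanded, annotations), dim=2)))
        weights = torch.softmax(self.v(energy).squeeze(2), dim=1)  # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), annotations)     # (batch, 1, 2*hidden_dim)
        return context.squeeze(1), weights

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.attention = Attention(hidden_dim)
        self.rnn = nn.GRU(emb_dim + hidden_dim * 2, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token, dec_hidden, annotations):
        # token: (batch, 1) previous target token; one decoding step per call.
        context, _ = self.attention(dec_hidden, annotations)
        rnn_input = torch.cat((self.embedding(token), context.unsqueeze(1)), dim=2)
        output, hidden = self.rnn(rnn_input, dec_hidden.unsqueeze(0))
        return self.out(output.squeeze(1)), hidden.squeeze(0)  # logits: (batch, vocab_size)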
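A hedged sketch of what idx_to_word might look like, assuming the vocabulary is an index-to-token list and the special-token names below are as assumed; only MosesDetokenizer (from the sacremoses package) is taken from the description above.

from sacremoses import MosesDetokenizer

def idx_to_word(indices, vocab, lang="en"):
    # Map indices back to tokens, dropping special markers (names assumed here).
    specials = {"<pad>", "<sos>", "<eos>"}
    tokens = [vocab[i] for i in indices.tolist() if vocab[i] not in specials]
    # The detokenizer rejoins the tokens into a natural sentence for the given language.
    return MosesDetokenizer(lang=lang).detokenize(tokens)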
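And a minimal sketch of beam search itself. The step_fn callable, the special-token indices, and the default beam width are assumptions; the repository's beam_search operates on decoder logits, which here corresponds to step_fn returning next-token log-probabilities.

import torch

def beam_search(step_fn, sos_idx, eos_idx, beam_width=5, max_len=50):
    # Each hypothesis is (cumulative log-probability, token indices so far).
    beams = [(0.0, [sos_idx])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == eos_idx:           # finished hypotheses are carried over unchanged
                candidates.append((score, seq))
                continue
            log_probs = step_fn(seq)         # (vocab_size,) log-probs for the next token
            top = torch.topk(log_probs, beam_width)
            for lp, idx in zip(top.values.tolist(), top.indices.tolist()):
                candidates.append((score + lp, seq + [idx]))
        # Keep only the beam_width most promising sequences for the next step.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams[0][1]                       # indices of the most probable sequence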
The model takes a batch of source sentences and, for each sentence, generates a prediction and a translation. The source sentences, predictions, and translations are all converted back into words using the idx_to_word method and returned as lists of sentences. The beam_search method generates the predictions: it takes the logits from the decoder and returns the indices of the most probable sequences. A combined sketch of this flow follows below.
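Putting the pieces together, the batch-level flow could look roughly like this. It reuses the idx_to_word and beam_search sketches above; model.encoder, model.decode_step, and the language codes are hypothetical names, not the repository's actual API.

import torch

def translate_batch(model, src_batch, src_vocab, tgt_vocab, sos_idx, eos_idx):
    sources, predictions, translations = [], [], []
    for src in src_batch:                              # src: 1-D tensor of source indices
        sources.append(idx_to_word(src, src_vocab, lang="de"))
        annotations = model.encoder(src.unsqueeze(0))  # encode one sentence (batch of 1)

        def step_fn(seq):
            # One decoder step on the partial sequence; returns next-token log-probs.
            # model.decode_step is a hypothetical wrapper around the decoder.
            logits = model.decode_step(torch.tensor([seq]), annotations)
            return torch.log_softmax(logits.squeeze(0), dim=-1)

        pred = beam_search(step_fn, sos_idx, eos_idx)
        predictions.append(pred)
        translations.append(idx_to_word(torch.tensor(pred), tgt_vocab, lang="en"))
    return sources, predictions, translations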
The model uses the PyTorch library for tensor operations and model training.