A Generative word-level chatbot with PyTorch trained on Microsoft's MetaLWOz data, hacked in a few days.
Code is documented along with really illustrative comments in the pytorch-seq2seq-chatbook.ipynb
notebook.
You can view and run it step-by-step. Care, training time depends on the size of your data and your neural net/hardware configuration.
- Python 3
- PyTorch
- ...and a really expensive GPU
Before PyTorch, I also tried a char-level Keras model and a word-level TensorFlow model on the unfiltered dataset, which struggled a lot with memory usage causing multiple out-of-memory (OOM) problems, even on a Tesla K80 GPU.
You can find them on the appropriate different-versions
folder.
Some workaround ideas:
- removing sentences with a high number of words
- Trying different amounts of hidden dimensions, units, etc
- Importing pre-trained embeddings like GloVe
- Use SOTA Transformers like BERT
- Use early stopping on a validation set
- Add chit-chat dialogues to the dataset