This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

v0.3.3

@leezu leezu released this 13 Jun 05:26
· 676 commits to master since this release

GluonNLP v0.3 contains many exciting new features.
(This release depends on MXNet 1.3.0b20180725.)

Models

  • Language Models
  • Machine Translation
    • The Transformer model from Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems, 2017, is added as part of the gluonnlp nmt scripts (#133)
  • Word embeddings
    • Trainable word embedding models are introduced as part of gluonnlp.model.train (#136)
      • Word2Vec by Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111-3119).
      • FastText models by Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5, 135-146.
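The fastText models above build a word's vector from its character n-grams, which is what later enables embeddings for unknown words. The following is a minimal, self-contained sketch of that idea; all names, dimensions, and the hashing scheme here are illustrative and are not the gluonnlp API.

```python
# Sketch of the fastText idea (Bojanowski et al., 2017): a word vector is the
# sum of vectors for its character n-grams, so a vector can be composed even
# for words never seen during training. Sizes are illustrative.
import random
import zlib

DIM = 8          # embedding dimension (illustrative)
BUCKETS = 1000   # number of n-gram hash buckets (illustrative)

rng = random.Random(0)
# One vector per hash bucket, standing in for fastText's subword table.
ngram_table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(BUCKETS)]

def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams of a word wrapped in boundary markers."""
    w = '<' + word + '>'
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    """Sum the bucket vectors of all character n-grams of the word."""
    vec = [0.0] * DIM
    for ng in char_ngrams(word):
        bucket = zlib.crc32(ng.encode('utf-8')) % BUCKETS
        for d in range(DIM):
            vec[d] += ngram_table[bucket][d]
    return vec
```

Because the lookup depends only on character n-grams, `word_vector` returns a non-trivial vector for any string, including out-of-vocabulary words.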

New Datasets

API changes

  • The download directory for datasets and other artifacts can now be specified
    via the MXNET_HOME environment variable. (#106)
  • The TokenEmbedding class now also exposes the inverse vocabulary (#123)
  • SortedSampler now supports use_average_length option (#135)
  • Add more strategies for bucket creation (#145)
  • Add a tokenizer option to the BLEU computation (#154)
  • Add Convolutional Encoder and Highway Layer (#129) (#186)
  • Add plain text versions of translation data (#158)
  • Use Sherlock Holmes dataset instead of PTB for language model notebook (#174)
  • Add classes JiebaTokenizer and NLTKStanfordSegmenter for Chinese word segmentation (#164)
  • Allow toggling output and prompt in documentation website (#184)
  • Add shape assertion statements for better user experience to some attention cells (#201)
  • Add support for computation of word embeddings for unknown words in TokenEmbedding class (#185)
  • Distribute subword vectors for pretrained fastText embeddings, enabling embeddings for unknown words (#185)
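Per the first item above, the download directory is read from the MXNET_HOME environment variable. A minimal shell sketch; the path is illustrative:

```shell
# Redirect MXNet/GluonNLP downloads (datasets, pretrained artifacts)
# to a custom location; the path below is illustrative.
export MXNET_HOME=/data/mxnet-cache
echo "$MXNET_HOME"   # subsequent downloads land under this directory
```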
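The bucket-creation item above refers to grouping sequences of similar length so each batch pads to roughly the same length. The sketch below illustrates one such strategy (fixed-width length buckets) in plain Python; the function name and parameters are illustrative, not the gluonnlp sampler API.

```python
# Minimal sketch of length-based bucketing: assign each sample index to a
# bucket keyed by length // bucket_width, so samples within a bucket need
# similar amounts of padding. Illustrative only, not the gluonnlp API.
from collections import defaultdict

def fixed_width_buckets(lengths, bucket_width=5):
    """Group sample indices into fixed-width length buckets."""
    buckets = defaultdict(list)
    for idx, length in enumerate(lengths):
        buckets[length // bucket_width].append(idx)
    return dict(buckets)

lengths = [3, 4, 12, 13, 27, 5, 14]
buckets = fixed_width_buckets(lengths)
# lengths 3 and 4 share bucket 0; 12, 13, 14 share bucket 2; 5 and 27
# land in buckets 1 and 5 respectively
```

Other strategies vary how bucket boundaries are chosen (e.g. by quantiles of the length distribution rather than fixed widths), trading bucket balance against padding waste.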

Fixes & Small Changes

  • Fixed bptt_batchify sometimes returning an invalid last batch (#120)
  • Fixed wrong PPL calculation in word language model script for multi-GPU (#150)
  • Fix compound-word splitting and WMT16 results (#151)
  • Adapt pretrained word embeddings example notebook for nd.topk change in mxnet 1.3 (#153)
  • Fix beam search script (#175)
  • Fix small bugs in parser (#183)
  • TokenEmbedding: Skip lines with invalid bytes instead of crashing (#188)
  • Fix excessive memory use in TokenEmbedding serialization/deserialization when some tokens are very large (e.g., 50k characters) (#187)
  • Remove duplicates in WordSim353 when combining segments (#192)

See all commits