v0.5.0
eric-haibin-lin released this on 27 Nov 06:52 · 539 commits to master since this release
Highlights
- Featured at AWS re:Invent 2018
Models
- BERT
  - The Bidirectional Encoder Representations from Transformers model, as introduced by Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018) (#409). Model parameters are converted from the original Google Research checkpoints, including:
    - BERT BASE model trained on:
      - Book Corpus & English Wikipedia (cased)
      - Book Corpus & English Wikipedia (uncased)
      - multilingual Wikipedia (uncased)
    - BERT LARGE model trained on Book Corpus & English Wikipedia (uncased)
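For quick reference, here is a minimal sketch of loading one of the converted BERT checkpoints through the model.get_model API listed under API Updates below. The model name 'bert_12_768_12' and the dataset name 'book_corpus_wiki_en_uncased' are assumptions based on this release's naming and should be verified against the v0.5.0 API docs.

```python
import mxnet as mx
import gluonnlp as nlp

# Assumed identifiers: 'bert_12_768_12' (BERT BASE) with the uncased
# Book Corpus & English Wikipedia checkpoint; verify against the docs.
bert, vocab = nlp.model.get_model('bert_12_768_12',
                                  dataset_name='book_corpus_wiki_en_uncased',
                                  pretrained=True,
                                  ctx=mx.cpu())

print(bert)        # the Gluon block holding the converted parameters
print(len(vocab))  # the matching wordpiece vocabulary
```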
- ELMo
  - The Embeddings from Language Models, as introduced by Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018) (#227, #428). Model parameters are converted from the original AllenNLP checkpoints, including the small, medium, and original models trained on the 1 billion word dataset, and the original model trained on 5.5B tokens consisting of Wikipedia & monolingual news crawl data from WMT 2008-2012.
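For quick reference, a minimal sketch of loading one of the converted ELMo bidirectional language models through model.get_model. The model name 'elmo_2x1024_128_2048cnn_1xhighway' (the small configuration) and the dataset name 'gbw' are assumptions based on this release's naming; check the API docs for the exact identifiers of the other configurations.

```python
import mxnet as mx
import gluonnlp as nlp

# Assumed identifiers for the small ELMo bi-LM trained on the 1 billion word
# dataset; the medium, original, and 5.5B-token models use different names.
elmo_bilm, vocab = nlp.model.get_model('elmo_2x1024_128_2048cnn_1xhighway',
                                       dataset_name='gbw',
                                       pretrained=True,
                                       ctx=mx.cpu())
print(elmo_bilm)
```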
- Word Embedding
  - The GloVe model, as introduced by Pennington, Jeffrey, Richard Socher, and Christopher Manning. "GloVe: Global vectors for word representation." Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014 (#359).
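The item above refers to the GloVe training model added in #359; as context, the sketch below shows the pre-existing TokenEmbedding path for attaching published GloVe vectors to a vocabulary. The source name 'glove.6B.50d' is one of the standard published files; nlp.embedding.list_sources('glove') lists the rest.

```python
import gluonnlp as nlp

# Attach published GloVe vectors to a small vocabulary via TokenEmbedding.
glove = nlp.embedding.create('glove', source='glove.6B.50d')
vocab = nlp.Vocab(nlp.data.count_tokens(['hello', 'world']))
vocab.set_embedding(glove)

print(vocab.embedding['hello'].shape)  # (50,) for the 50-dimensional vectors
```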
- Natural Language Inference
  - The Decomposable Attention Model, as introduced by Parikh, Ankur P., et al. "A decomposable attention model for natural language inference." arXiv preprint arXiv:1606.01933 (2016) (#404). On the SNLI test set, it achieves 84.6% accuracy (without intra-sentence attention) and 84.4% accuracy (with intra-sentence attention). Thank you @linmx0130 @hhexiy!
- Dependency Parsing
  - The Deep Biaffine Attention Dependency Parser, as introduced by Dozat, Timothy, and Christopher D. Manning. "Deep biaffine attention for neural dependency parsing." arXiv preprint arXiv:1611.01734 (2016) (#408). It achieves 96% UAS on the Penn Treebank dataset. Thank you @hankcs!
- Text Classification
  - The TextCNN model, as introduced by Kim, Yoon. "Convolutional neural networks for sentence classification." arXiv preprint arXiv:1408.5882 (2014) (#391). Thank you @xiaotinghe!
New Tutorials
- ELMo
  - A tutorial on generating contextualized representations with the pre-trained ELMo model, as introduced by Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018) (#227, #428).
- BERT
  - A tutorial on fine-tuning the BERT model for sentence pair classification, as introduced by Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018) (#437).
New Datasets
- Sentiment Analysis
  - MR, a movie-review data set of 10,662 sentences labeled with respect to their overall sentiment polarity (positive or negative) (#391)
  - SST_1, an extension of the MR data set with fine-grained sentiment labels (#391)
  - SST_2, an extension of the MR data set with binary sentiment polarity labels (#391)
  - SUBJ, a subjectivity data set of 10,000 sentences labeled as subjective or objective (#391)
  - TREC, a question classification data set in which questions are labeled with respect to their question type (#391)
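A hypothetical usage sketch for the new corpora, assuming they are exposed as dataset classes under nlp.data named after the data sets (for example nlp.data.MR); the exact class names, constructor arguments, and item layout may differ, so check the gluonnlp.data API docs.

```python
import gluonnlp as nlp

# Hypothetical: load the MR movie-review corpus and inspect one example.
# Items are assumed to be (text, label) pairs.
mr = nlp.data.MR()
text, label = mr[0]
print(label, text)
```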
API Updates
- Changed Vocab constructor from staticmethod to classmethod to handle inheritance (#386)
- Added Transformer Encoder APIs (#409)
- Added pre-trained ELMo model to model.get_model API (#227)
- Added pre-trained BERT model to model.get_model API (#409)
- Added unknown_lookup setter to TokenEmbedding (#429)
- Added dtype support to EmbeddingCenterContextBatchify (#416)
- Propagated exceptions from PrefetchingStream (#406)
- Added SentencePiece tokenizer and detokenizer (#380); a minimal usage sketch follows this list
- Added CSR format for variable length data in embedding training (#384)
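A minimal sketch of the SentencePiece wrappers from #380, assuming they are exposed as nlp.data.SentencepieceTokenizer and nlp.data.SentencepieceDetokenizer and that the sentencepiece package is installed; 'spm.model' is a placeholder for a trained SentencePiece model file.

```python
import gluonnlp as nlp

# 'spm.model' is a placeholder path to a trained SentencePiece model file.
tokenizer = nlp.data.SentencepieceTokenizer('spm.model')
detokenizer = nlp.data.SentencepieceDetokenizer('spm.model')

pieces = tokenizer('GluonNLP is a toolkit for natural language processing.')
text = detokenizer(pieces)
print(pieces)  # list of subword pieces
print(text)    # reconstructed sentence
```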
Fixes & Small Changes
- Included output of nlp.embedding.list_sources() in API docs (#421)
- Supported symlinks in examples and scripts (#403)
- Fixed weight tying in GNMT and Transformer (#413)
- Simplified transformer notebook (#400)
- Fixed LazyTransformDataStream prefetching (#397)
- Adopted src/gluonnlp folder layout (#390)
- Fixed text8 archive file name for downloads from S3 (#388) Thanks @bkktimber!
- Fixed perplexity reporting for multi-GPU training in the language model notebook (#365). Thanks @ThomasDelteil!
- Fixed a spelling mistake in the QA script (#379). Thanks @qyhfbqz!