Starred repositories
Approximate Nearest Neighbor Search for Sparse Data in Python!
A game theoretic approach to explain the output of any machine learning model.
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Text preprocessing, representation and visualization from zero to hero.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
XLNet: Generalized Autoregressive Pretraining for Language Understanding
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
A pyenv plugin to manage virtualenv (a.k.a. python-virtualenv)
Line bot that checks whether a message contains an internet rumor.
GraphQL API server for clients like rumors-site and rumors-line-bot
High level Python client for Elasticsearch
This repository stores slides for a tutorial on variational inference for NLP audiences.
A TensorFlow Implementation of the Transformer: Attention Is All You Need
BERT with SentencePiece for Japanese text.
TensorFlow code and pre-trained models for BERT
🤔 Search & Replace unicode emojis. Supports Unicode 10
TensorFlow tutorials and best practices.
Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Pre-trained word vectors of 30+ languages
TensorFlow implementation of contextualized word representations from bi-directional language models
Unsupervised text tokenizer for Neural Network-based text generation.
Solutions to LeetCode problems; updated daily. Subscribe to my YouTube channel for more.
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)