This repository records my studies and lets me review them from anywhere.
The material is based on [유원준/안상준, "딥 러닝을 이용한 자연어 처리 입문", 2022] and [Saito Goki, "Deep Learning from Scratch"], but does not follow them in every part. Started on 22.09.27.
Siwon Seo, Tokai University
- Tokenization
- Cleaning & Normalization
- Stemming & Lemmatization
- Stopwords
- Integer Encoding
- Padding
- One-Hot Encoding
- Splitting Data
- Statistical Language Model
- N-gram Language Model
- Perplexity (PPL)
- Bag of Words (BoW)
- Document-Term Matrix (DTM)
- TF-IDF (Term Frequency-Inverse Document Frequency)
- Cosine Similarity
- Euclidean distance, Jaccard similarity
- Linear Regression
- Automatic differentiation
- Logistic Regression
- Softmax Regression
- etc.
- Perceptron
- Feed-Forward Neural Network (FFNN)
- Fully-connected layer (FC)
- Activation Function
- Loss function
- Batch Size
- Optimizer
- Back-Propagation
- Overfitting
- Gradient Vanishing & Exploding
- RNN
- LSTM
- GRU
- BiLSTM (BiGRU)
- Sparse Representation
- Dense Representation & Word Embedding
- Word2Vec
- GloVe
- FastText
- ELMo
- CNN
- 1D CNN
- Character Embedding
- Part-of-Speech Tagging
- Named Entity Recognition (NER)
- BIO
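Several of the count-based topics above (Bag of Words, the Document-Term Matrix, cosine similarity) fit in one small example. This is a minimal sketch using only the standard library; the toy sentences are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus.
docs = [
    "I like deep learning",
    "I like natural language processing",
    "deep learning is fun",
]

# Bag of Words ignores word order: build a shared vocabulary first.
vocab = sorted({word for doc in docs for word in doc.lower().split()})

def bow_vector(doc):
    """Count each vocabulary word in the document (one DTM row)."""
    counts = Counter(doc.lower().split())
    return [counts[word] for word in vocab]

dtm = [bow_vector(doc) for doc in docs]  # Document-Term Matrix

def cosine_similarity(a, b):
    """Dot product of the two vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Documents 0 and 2 share the words "deep" and "learning".
print(cosine_similarity(dtm[0], dtm[2]))
```

The same DTM rows can feed TF-IDF weighting or Euclidean/Jaccard comparisons; only the scoring function changes.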
====================================
- Byte Pair Encoding (BPE)
- SentencePiece
- Subword Text Encoder