polinabee/text-mining

text-mining

Classes to explore various text embeddings and learning models

Vectorizer:

  • custom / from-scratch TF-IDF matrix encoding
  • custom weights to improve matching
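As a rough illustration of what a from-scratch TF-IDF encoding with custom weights might look like, here is a minimal sketch. It is not the repository's implementation: the whitespace tokenizer, the plain `log(N / df)` inverse document frequency, and the `term_weights` multiplier dictionary are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; a stand-in for any custom tokenizer.
    return text.lower().split()


def tfidf_matrix(docs, term_weights=None):
    """Return (vocab, matrix) where matrix[i][j] is the TF-IDF of term j in doc i.

    term_weights is a hypothetical dict mapping a term to a multiplier
    (default 1.0), used here to boost or dampen selected terms.
    """
    term_weights = term_weights or {}
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Plain idf = log(N / df); a term present in every document gets 0.
    idf = {t: math.log(n_docs / df[t]) for t in vocab}

    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        row = [
            (counts[t] / total) * idf[t] * term_weights.get(t, 1.0)
            for t in vocab
        ]
        matrix.append(row)
    return vocab, matrix


docs = ["the cat sat", "the dog barked", "the cat purred"]
vocab, matrix = tfidf_matrix(docs, term_weights={"cat": 2.0})
```

Each row of `matrix` encodes one document over the sorted vocabulary. A term such as "the", which occurs in every document, receives a score of zero under this unsmoothed idf, while the weight on "cat" doubles its contribution wherever it appears.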

Text Classifier:

  • logistic regression on TF-IDF features with custom tokenization
  • comparison of training and testing across two different datasets, to see how well the model generalizes
  • testing the model on a custom vocabulary built from the intersection of the two datasets' vocabularies
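The cross-dataset idea above can be sketched as follows, under several assumptions: the regex tokenizer, the toy sentences, and the hand-written gradient-descent logistic regression are all hypothetical, and raw term counts are used in place of TF-IDF weights to keep the example short. The point is only to show features restricted to the shared vocabulary, a model trained on one dataset, and evaluation on the other.

```python
import math
import re


def tokenize(text):
    # Hypothetical custom tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def shared_vocabulary(texts_a, texts_b):
    # Terms that occur in both datasets, in a fixed (sorted) order.
    vocab_a = {t for text in texts_a for t in tokenize(text)}
    vocab_b = {t for text in texts_b for t in tokenize(text)}
    return sorted(vocab_a & vocab_b)


def vectorize(text, vocab):
    # Raw term counts over the fixed vocabulary; TF-IDF could be swapped in.
    tokens = tokenize(text)
    return [float(tokens.count(term)) for term in vocab]


def score(x, w, b):
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logreg(X, y, lr=0.5, epochs=200):
    # Batch gradient descent on the mean logistic loss.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = 1.0 / (1.0 + math.exp(-score(xi, w, b))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(x, w, b):
    # A positive score corresponds to a predicted probability above 0.5.
    return 1 if score(x, w, b) > 0 else 0


# Toy stand-ins for the two datasets (1 = positive, 0 = negative).
train_a = [("great movie loved it", 1), ("terrible movie hated it", 0),
           ("loved the acting great", 1), ("hated the plot terrible", 0)]
test_b = [("great product loved it", 1), ("terrible product hated it", 0)]

vocab = shared_vocabulary([t for t, _ in train_a], [t for t, _ in test_b])
w, b = train_logreg([vectorize(t, vocab) for t, _ in train_a],
                    [y for _, y in train_a])
preds = [predict(vectorize(t, vocab), w, b) for t, _ in test_b]
accuracy = sum(p == y for p, (_, y) in zip(preds, test_b)) / len(test_b)
```

Restricting features to the shared vocabulary means dataset-specific words ("movie", "product") cannot influence the model, which is one way to probe how much of the performance transfers between datasets.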

Continuous Embeddings:

  • comparison of logistic regression on basic TF-IDF embeddings with logistic regression on doc2vec embeddings
  • seeing how well the model generalizes across different train-test splits
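One way to check generalization across different train-test splits is to repeat the evaluation over several seeded random splits and look at the mean and spread of the scores. The harness below is a sketch under assumptions: `evaluate_over_splits`, `split_indices`, and the `fit_and_score` callback are hypothetical names, and the majority-class baseline is only a placeholder where a TF-IDF or doc2vec pipeline would be plugged in.

```python
import random
import statistics


def split_indices(n, test_fraction, seed):
    # Shuffle indices reproducibly, then cut off the test portion.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(round(n * test_fraction)))
    return idx[n_test:], idx[:n_test]


def evaluate_over_splits(texts, labels, fit_and_score,
                         test_fraction=0.25, seeds=(0, 1, 2, 3, 4)):
    """Run fit_and_score(train, test) once per seed; return (mean, std)."""
    scores = []
    for seed in seeds:
        train_idx, test_idx = split_indices(len(texts), test_fraction, seed)
        train = [(texts[i], labels[i]) for i in train_idx]
        test = [(texts[i], labels[i]) for i in test_idx]
        scores.append(fit_and_score(train, test))
    return statistics.mean(scores), statistics.pstdev(scores)


def majority_baseline(train, test):
    # Placeholder model: always predict the most common training label.
    train_labels = [y for _, y in train]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(1 for _, y in test if y == majority) / len(test)


texts = [f"document {i}" for i in range(8)]
labels = [1, 1, 1, 1, 1, 1, 0, 0]
mean_acc, spread = evaluate_over_splits(texts, labels, majority_baseline)
```

Passing two different `fit_and_score` functions (one per embedding) through the same seeds keeps the comparison paired: both models see identical splits, so differences in the mean are not an artifact of the partition.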
