English Corpus Text Visualization using Word2Vec Model

A machine learning approach to visualizing an English text corpus using the Word2Vec model from the Gensim library. The project was done to test the accuracy of the Word2Vec model on an English corpus.

Library requirements:

Sklearn: Used for data preprocessing, model selection, classification, regression, and clustering.
Matplotlib: Used for 2D and 3D plotting, such as histograms and bar charts.
Gensim: Open-source library for text analysis, including Word2Vec and Doc2Vec.
The Melon Honey font was used, and the sample texts were collected from the Internet.

Word2Vec

The Word2Vec model is used for word embedding. Here I used the Gensim library and Matplotlib's pyplot for 2D visualization of the corpus.

Methodology:

First, I took an English corpus and applied a punctuation remover.
Then I split the data and visualized the corpus.
Finally, I repeated the process with a larger corpus.
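The preprocessing steps above might look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual code: the function names `remove_punctuation` and `tokenize_corpus` and the sample text are hypothetical, and lowercasing and splitting on line breaks are assumptions the README does not state.

```python
import string


def remove_punctuation(text):
    """Delete every ASCII punctuation character from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))


def tokenize_corpus(raw_text):
    """Turn raw text into a list of token lists, one per non-empty line."""
    sentences = []
    for line in raw_text.splitlines():
        # Lowercasing is an assumption; the README only mentions punctuation removal.
        tokens = remove_punctuation(line).lower().split()
        if tokens:
            sentences.append(tokens)
    return sentences


# Hypothetical sample text standing in for the real corpus.
sample = "The quick brown fox jumps over the lazy dog.\nA dog, however lazy, is still a dog!"
sentences = tokenize_corpus(sample)
print(sentences)
```

The resulting list of token lists is the input shape that Gensim's Word2Vec expects, so the same function can be reused unchanged when the process is repeated on a larger corpus.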

Tools:

Google Colab/Jupyter Notebook
Language: Python
Word2Vec from Gensim
Matplotlib | Pyplot

Mentor

Prof. Sandipan Ganguly, HIT-K.

Developer

Rajdeep Das
