Trump Or Trudeau?

This is a simple project that based on the tweet-datasets of Donald Trump and Justin Trudeau predicts the real owner or author of the tweet.

This Jupyter Notebook uses the following libraries/functionalitites :-

sklearn.feature_exrtaction.text CountVectorizer and TfidfVectorizer
sklearn.model_selection train_test_split
sklearn.naive_bayes MultinomialNB
sklearn.svm LinearSVC
sklearn metrics

Since the data has been collected via the Twitter API and not split into test and training sets. We use test_train_split() with random_state=53 and a test size of 0.33. This will ensure we have enough test data and we'll get the same results no matter where or when we run this code.

After that, we need to create vectorized representations of the tweets in order to apply machine learning.To do so, we will utilize the CountVectorizer and TfidfVectorizer classes which we will first need to fit to the data.

Now that we have the data in vectorized form, we can train the first model. We will be using the Multinomial Naive Bayes model with both the CountVectorizer and TfidfVectorizer data.

We see that the TF-IDF model performs better than the count-based approach

A better evaluation can be made if we look at the confusion matrix, which shows the number correct and incorrect classifications based on each class. We can use the metrics, True Positives, False Positives, False Negatives, and True Negatives, to determine how well the model performed on a given class.

The LinearSVC model is even better than the Multinomial Bayesian one. Via the confusion matrix we can see that, although there is still some confusion where Trudeau's tweets are classified as Trump's, the False Positive rate is better than the previous model.

Using the LinearSVC Classifier with two classes (Trump and Trudeau) we can sort the features (tokens), by their weight and see the most important tokens for both Trump and Trudeau.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
datasets		datasets
README.md		README.md
upd_notebook.ipynb		upd_notebook.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Trump Or Trudeau?

About

Releases

Packages

Languages

memeghaj10/Trump_Or_Trudeau-

Folders and files

Latest commit

History

Repository files navigation

Trump Or Trudeau?

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages