1: Run the code in Jupyter Notebook or Google Colab.
2: Ensure that all required modules are installed before running the code. The following third-party modules must be installed: requests, nltk, pandas, numpy, scikit-learn (imported as sklearn), xgboost, mlxtend, matplotlib, wordcloud, multidict. The os, re, string, random, and collections modules are part of the Python standard library and need no installation.
If you plan to run the BERT model, also install: transformers, torch, tqdm.
3: Run the code in sequential order. This matters most for the sections leading up to the split of the dataset into training and testing sets.
4: You may skip code chunks that are not of interest to you, such as the word cloud and frequency distribution charts, or the chunks that inspect the structure of selected data and models.
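
Before starting, it may help to check which of the modules from step 2 are missing from your environment. The following is a minimal sketch, not part of the original code: the names REQUIRED, BERT_EXTRAS, and missing_packages are hypothetical helpers introduced here for illustration. It only tests whether each module can be located; it does not install anything or check versions.

```python
import importlib.util

# Import name -> name to pass to pip. Only scikit-learn differs:
# it is installed as "scikit-learn" but imported as "sklearn".
REQUIRED = {
    "requests": "requests",
    "nltk": "nltk",
    "pandas": "pandas",
    "numpy": "numpy",
    "sklearn": "scikit-learn",
    "xgboost": "xgboost",
    "mlxtend": "mlxtend",
    "matplotlib": "matplotlib",
    "wordcloud": "wordcloud",
    "multidict": "multidict",
}

# Only needed if you plan to run the BERT model.
BERT_EXTRAS = {
    "transformers": "transformers",
    "torch": "torch",
    "tqdm": "tqdm",
}


def missing_packages(modules):
    """Return the pip names of modules that cannot be found in this environment."""
    return [
        pip_name
        for import_name, pip_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages(REQUIRED)
if missing:
    # In Jupyter or Colab, run the printed command in a cell prefixed with "!".
    print("Missing modules. Install with: pip install " + " ".join(missing))
else:
    print("All required modules are available.")

missing_bert = missing_packages(BERT_EXTRAS)
if missing_bert:
    print("For the BERT model, also run: pip install " + " ".join(missing_bert))
```

Standard library modules (os, re, string, random, collections) are left out of the check because they ship with Python.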