Detect malicious URLs using machine learning models

Environment tips:

This project runs under Python 3.11. Installing lightgbm on macOS can fail because gcc is needed to compile the package. Here are the instructions for installing lightgbm on macOS.

Description:

  1. features_extraction.py extracts 31 features from each URL: general features, length features, count features, ratio features, and domain features, as shown in the features table.
  2. model training.py trains the different ML models and draws the heatmap.
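
To make the feature categories concrete, here is a minimal sketch of lexical URL feature extraction using only the Python standard library. It is not the code in features_extraction.py: the function name `extract_features` and the eight feature names are hypothetical, chosen to show one or two examples per category (general, length, count, ratio, domain) rather than the project's actual 31 features.

```python
from urllib.parse import urlparse


def extract_features(url: str) -> dict:
    """Return a few illustrative lexical features for one URL.

    Feature names are hypothetical and do not mirror the project's
    features table.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no netloc
    digit_count = sum(ch.isdigit() for ch in url)

    return {
        # general feature
        "uses_https": int(parsed.scheme == "https"),
        # length features
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        # count features
        "dot_count": url.count("."),
        "digit_count": digit_count,
        # ratio feature (guard against an empty string)
        "digit_ratio": digit_count / len(url) if url else 0.0,
        # domain feature: labels in front of "name.tld", a rough heuristic
        "subdomain_count": max(host.count(".") - 1, 0),
    }


if __name__ == "__main__":
    print(extract_features("https://sub.example.com/login.php?id=42"))
```

Each URL becomes one row of numeric values, so a list of URLs can be turned into a feature matrix by calling the function once per URL.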

The datasets were collected from:

  1. Kaggle
  2. UNB
  3. URLhaus
  4. Mendeley

The applied models are:

Logistic Regression, KNN, SVM, Decision Trees, Random Forest, Bagging, and AdaBoost
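
As a rough illustration of how these seven model families could be compared, the sketch below fits each one with scikit-learn and reports test accuracy. It assumes scikit-learn is installed and uses synthetic data from `make_classification` in place of the real URL features; the function name `compare_models`, the hyperparameters, and the train/test split are assumptions and may differ from what model training.py does.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit each classifier and return {model name: test accuracy}."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    # Mostly default hyperparameters; these are illustrative choices only.
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "KNN": KNeighborsClassifier(),
        "SVM": SVC(),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(random_state=seed),
        "Bagging": BaggingClassifier(random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy
    return scores


if __name__ == "__main__":
    # Synthetic stand-in for the extracted URL feature matrix and labels.
    X, y = make_classification(n_samples=400, n_features=10, random_state=0)
    for name, acc in compare_models(X, y).items():
        print(f"{name}: {acc:.3f}")
```

On real data, distance- and margin-based models such as KNN and SVM usually benefit from feature scaling, which this sketch leaves out for brevity.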

Contributors:

xinyanzhang27