HPE-Hackathon-Techgig

This repository contains the solution submitted in the final round of the HPE Machine Learning Hackathon organised by Techgig.

Objective:

Participants must analyse the data and build a model that identifies whether a website is legitimate or a phishing website. The result takes one of two values, [1, -1], where 1 represents a legitimate website and -1 represents a phishing website.

Data Description:

The data set consists of 30 features describing a website. Each attribute takes a value in [1, 0, -1], except the key, which is an incremental identifier.

[1] represents legitimate
[0] represents suspicious
[-1] represents phishing
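
As a small illustration of this encoding, the sketch below decodes a row of attribute values and rejects anything outside [1, 0, -1]. The names `VALUE_MEANING` and `check_row` are hypothetical and do not come from the notebook, and the three-value row is made up for the example.

```python
# Meaning of each attribute value, as listed above.
VALUE_MEANING = {1: "legitimate", 0: "suspicious", -1: "phishing"}


def check_row(features):
    """Return the decoded meaning of each value, or raise on an unexpected one."""
    decoded = []
    for value in features:
        if value not in VALUE_MEANING:
            raise ValueError(f"unexpected attribute value: {value!r}")
        decoded.append(VALUE_MEANING[value])
    return decoded


# Hypothetical three-feature row, for illustration only (real rows have 30 features).
print(check_row([1, 0, -1]))  # ['legitimate', 'suspicious', 'phishing']
```

A check like this is only meant for the feature columns; the incremental key would need to be excluded first.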

Features:

The dataset has 32 columns in total and 8955 rows. The columns are based on address-bar, domain, and HTML and JavaScript features.

Modeling and training:

We first split the data 60/40, giving 5373 training samples and 3582 testing samples. The objective and the labelled dataset make this a supervised learning task. Supervised problems fall into two types, classification and regression.
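
The split sizes can be checked with a short sketch. It uses only the standard library, and the helper `split_rows`, the seed, and the stand-in row list are assumptions made for illustration; the notebook may well use a library routine such as scikit-learn's `train_test_split` instead.

```python
import random


def split_rows(rows, train_fraction=0.6, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(8955))  # stand-in for the 8955 dataset rows
train, test = split_rows(rows)
print(len(train), len(test))  # 5373 3582
```

Fixing the seed keeps the split reproducible, so the models are compared on the same test rows across runs.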

The objective is a classification problem: each input URL is classified as phishing (-1) or legitimate (1). The supervised classification models trained on the dataset in this project are:

XGBoost
Random Forest
Decision Tree
Multilayer Perceptrons
SVM

Approach:

Analyze and preprocess the dataset, and do basic EDA including a correlation plot.
Split the data into training and test sets.
Run the selected models.
Do the performance evaluation.
Compare the performance of the models.
Save the top performing model and run it on sample test data.
Save the results.
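
The evaluation and comparison steps above can be sketched as follows, assuming accuracy is the comparison metric. The helpers `accuracy` and `best_model`, the model names, and all labels and predictions here are made up for illustration; they are not results from the hackathon.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def best_model(y_true, predictions):
    """Return (name, accuracy) of the highest-scoring model."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Made-up test labels: 1 = legitimate, -1 = phishing.
y_true = [1, -1, 1, 1, -1]
predictions = {
    "model_a": [1, -1, 1, -1, -1],  # 4 of 5 correct
    "model_b": [1, 1, 1, -1, 1],    # 2 of 5 correct
}
print(best_model(y_true, predictions))  # ('model_a', 0.8)
```

Accuracy alone can hide differences between the two classes, so a per-class breakdown may be worth adding if phishing and legitimate rows are unevenly represented.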

Accuracy of the Models:

🌞Newsflash- My team of three ranked 27th, 28th and 29th respectively out of almost 8000 participants and we got felicitated by the Dean and Director of CHRIST (Deemed to be University) Pune Lavasa Campus🥳

