An exploratory data analysis and machine learning project using the Iris dataset to classify flower species with a K-Nearest Neighbors classifier. It includes data visualization, feature scaling, model training, and evaluation with 100% accuracy on the test set.

ascender1729/iris-flower-classification-2024

Iris Flower Classification Analysis

The Iris Flower Classification Analysis is a machine learning project that uses the Iris dataset, along with Bezdek's variant of it, to predict Iris species with the K-Nearest Neighbors (KNN) algorithm. It covers data handling, visualization, and model evaluation.

Project Overview

This project offers a detailed exploration and analysis of the Iris flower dataset, including data integrity checks, feature scaling, and dimensionality reduction through PCA to optimize classification performance. Enhanced visualization techniques aid in understanding the intricate relationships within the data.
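
The preprocessing steps named above (integrity checks, scaling, PCA) can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's bundled copy of Iris as a stand-in for the project's data files, and the choice of two principal components is an assumption made for plotting convenience.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris copy stands in for the project's data files.
iris = load_iris(as_frame=True)
df = iris.frame  # four feature columns plus an integer "target" column

# Data integrity checks: count missing values and fully duplicated rows.
n_missing = int(df.isna().sum().sum())
n_duplicates = int(df.duplicated().sum())
df_clean = df.drop_duplicates().reset_index(drop=True)

# Feature scaling: standardize each feature to zero mean and unit variance.
X = df_clean.drop(columns="target")
X_scaled = StandardScaler().fit_transform(X)

# Dimensionality reduction: project the four features onto two principal components
# (two is an assumed value, chosen here so the result can be drawn as a scatter plot).
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print(f"missing={n_missing}, duplicates={n_duplicates}, rows kept={len(df_clean)}")
print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
```

Here the scaler and PCA are fitted on the full table for exploration only; when evaluating a classifier, they would normally be fitted on the training split alone.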

Features

  • Data Integration: Utilizes Google Colab for seamless integration and data manipulation.
  • Dual Dataset Analysis: Analysis includes both the original and Bezdek's Iris datasets to ensure robustness.
  • Advanced Data Handling: Includes detection and removal of duplicate entries.
  • Feature Scaling and PCA: Implements StandardScaler to standardize features (zero mean, unit variance) and PCA to reduce dimensionality.
  • Enhanced Visualization: Uses Seaborn and Matplotlib to visualize data in reduced dimensions.
  • Precision Modeling: Applies a KNN model with tuned parameters; the repository description reports 100% accuracy on the test set.
  • Model Evaluation: Assesses the model's accuracy through advanced metrics.
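
As a rough sketch of how the scaling, PCA, KNN, and evaluation features fit together, the example below chains them in a scikit-learn pipeline. The split ratio, `random_state`, `n_components=2`, and `n_neighbors=5` are illustrative assumptions, not the values used in the notebook, and the bundled scikit-learn copy of Iris again stands in for the project's data files.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris copy stands in for the project's data files.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples for testing, keeping the class balance (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The pipeline fits the scaler and PCA on the training split only,
# so no information from the test split leaks into preprocessing.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),                  # assumed number of components
    KNeighborsClassifier(n_neighbors=5),  # assumed k; the notebook's value may differ
)
model.fit(X_train, y_train)

# Evaluate on the held-out samples.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"accuracy: {accuracy:.3f}")
print(cm)
print(classification_report(y_test, y_pred))
```

The score printed by this sketch depends on the assumed split and parameters, so it will not necessarily match the accuracy reported for the project.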

Data Description

Two Iris datasets are used, each containing 150 samples of Iris flowers with four measurements and a species label:

  • Sepal Length
  • Sepal Width
  • Petal Length
  • Petal Width
  • Species (Iris-setosa, Iris-versicolor, Iris-virginica)
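
To make the schema above concrete, the following sketch builds a table with those five columns and checks its shape and class balance. It assumes scikit-learn's bundled Iris copy rather than the project's own files; the column renaming and the `Iris-` prefix on species names are added here only to match the labels listed above.

```python
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris copy stands in for the project's data files.
iris = load_iris(as_frame=True)

# Rename scikit-learn's column names to the labels used in this README.
df = iris.frame.rename(
    columns={
        "sepal length (cm)": "Sepal Length",
        "sepal width (cm)": "Sepal Width",
        "petal length (cm)": "Petal Length",
        "petal width (cm)": "Petal Width",
    }
)

# Replace the integer target with species names such as "Iris-setosa".
df["Species"] = df.pop("target").map(lambda i: f"Iris-{iris.target_names[i]}")

print(df.shape)                      # rows and columns
print(df["Species"].value_counts())  # samples per species
print(df.describe())                 # summary statistics for the four measurements
```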

Installation

Setup for Google Colab:

from google.colab import drive
drive.mount('/content/drive')

Clone the repository and navigate to the project directory:

git clone https://github.com/ascender1729/iris-flower-classification-2024.git
cd iris-flower-classification-2024

Usage

Install the required libraries:

pip install pandas numpy seaborn matplotlib scikit-learn

Run the Jupyter notebook via Google Colab for a comprehensive walkthrough.

Contributing

Contributions are welcome to extend the analysis or improve the existing methodologies.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Pavan Kumar - [email protected]

LinkedIn: @ascender1729

Project Link: iris-flower-classification-2024
