This repository contains Python code for performing Exploratory Data Analysis (EDA) on the Iris dataset. The Iris dataset is a classic dataset in machine learning and statistics, consisting of 150 samples, 50 from each of three species of Iris flowers (Iris setosa, Iris versicolor, and Iris virginica). The dataset includes four features measured for each sample: the lengths and widths of the sepals and petals.
- iris_exploration.ipynb: Python notebook containing the code for the EDA.
- README.md: Brief description of the EDA and instructions for running the code.
- Basic statistics, including mean, median, and standard deviation, were computed for each feature.
- Histograms, pair plots, box plots, and violin plots were used to visualize the distribution and relationships between features.
- Correlation analysis was performed to identify the relationships between features and their impact on species classification.
- The EDA concludes that petal measurements (length and width) are more significant for species classification than sepal measurements.
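The statistics and correlation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the data is loaded from the copy bundled with scikit-learn via `load_iris`, and the variable names (`summary`, `correlation`, `species_means`) are hypothetical.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: load the copy of the Iris dataset bundled with scikit-learn.
# The notebook may read the data from a different source (e.g. a CSV file).
iris = load_iris(as_frame=True)
df = iris.frame.copy()

# Replace the numeric target with readable species names.
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))
df = df.drop(columns="target")

feature_cols = list(iris.feature_names)

# Basic statistics per feature: mean, median, and standard deviation.
summary = df[feature_cols].agg(["mean", "median", "std"]).T

# Pearson correlation between the four features.
correlation = df[feature_cols].corr()

# Per-species means hint at which features separate the species best.
species_means = df.groupby("species")[feature_cols].mean()

print(summary.round(2))
print(correlation.round(2))
print(species_means.round(2))
```

In the per-species means, the petal columns differ far more between species than the sepal columns do, which is consistent with the conclusion stated above.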
- Clone the repository:
git clone https://github.com/THAMIZH-ARASU/Exploratory-Data-Analysis-on-The-Iris-Data-Set.git
- Open the notebook in Jupyter Notebook and run its cells:
iris_exploration.ipynb
Feel free to explore the code and visualizations to gain insights into the Iris dataset!