This project uses machine learning to predict whether a student will pass or fail from factors such as attendance, study habits, and parental support. A Random Forest Classifier performs the classification, and the model is evaluated with a confusion matrix, accuracy, precision, and recall.
The dataset includes features like:
- Absences
- Weekly Study Time
- Tutoring
- Parental Support
- Extracurricular Activities
- Sports, Music, Volunteering
- Parental Education
- Age
- GPA (used to derive the target label)
Target variable: Pass (1 = GPA ≥ 2.0, 0 = GPA < 2.0)
Data Preprocessing
- Created a binary target column (Pass) based on GPA.
- Selected relevant features for prediction.
- Split data into training (80%) and testing (20%) sets.
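The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the DataFrame is synthetic stand-in data, and the column names (`StudyTimeWeekly`, `ParentalSupport`, and so on) are assumed rather than taken from the real dataset. Dropping GPA from the feature set and stratifying the split are also assumptions; GPA is dropped here because the label is derived from it, so keeping it would leak the answer to the model.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (values and column names are assumed).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Absences": rng.integers(0, 30, n),
    "StudyTimeWeekly": rng.uniform(0, 20, n),
    "Tutoring": rng.integers(0, 2, n),
    "ParentalSupport": rng.integers(0, 5, n),
    "GPA": rng.uniform(0, 4, n),
})

# Binary target: 1 if GPA >= 2.0, otherwise 0.
df["Pass"] = (df["GPA"] >= 2.0).astype(int)

# Assumption: GPA is left out of the features because the label is derived from it.
feature_cols = ["Absences", "StudyTimeWeekly", "Tutoring", "ParentalSupport"]
X = df[feature_cols]
y = df["Pass"]

# 80% training / 20% testing; stratify keeps the pass/fail ratio similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

With the real data, the synthetic DataFrame would be replaced by a `pd.read_csv(...)` call on the uploaded file.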
Model
- Used a Random Forest Classifier from scikit-learn.
- Evaluated using confusion matrix, accuracy, precision, and recall.
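A minimal sketch of training and evaluating the classifier is shown below. It runs on synthetic data so that it is self-contained; the hyperparameters (`n_estimators=100`, `random_state=42`) are illustrative assumptions, not the project's confirmed settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split

# Synthetic features and labels standing in for the student data (assumed).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the Random Forest on the training split and predict on the held-out split.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

print(cm)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Here `cm[1, 1]` counts students correctly predicted to pass, `cm[0, 1]` counts failing students predicted to pass, and `cm[1, 0]` counts passing students predicted to fail. Precision is `cm[1, 1] / (cm[1, 1] + cm[0, 1])` and recall is `cm[1, 1] / (cm[1, 1] + cm[1, 0])`.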
- Upload the dataset
- Install required libraries:
pip install pandas seaborn scikit-learn matplotlib