nardyjh/credit-risk-classification

Credit Risk Classification

Overview

This project involves the analysis and development of machine learning models to predict credit risk using historical lending data. The goal is to build models that can classify loans as either healthy (class 0) or high-risk (class 1). The analysis includes data preprocessing, model training, and evaluation.
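The train-and-evaluate workflow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses a synthetic, imbalanced dataset from scikit-learn's `make_classification` as a stand-in for lending_data.csv, because the real column names and class ratio are not described here. The function name `train_and_evaluate` and all parameter values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split


def train_and_evaluate(random_state=1):
    """Fit a logistic regression on synthetic imbalanced data (hypothetical helper)."""
    # Assumption: synthetic stand-in for lending_data.csv, with roughly 3%
    # of samples in the high-risk class (class 1).
    X, y = make_classification(
        n_samples=2000,
        n_features=5,
        n_informative=3,
        n_redundant=0,
        weights=[0.97, 0.03],
        random_state=random_state,
    )
    # Stratify so both splits keep the same class ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, stratify=y, random_state=random_state
    )
    model = LogisticRegression(max_iter=200, random_state=random_state)
    model.fit(X_train, y_train)
    return y_test, model.predict(X_test)


y_test, y_pred = train_and_evaluate()
print("accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred, zero_division=0))
```

With the real dataset, the synthetic data would be replaced by a `pandas.read_csv` call followed by separating the label column from the features.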

Project Structure

  • Credit_Risk/: Folder containing project files.
    • credit_risk_classification.ipynb: Jupyter Notebook with the main code for the project.
    • lending_data.csv: Dataset containing historical lending data.
    • final_report.md: Detailed report providing insights into the analysis and results.
  • README.md: Project documentation and overview.

Requirements

  • Python 3.x
  • Jupyter Notebook
  • Required Python libraries: numpy, pandas, scikit-learn, imbalanced-learn
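Before opening the notebook, it can help to confirm that the libraries listed above are importable. The sketch below is one way to do that using only the standard library; the helper name `check_requirements` is hypothetical. Note that scikit-learn and imbalanced-learn are imported under the module names `sklearn` and `imblearn`.

```python
from importlib.util import find_spec

# Maps each package name to the module name used in import statements.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "imbalanced-learn": "imblearn",
}


def check_requirements(required=REQUIRED):
    """Return {package name: True if its module can be found} (hypothetical helper)."""
    return {
        package: find_spec(module) is not None
        for package, module in required.items()
    }


for package, installed in check_requirements().items():
    print(f"{package}: {'ok' if installed else 'MISSING'}")
```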

Instructions

  1. Clone the repository:
     git clone https://github.com/nardyjh/credit-risk-classification.git
  2. From the repository root (the credit-risk-classification directory created by the clone), open and run the Jupyter Notebook:
     jupyter notebook Credit_Risk/credit_risk_classification.ipynb
  3. Follow the instructions in the notebook to execute the analysis.

Results

The project develops and compares two logistic regression models: one trained on the original data and one trained on resampled data.

  1. Original Logistic Regression Model:
    • Accuracy: 99.18%
    • Precision (Class 1): 85%
    • Recall (Class 1): 91%
    • F1-Score (Class 1): 88%
  2. Logistic Regression Model with Resampled Data:
    • Accuracy: 99.38%
    • Precision (Class 1): 84%
    • Recall (Class 1): 99%
    • F1-Score (Class 1): 91%
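As a reminder of how the class 1 metrics above relate to each other, the sketch below computes precision, recall, and F1 from confusion-matrix counts, and recomputes F1 from the reported precision and recall of each model. The function names are hypothetical, and the counts in `precision_recall_f1` are whatever the caller supplies; the confusion matrices themselves are not reproduced here.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Compute class-level metrics from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from_precision_recall(precision, recall)


# Reported class 1 (precision, recall, F1) values for each model.
reported = {
    "original": (0.85, 0.91, 0.88),
    "resampled": (0.84, 0.99, 0.91),
}

for name, (p, r, f1) in reported.items():
    recomputed = f1_from_precision_recall(p, r)
    print(f"{name}: reported F1 = {f1:.2f}, recomputed F1 = {recomputed:.4f}")
```

Rounded to two decimals, both recomputed values match the reported F1 scores, so the listed figures are internally consistent.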

Final Report

For a more detailed report and insights, refer to the final_report.md file.

Conclusion

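The logistic regression model trained on resampled data outperformed the original model overall. Accuracy rose slightly (99.38% vs. 99.18%), and recall for high-risk loans rose from 91% to 99%, lifting the class 1 F1-score from 88% to 91%. Class 1 precision fell by one point (84% vs. 85%), so the resampled model flags slightly more healthy loans as high-risk in exchange for missing far fewer high-risk ones.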
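This README does not state which resampling method the notebook uses. Assuming random oversampling (which imbalanced-learn provides as `RandomOverSampler`), the idea can be sketched in plain Python as below: minority-class rows are duplicated at random until every class is as large as the largest one. The function `random_oversample` and the toy data are hypothetical and only illustrate the technique.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=1):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (assumption): 8 healthy loans (class 0), 2 high-risk loans (class 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

new_rows, new_labels = random_oversample(rows, labels)
print(Counter(labels), "->", Counter(new_labels))
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class balance.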

References

The dataset was generated by edX Boot Camps LLC and is intended for educational purposes only. University of Toronto.
