Credit Approval Analysis

This project analyzes credit approval data with machine learning models and data visualization techniques.

Introduction

The analysis trains machine learning models on a credit approval dataset (crx.data) to predict whether an application is approved. Two models are used:

- Decision Tree Classifier (with hyperparameter tuning via GridSearchCV)
- Logistic Regression (for comparison)

The dataset was preprocessed to handle missing values and categorical variables. Feature importance and correlations were visualized to gain insights into the data.

Technologies Used

Python 3.x
Libraries:
- numpy
- pandas
- matplotlib
- seaborn
- scikit-learn
- json (Python standard library)

Setup and Installation

This project was developed using Google Colab. To replicate the analysis:

- Open Google Colab.
- Upload the notebook (Credit_Approval.ipynb) to your Colab environment.
- Upload the dataset (crx.data) to the Colab environment.
- Run the notebook cell by cell.

Google Colab comes with the libraries listed above pre-installed, so no additional installation should be needed. If a library is missing, install it from a notebook cell:

!pip install library_name

Data Preprocessing

The dataset contains both numeric and categorical features. Missing values were handled as follows:

- Numeric columns: missing values were replaced with the column mean.
- Categorical columns: missing values were replaced with the most frequent value, then each column was encoded with LabelEncoder.

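Features were then standardized with StandardScaler (zero mean, unit variance). This mainly helps Logistic Regression; decision trees are largely insensitive to feature scale.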
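
The preprocessing steps above could be sketched roughly as follows. This is an illustrative sketch, not necessarily the notebook's code: the `preprocess` helper and the small inline `SAMPLE` are hypothetical, and the sketch assumes the file has no header row, marks missing values with "?", and keeps the target in the last column.

```python
import io

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Tiny made-up sample in a crx.data-like layout (no header, "?" = missing).
# The real file has more columns; these rows are illustrative only.
SAMPLE = """b,30.83,0,u,1.25,+
a,?,4.46,u,3.04,+
?,24.50,0.5,y,1.5,-
b,27.83,1.54,?,3.75,-
"""


def preprocess(df):
    """Impute, encode and scale a frame whose last column is the target."""
    df = df.copy()
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            # Numeric column: fill gaps with the column mean.
            df[col] = df[col].fillna(df[col].mean())
        else:
            # Categorical column: fill gaps with the most frequent value,
            # then map each category to an integer code.
            df[col] = df[col].fillna(df[col].mode()[0])
            df[col] = LabelEncoder().fit_transform(df[col])
    features = df.iloc[:, :-1]
    target = df.iloc[:, -1].to_numpy()
    # Standardize every feature to zero mean and unit variance.
    scaled = StandardScaler().fit_transform(features)
    return scaled, target


df = pd.read_csv(io.StringIO(SAMPLE), header=None, na_values="?")
X, y = preprocess(df)
print(X.shape, y)
```

To run this on the real data, `io.StringIO(SAMPLE)` would be replaced by the path to crx.data. Fitting the scaler on the training split only, rather than on the full dataset as in this sketch, avoids leaking test-set statistics into training.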

Model Training and Evaluation

Decision Tree Classifier
- Hyperparameter tuning with GridSearchCV:
  - max_depth
  - min_samples_split
  - min_samples_leaf
- Achieved an accuracy of X.XXXX.
Logistic Regression
- Comparison model.
- Achieved an accuracy of X.XXXX.
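
A minimal sketch of how the tuning and comparison might be set up is shown below. The grid values, the 80/20 split, the 5-fold cross-validation and the synthetic data from `make_classification` are all assumptions made for illustration; the values actually searched in the notebook are not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed credit data (binary target).
X, y = make_classification(n_samples=300, n_features=15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hypothetical search grid over the three tuned hyperparameters.
param_grid = {
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Score the best tree found by the search on the held-out split.
tree_accuracy = search.best_estimator_.score(X_test, y_test)

# Logistic Regression as the comparison model.
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)
logreg_accuracy = log_reg.score(X_test, y_test)

print("Best parameters:", search.best_params_)
print(f"Decision tree accuracy: {tree_accuracy:.4f}")
print(f"Logistic regression accuracy: {logreg_accuracy:.4f}")
```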

Evaluation metrics included:

- Accuracy
- Classification Report
- Confusion Matrix
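
All three metrics are available in `sklearn.metrics`. The sketch below uses made-up labels purely to show the calls; the label meaning (1 = approved, 0 = rejected) and the class names are assumptions.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Made-up labels for illustration only (1 = approved, 0 = rejected).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Fraction of predictions that match the true labels.
accuracy = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Per-class precision, recall and F1 as a formatted string.
report = classification_report(
    y_true, y_pred, target_names=["rejected", "approved"]
)

print(f"Accuracy: {accuracy:.4f}")
print(cm)
print(report)
```

In the confusion matrix, the diagonal counts correct predictions and the off-diagonal cells count the two kinds of error (approving a case that was rejected and the reverse).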

Results and Visualization

Decision Tree Visualization
- A graphical representation of the optimized decision tree is provided.
Feature Importance
- Features were ranked based on their importance in the decision tree model.
Correlation Matrix
- Highlighted relationships between features.
Histograms
- Showed the distribution of numeric features, categorized by the target class.
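
The numbers behind the feature importance and correlation plots can be computed as in the sketch below. It is a sketch under assumptions: the data is synthetic, the column names `f0` to `f5` are placeholders, and `max_depth=4` is an arbitrary choice rather than the tuned value.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed features; names are placeholders.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, random_state=0
)
features = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(features, y)

# Rank features by their importance in the fitted tree (highest first).
importances = pd.Series(
    tree.feature_importances_, index=features.columns
).sort_values(ascending=False)

# Pairwise Pearson correlation between features.
corr = features.corr()

print(importances)
print(corr.round(2))
```

From here, `seaborn.heatmap(corr)` can render the correlation matrix and `sklearn.tree.plot_tree(tree)` can draw the fitted tree.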

Saving Results

Key results were saved to a JSON file (resultados_analise_credito.json) for easy sharing and further analysis. These include:

- Model accuracies
- Best hyperparameters for the Decision Tree
- Feature importances
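
Results produced by scikit-learn are often NumPy types, which `json.dump` cannot serialize directly. The sketch below shows one way to handle that. The key names, the `to_builtin` helper and all values are hypothetical placeholders, not the project's actual results.

```python
import json

import numpy as np


def to_builtin(value):
    """Convert NumPy scalars and arrays to plain Python types for json.dump."""
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, np.ndarray):
        return value.tolist()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


# Placeholder values only; real numbers would come from the trained models.
results = {
    "accuracies": {
        "decision_tree": np.float32(0.5),
        "logistic_regression": np.float32(0.5),
    },
    "best_params": {
        "max_depth": np.int64(5),
        "min_samples_split": np.int64(2),
        "min_samples_leaf": np.int64(1),
    },
    "feature_importances": np.array([0.25, 0.75]),
}

out_path = "resultados_analise_credito.json"
with open(out_path, "w", encoding="utf-8") as f:
    # default= is called for every object json cannot serialize itself.
    json.dump(results, f, indent=2, default=to_builtin)

# Read the file back to confirm it is valid JSON.
with open(out_path, encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded["best_params"])
```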
