First-stage project for Udacity's 'Intro to Machine Learning with TensorFlow' program, using scikit-learn in Python

BaraSedih11/finding_donors

Finding_Donors

This repository contains a notebook that trains, tunes, and tests a prediction model, and identifies the best estimator and the most important features for the dataset.

Introduction

We explored three models and ultimately chose the Random Forest model, which proved to be the most suitable for our dataset. We then fine-tuned the hyperparameters to obtain the best estimators and identified the top 5 features. Finally, we trained a reduced model using these features.
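The tuning and feature-reduction steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses synthetic data from `make_classification` in place of census.csv, and the parameter grid, the accuracy scoring, and all variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real dataset (assumption: the notebook uses census.csv).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical hyperparameter grid; the notebook's grid may differ.
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="accuracy",
    cv=3,
)
search.fit(X_train, y_train)
best_model = search.best_estimator_

# Indices of the 5 features with the highest importance in the tuned model.
top5 = np.argsort(best_model.feature_importances_)[::-1][:5]

# Retrain an unfitted copy of the tuned model on those 5 features only.
reduced_model = clone(best_model).fit(X_train[:, top5], y_train)

full_acc = best_model.score(X_test, y_test)
reduced_acc = reduced_model.score(X_test[:, top5], y_test)
print(f"Best parameters: {search.best_params_}")
print(f"Full model accuracy:    {full_acc:.3f}")
print(f"Reduced model accuracy: {reduced_acc:.3f}")
```

Comparing the two scores shows how much predictive power is retained when only the five most important features are kept, which is the trade-off a reduced model is meant to expose.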

Contents

  • finding_donors.ipynb: Jupyter Notebook containing the Random Forest implementation in Python.
  • report.html: An HTML export of the Jupyter Notebook.
  • README.md: This file, which provides an overview of the repository.
  • census.csv: The working dataset.

Requirements

To run the code in the Jupyter Notebook, you need to have Python installed on your system along with the following libraries:

  • NumPy
  • pandas
  • scikit-learn
  • matplotlib
  • seaborn

You can install these libraries using pip:

pip install numpy pandas scikit-learn matplotlib seaborn
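To confirm the libraries are available before opening the notebook, a small check along these lines may help. It is only a sketch: the helper name `installed_versions` is hypothetical, and it relies on the standard-library `importlib.metadata` module (Python 3.8+) to look up each package by its pip distribution name.

```python
from importlib.metadata import PackageNotFoundError, version

# pip distribution names of the libraries listed above.
REQUIRED = ["numpy", "pandas", "scikit-learn", "matplotlib", "seaborn"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(REQUIRED)
for name, ver in report.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```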

Usage

  1. Clone this repository to your local machine:
git clone https://github.com/BaraSedih11/finding_donors.git
  2. Navigate to the repository directory:
cd finding_donors
  3. Open and run the Jupyter Notebook finding_donors.ipynb using Jupyter Notebook or JupyterLab.
  4. Follow along with the code and comments in the notebook to understand how the Random Forest model is trained and tuned in Python.

Acknowledgements

  • scikit-learn: The scikit-learn library for machine learning in Python.
  • NumPy: The NumPy library for numerical computing in Python.
  • pandas: The pandas library for data manipulation and analysis in Python.
  • matplotlib: The matplotlib library for data visualization in Python.
  • seaborn: The seaborn library for data visualization in Python.
