This Jupyter notebook serves as a machine learning template to quickly make predictions and analyse feature importance in a dataset.

xPrithvi/Random-Forest-Regressor

Random Forest Regressor

This Jupyter notebook is one step in a data science pipeline: it provides a quick framework for feature engineering, model training and feature importance analysis during data exploration. In this notebook, scikit-learn's RandomForestRegressor is trained on housing data from Perth to predict house prices from floor space, suburb, number of bedrooms and other features. Feature importance is first computed with the model's built-in impurity-based method, which scores each feature by how much it reduces node impurity. SHAP is then used to provide a more robust and in-depth analysis via Shapley values.
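The training and impurity-based importance steps described above might look roughly like the sketch below. It is not the notebook's actual code: the data is synthetic, and the column names (`floor_area`, `bedrooms`, `suburb`, `price`) and hyperparameters are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 300
houses = pd.DataFrame({
    "floor_area": rng.uniform(60, 400, n),
    "bedrooms": rng.integers(1, 6, n),
    "suburb": rng.choice(["Suburb A", "Suburb B", "Suburb C"], n),
})
# Price is driven mostly by floor area, so that feature should rank highest.
houses["price"] = (
    3000 * houses["floor_area"]
    + 20000 * houses["bedrooms"]
    + rng.normal(0, 10000, n)
)

# One-hot encode the categorical column so the model receives numeric input.
X = pd.get_dummies(houses.drop(columns="price"), columns=["suburb"], dtype=float)
y = houses["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importances: one value per input column, normalised to sum to 1.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
r2 = model.score(X_test, y_test)  # coefficient of determination on held-out rows

print(importances)
print(f"R^2 on test split: {r2:.3f}")
```

Impurity-based importances are computed from the training data and tend to favour high-cardinality features, which is one reason to cross-check them against Shapley values.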

Features

  • Model saving and loading.
  • Hyperparameter tuning via Bayesian optimization.
  • Feature importance analysis using tree node impurity and Shapley values.
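As an illustration of the model saving and loading feature, here is a minimal sketch using `joblib`, which is installed alongside scikit-learn. Whether the notebook uses `joblib` or `pickle` is not stated here, so treat the choice of library, the file name `model.joblib` and the synthetic data as assumptions.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Small synthetic regression problem, used only to have a fitted model to save.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 5 * X[:, 0] + rng.normal(0, 0.1, 200)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")  # hypothetical file name
    joblib.dump(model, path)       # serialise the fitted model to disk
    restored = joblib.load(path)   # read it back into a new object

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X), restored.predict(X))
print(same_predictions)
```

`joblib.dump` also accepts a `compress` argument that trades save and load time for a smaller file. Saved models should only be loaded from trusted sources, ideally with the same scikit-learn version that produced them.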

Future Improvements

  • Custom user input to the model (involves writing a custom data encoder instead of using pandas.get_dummies()).
  • Reducing the disk size of saved models.
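Custom input is hard with `pandas.get_dummies()` alone because a single user-supplied row only yields dummy columns for the categories it contains, so its layout does not match the training matrix. One possible workaround, sketched below, is a small wrapper that remembers the training columns and reindexes new input onto them. The class name `DummyEncoder` and the example columns are hypothetical, and the sketch still calls `get_dummies` internally rather than replacing it.

```python
import pandas as pd


class DummyEncoder:
    """Hypothetical helper that pins the one-hot column layout seen at fit time."""

    def fit(self, frame: pd.DataFrame) -> "DummyEncoder":
        # Record the full set of columns produced from the training data.
        self.columns_ = pd.get_dummies(frame, dtype=float).columns
        return self

    def transform(self, frame: pd.DataFrame) -> pd.DataFrame:
        encoded = pd.get_dummies(frame, dtype=float)
        # Add training columns missing from this input as zeros, and drop
        # columns for categories that were never seen during training.
        return encoded.reindex(columns=self.columns_, fill_value=0.0)


training = pd.DataFrame({
    "floor_area": [120.0, 250.0, 90.0],
    "suburb": ["Suburb A", "Suburb B", "Suburb C"],
})
encoder = DummyEncoder().fit(training)

# A single row of user input containing only one of the known suburbs.
user_input = pd.DataFrame({"floor_area": [180.0], "suburb": ["Suburb B"]})
encoded_input = encoder.transform(user_input)
print(encoded_input)
```

scikit-learn's `OneHotEncoder` with `handle_unknown="ignore"` is an alternative that handles unseen categories in a similar way and can be saved together with the model.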
