Train a regression model to predict house prices from historical data. More details are described here.
For this project I chose a regression approach, specifically random forests.
EDA
- Used a pipeline to handle missing values, transform numerical and categorical features, engineer new features, and drop unwanted features.
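The preprocessing steps listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`LotArea`, `YearBuilt`, `YrSold`, `Neighborhood`, `Id`), the `AddHouseAge` transformer, and the toy data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


class AddHouseAge(BaseEstimator, TransformerMixin):
    """Hypothetical feature-engineering step: house age at time of sale."""

    def fit(self, X, y=None):
        # Nothing to learn; the transformer is stateless.
        return self

    def transform(self, X):
        X = X.copy()
        X["HouseAge"] = X["YrSold"] - X["YearBuilt"]
        return X


# Assumed column groups; a real dataset would have many more.
numeric_features = ["LotArea", "HouseAge"]
categorical_features = ["Neighborhood"]

numeric_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # fill missing numbers
    ("scale", StandardScaler()),
])
categorical_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),  # fill missing labels
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = Pipeline([
    ("add_age", AddHouseAge()),
    ("columns", ColumnTransformer(
        [("num", numeric_pipeline, numeric_features),
         ("cat", categorical_pipeline, categorical_features)],
        remainder="drop",  # columns not listed above (e.g. Id) are dropped
    )),
])

# Toy data with one missing numeric and one missing categorical value.
toy = pd.DataFrame({
    "LotArea": [8450.0, np.nan, 11250.0, 9550.0],
    "YearBuilt": [2003, 1976, 2001, 1915],
    "YrSold": [2008, 2007, 2008, 2006],
    "Neighborhood": pd.Series(["CollgCr", "CollgCr", np.nan, "Crawfor"], dtype=object),
    "Id": [1, 2, 3, 4],
})
features = preprocess.fit_transform(toy)
print(features.shape)
```

Keeping every step inside one `Pipeline` means the same imputation, scaling, and encoding learned on the training data are reapplied unchanged to new data.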
Model & learning curve
The most important thing I learned and applied in this project was how to use scikit-learn pipelines.
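As a rough sketch of how a random forest and a learning curve fit together with a pipeline, the example below uses synthetic data from `make_regression` in place of the housing data. The hyperparameters, the median imputation step, and the RMSE scoring choice are assumptions for illustration, not the settings used in this project.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import learning_curve
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the housing features and sale prices.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Preprocessing and model live in one estimator, so each cross-validation
# fold refits the imputer on its own training split only.
model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
])

train_sizes, train_scores, valid_scores = learning_curve(
    model, X, y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=5,
    scoring="neg_root_mean_squared_error",
)

# Scores are negated RMSE; flip the sign and average over the folds.
train_rmse = -train_scores.mean(axis=1)
valid_rmse = -valid_scores.mean(axis=1)
for n, tr, va in zip(train_sizes, train_rmse, valid_rmse):
    print(f"{int(n):4d} samples: train RMSE {tr:8.2f}, validation RMSE {va:8.2f}")
```

A large, persistent gap between the training and validation curves suggests overfitting, while two curves that plateau close together at a high error suggest the model is underfitting.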
- House Prices - all done via pipeline by Alexander Scarlat MD
- ML Data Pipelines with Custom Transformers in Python by Sam T
- End-to-end Machine Learning project on predicting housing prices using Regression by Gurupratap S Matharu
- Feature Transformation for Machine Learning, a Beginners Guide by Rebecca Vickery
- Strategies for working with discrete, categorical data by Dipanjan (DJ) Sarkar
- Feature Engineering with sklearn Pipelines by FilipK
- Transformation & Scaling of Numeric Features: Intuition by Manu Sharma
- Categorical Data by Dipanjan (DJ) Sarkar
- Normality Testing - Skewness and Kurtosis
- Python Inheritance
- Python Objects and Classes