Repo for the Kaggle "House Prices - Advanced Regression Techniques" competition

Problem definition

Train a regression model to predict house prices from historical data. More details are described here.

Overview & Workflow

For this project I decided to use regression models, specifically random forests.

  • EDA

    • Using an sklearn pipeline to handle missing values, transform numerical and categorical features, engineer new features, and drop unwanted features

  • Model & learning curve
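
The workflow above can be sketched roughly as follows. This is a minimal illustration, not the code from this repo: the data is synthetic, the column names (`floor_area`, `basement_area`, `quality`, `listing_id`) and the `TotalAreaAdder` transformer are hypothetical, and the imputation strategies and forest settings are assumptions chosen only to keep the example small.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import learning_curve
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


class TotalAreaAdder(BaseEstimator, TransformerMixin):
    """Hypothetical feature-engineering step: total_area = floor_area + basement_area."""

    def fit(self, X, y=None):
        return self  # stateless, nothing to learn

    def transform(self, X):
        X = X.copy()
        X["total_area"] = X["floor_area"] + X["basement_area"]
        return X


# Synthetic stand-in for the Kaggle data (all column names are made up).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "floor_area": rng.uniform(50, 250, n),
    "basement_area": rng.uniform(0, 100, n),
    "quality": rng.choice(["low", "mid", "high"], n),
    "listing_id": np.arange(n),  # unwanted feature, dropped by the pipeline
})
price = 1000 * df["floor_area"] + 500 * df["basement_area"] + rng.normal(0, 5000, n)

# Knock out some values so the imputers have work to do.
df.loc[df.index[::10], "floor_area"] = np.nan
df.loc[df.index[::15], "quality"] = np.nan

numeric_cols = ["floor_area", "basement_area", "total_area"]
categorical_cols = ["quality"]

preprocess = ColumnTransformer(
    transformers=[
        ("num", Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),  # not required by a forest, shown as a numeric transform
        ]), numeric_cols),
        ("cat", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ]), categorical_cols),
    ],
    remainder="drop",  # columns not listed above (listing_id) are dropped
)

model = Pipeline([
    ("engineer", TotalAreaAdder()),
    ("preprocess", preprocess),
    ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
])

# Learning curve: cross-validated scores at increasing training-set sizes.
train_sizes, train_scores, val_scores = learning_curve(
    model, df, price, cv=3, train_sizes=[0.3, 0.6, 1.0], scoring="r2",
)
mean_train = train_scores.mean(axis=1)
mean_val = val_scores.mean(axis=1)
for size, tr, va in zip(train_sizes, mean_train, mean_val):
    print(f"n={size:4d}  train R2={tr:.3f}  validation R2={va:.3f}")
```

Because every preprocessing step lives inside the pipeline, `learning_curve` refits the imputers, scaler, and encoder on each training fold, so the validation folds never influence the fitted transformations.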

Afterthoughts

The most important thing I learned and applied in this project is how to use an sklearn pipeline.

Resources

  1. House Prices - all done via pipeline by Alexander Scarlat MD
  2. ML Data Pipelines with Custom Transformers in Python by Sam T
  3. End-to-end Machine Learning project on predicting housing prices using Regression by Gurupratap S Matharu
  4. Feature Transformation for Machine Learning, a Beginners Guide by Rebecca Vickery
  5. Strategies for working with discrete, categorical data by Dipanjan (DJ) Sarkar
  6. Feature Engineering with sklearn Pipelines by FilipK
  7. Transformation & Scaling of Numeric Features: Intuition by Manu Sharma
  8. Categorical Data by Dipanjan (DJ) Sarkar
  9. Normality Testing - Skewness and Kurtosis
  10. Python Inheritance
  11. Python Objects and Classes