How to use Python, pandas, NumPy, and scikit-learn for basic data cleaning and preprocessing
This is the repo for the Towards Data Science article "The complete beginner's guide to data cleaning and preprocessing."
From the article:
Data preprocessing is the first (and arguably most important) step toward building a working machine learning model. It’s critical!
If your data hasn’t been cleaned and preprocessed, your model does not work.
It’s that simple.
Data preprocessing is generally thought of as the boring part. But it’s the difference between being prepared and being completely unprepared. It’s the difference between looking like a pro and looking pretty foolish.
It’s kind of like getting ready for a vacation. You might not like the preparation part, but tightening down the details in advance can save you from one nightmare of a trip.
You just have to do it or you can’t start having fun.
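
To make the "your model does not work" claim concrete, here is a minimal sketch rather than the article's own code. It assumes pandas, NumPy, and scikit-learn are installed. The toy DataFrame, its column names, and the choice of mean imputation are all made up for illustration. It shows a scikit-learn estimator refusing to fit on a column with a missing value, then fitting once the gap is filled.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: one missing value in the "age" column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 45.0],
    "salary": [40000.0, 50000.0, 60000.0, 80000.0],
})
X = df[["age"]]
y = df["salary"]

# Fitting on the raw data fails because the feature column contains NaN.
try:
    LinearRegression().fit(X, y)
    raw_fit_ok = True
except ValueError:
    raw_fit_ok = False

# One possible cleaning step: fill the gap with the column mean, then fit again.
imputer = SimpleImputer(strategy="mean")
X_clean = imputer.fit_transform(X)
model = LinearRegression().fit(X_clean, y)

print("Raw fit succeeded:", raw_fit_ok)
print("Imputed age values:", X_clean.ravel())
```

Mean imputation is only one option; dropping incomplete rows or using the median may suit a given dataset better.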