This is a tutorial explaining the concept of pipelines, with a practical implementation in Python and scikit-learn. I prepared it for my colleagues to clearly explain what pipelines are and what benefits come with using them in our real work projects.
Since I received positive feedback, I would like to share it with more people.
Originally this content was workshop material that I explained live, so I have tailored it to be self-study friendly.
Assumptions:
- basic knowledge of Python
- understanding of machine learning basics (e.g. the tutorial does not explain why we split data into train and test sets)