Chapter 02 reading notes: https://github.com/gaoxuesong/learning-spark-lightning-fast-big-data-analysis/blob/master/chapter02.pdf