A big data scenario for practising how to manage and optimize millions of records

Big data scenarios are not the exclusive domain of big companies. We can design a scenario ourselves, mock millions or even billions of records, and store them in a database.

Features

Super fast data generation: generate billions of records concurrently in Node.js cluster mode.
Full problem list, tracking in detail the problems we will face in a big data scenario.
Data models mocked from a live website: https://zhihu.com, a Quora-like question-and-answer product.
Friendly to developers of all levels: easy to set up, with full tutorials to help.
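
To give a feel for what concurrent generation in Node.js cluster mode can look like, here is a minimal sketch. It is not this project's actual generator: the `splitRange`, `mockQuestion`, `runPrimary` and `runWorker` helpers, the record fields, the `RANGE_START`/`RANGE_END` environment variables and the `--run` flag are all made up for illustration. The idea is that the primary process splits an ID range into one slice per CPU core, and each worker generates only its own slice.

```javascript
// Sketch only: helper names, record shape and the --run flag are assumptions.
const cluster = require('node:cluster');
const os = require('node:os');

// Split [0, total) into `parts` contiguous, near-equal ranges.
function splitRange(total, parts) {
  const ranges = [];
  const base = Math.floor(total / parts);
  const extra = total % parts; // the first `extra` ranges get one more item
  let start = 0;
  for (let i = 0; i < parts; i++) {
    const size = base + (i < extra ? 1 : 0);
    ranges.push({ start, end: start + size });
    start += size;
  }
  return ranges;
}

// Hypothetical record shape for a mocked question.
function mockQuestion(id) {
  return { id, title: `Question ${id}`, authorId: id % 1000 };
}

// Primary: fork one worker per CPU core and hand each its ID range.
function runPrimary(total) {
  const ranges = splitRange(total, os.cpus().length);
  for (const range of ranges) {
    cluster.fork({
      RANGE_START: String(range.start),
      RANGE_END: String(range.end),
    });
  }
  cluster.on('exit', (worker, code) => {
    console.log(`worker ${worker.process.pid} exited with code ${code}`);
  });
}

// Worker: generate only the records in its own range.
function runWorker() {
  const start = Number(process.env.RANGE_START);
  const end = Number(process.env.RANGE_END);
  let count = 0;
  for (let id = start; id < end; id++) {
    mockQuestion(id); // a real generator would batch-insert into the database here
    count++;
  }
  console.log(`worker ${process.pid} generated ${count} records`);
  process.exit(0);
}

// Only fork when explicitly asked, e.g. `node generate.js --run`.
if (process.argv.includes('--run')) {
  if (cluster.isPrimary) runPrimary(1000000);
  else runWorker();
}
```

Giving each worker a disjoint ID range means the workers never need to coordinate with each other, which is what lets generation scale with the number of cores.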

You should own a reasonably high-performance computer, which will speed up your practice. Here is my PC as an example:

Bigger data sets require more resources, especially a large amount of disk space once your practice involves backups, partitions, replicas and sharding.
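
A rough way to reason about disk needs before generating data is sketched below. The `estimateDiskBytes` helper and every number passed to it are assumptions for illustration, not measurements from this project. Real usage depends on the database engine, its indexes and compression.

```javascript
// Sketch only: a back-of-the-envelope disk estimate with hypothetical inputs.
// rows:          number of records to generate
// avgRowBytes:   assumed average size of one stored record
// indexOverhead: assumed extra space for indexes, as a fraction of data size
// copies:        total copies kept (primary + replicas + backups)
function estimateDiskBytes({ rows, avgRowBytes, indexOverhead, copies }) {
  return Math.ceil(rows * avgRowBytes * (1 + indexOverhead) * copies);
}

// Example with made-up inputs: 100 million records of about 200 bytes each,
// 50% index overhead, and three copies in total.
const bytes = estimateDiskBytes({
  rows: 100000000,
  avgRowBytes: 200,
  indexOverhead: 0.5,
  copies: 3,
});
console.log(`${(bytes / 1e9).toFixed(1)} GB`);
```

The point of the estimate is that replicas and backups multiply the raw data size, so disk fills up much faster than the record count alone suggests.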

Full tutorial to begin your practice