Both ScyllaDB and Apache Cassandra automatically distribute data across the nodes of a cluster using a randomized algorithm. ScyllaDB additionally distributes data within each node among its cores, using a static algorithm. Both distributions are susceptible to over-utilizing a node or a core; since a cluster generally runs at the speed of its slowest node, this has significant implications for throughput.
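To make the core-level distribution concrete, here is a minimal sketch of what a static token-to-shard mapping can look like. The function name, the sign-bit bias, and the exact formula are illustrative assumptions, not necessarily ScyllaDB's implementation:

#include <cstdint>

// Illustrative static token-to-shard mapping (an assumption, not ScyllaDB's
// exact formula). A signed 64-bit token is biased into the unsigned range
// [0, 2^64), the top `ignore_msb` bits are shifted out, and the remainder is
// mapped proportionally onto `shards` cores. Shifting out high bits makes a
// contiguous token range wrap around all shards many times instead of
// landing on one narrow band of them.
unsigned shard_of(int64_t token, unsigned shards, unsigned ignore_msb) {
    uint64_t pos = static_cast<uint64_t>(token) ^ (1ULL << 63); // bias to unsigned
    pos <<= ignore_msb;                                         // drop the ignored MSBs
    // 128-bit multiply (GCC/Clang extension) scales [0, 2^64) onto [0, shards).
    return static_cast<unsigned>((static_cast<unsigned __int128>(pos) * shards) >> 64);
}

With ignore_msb set to 0, the mapping sends each contiguous token range to a contiguous band of shards, which is exactly what invites core over-utilization.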
Shardsim is a program that simulates the node and core data distribution algorithms under various parameters (a rough sketch of such a simulation follows the parameter list below).
The parameters are:
--nodes - the number of nodes in the cluster. The simulation assumes RF=1 and does not model data centers.
--vnodes - the number of vnodes the database was configured with
--shards - the number of shards (logical cores) per node (ScyllaDB-specific)
--ignore-msb-bits - a ScyllaDB parameter that adjusts the sharding algorithm to reduce shard over-utilization (to be described in a future blog post)
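As promised above, here is a hypothetical sketch of the kind of Monte Carlo simulation shardsim could perform. It is not shardsim's actual source; the vnode ownership scheme, the random seed, and the shard formula are simplifying assumptions:

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <random>
#include <utility>
#include <vector>

// Same illustrative token-to-shard mapping as above, applied to an
// already-unsigned ring position in [0, 2^64).
unsigned shard_of(uint64_t pos, unsigned shards, unsigned ignore_msb) {
    pos <<= ignore_msb;
    return static_cast<unsigned>((static_cast<unsigned __int128>(pos) * shards) >> 64);
}

int main() {
    // Parameters corresponding to --nodes, --vnodes, --shards, --ignore-msb-bits.
    const unsigned nodes = 12, vnodes = 32, shards = 24, ignore_msb = 0;
    const unsigned samples = 10'000'000;

    std::mt19937_64 rng(42);

    // Each node picks `vnodes` random tokens; sort the ring by token.
    std::vector<std::pair<uint64_t, unsigned>> ring; // (token, owning node)
    for (unsigned n = 0; n < nodes; ++n)
        for (unsigned v = 0; v < vnodes; ++v)
            ring.emplace_back(rng(), n);
    std::sort(ring.begin(), ring.end());

    // Sample random keys; a key belongs to the vnode with the next token at
    // or after it (wrapping around the ring), and within that node to the
    // shard picked by the static mapping.
    std::vector<uint64_t> node_hits(nodes), shard_hits(nodes * shards);
    for (unsigned s = 0; s < samples; ++s) {
        uint64_t key = rng();
        auto it = std::lower_bound(ring.begin(), ring.end(),
                                   std::make_pair(key, 0u));
        if (it == ring.end()) it = ring.begin(); // wrap around the ring
        unsigned n = it->second;
        ++node_hits[n];
        ++shard_hits[n * shards + shard_of(key, shards, ignore_msb)];
    }

    // Overcommit = busiest bucket's load divided by the average load.
    auto overcommit = [](const std::vector<uint64_t>& hits) {
        uint64_t max = *std::max_element(hits.begin(), hits.end());
        uint64_t sum = 0;
        for (uint64_t h : hits) sum += h;
        return double(max) * hits.size() / double(sum);
    };
    std::printf("maximum node overcommit: %g\n", overcommit(node_hits));
    std::printf("maximum shard overcommit: %g\n", overcommit(shard_hits));
}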
shardsim requires a C++ compiler, cmake, and boost to be installed. To build and run it:
cmake .
make
./shardsim
Here is a 12-node cluster with 32 vnodes, 24 logical cores per node, and the old ScyllaDB sharding algorithm:
$ ./shardsim --nodes 12 --vnodes 32 --shards 24 --ignore-msb-bits 0
12 nodes, 32 vnodes, 24 shards
maximum node overcommit: 1.32249
maximum shard overcommit: 5.302944
Overcommit here is the ratio between the busiest node's (or shard's) share of the data and the average share. Some poor node is overcommitted by 32% over the average, and a single logical core is overcommitted by more than 5x! Let's fix it by using 256 vnodes and the new ScyllaDB sharding algorithm:
$ ./shardsim --nodes 12 --vnodes 256 --shards 24 --ignore-msb-bits 12
12 nodes, 256 vnodes, 24 shards
maximum node overcommit: 1.06922
maximum shard overcommit: 1.088612
Success! Node overcommit is just 7% over the average, while shard overcommit is just 9%.
Apache Cassandra® and Apache® are either trademarks or registered trademarks of the Apache Software Foundation in the United States and/or other countries.