DKV is a distributed key-value store server written in Go. It exposes all of its functionality over gRPC & Protocol Buffers.
- Data Sharding
- Tunable consistency
- Data replication over WANs
- createVBucket(replicationFactor)
- put(K,V,vBucket)
- del(K,vBucket)
- get(K,consistency)
- Go version 1.13+
- RocksDB v6.5.3 as a storage engine
- GoRocksDB provides the CGo bindings for RocksDB
- Badger v1.6 as a storage engine
- Nexus for sync replication over Raft consensus
Follow these instructions to launch a DKV container using the Dockerfile included.
$ curl -fsSL https://raw.githubusercontent.com/flipkart-incubator/dkv/master/Dockerfile | docker build -t dkv/dkv-deb9-amd64 -f - .
$ docker run -it dkv/dkv-deb9-amd64:latest dkvsrv --help
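For example, to launch a standalone server from this image (a sketch; the published port must match the `-listen-addr` flag described in the sections below):
$ docker run -p 8080:8080 dkv/dkv-deb9-amd64:latest \
    dkvsrv -db-folder /tmp/db -listen-addr 0.0.0.0:8080 -db-engine rocksdb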
DKV depends on RocksDB and its CGo bindings, so we need to install RocksDB along with its dependencies.
- Ensure HomeBrew is installed
$ brew install gcc49
$ brew install rocksdb zstd
$ mkdir -p ${GOPATH}/src/github.com/flipkart-incubator
$ cd ${GOPATH}/src/github.com/flipkart-incubator
$ git clone https://github.com/flipkart-incubator/dkv
$ cd dkv
$ make build
If you want to build for another platform, set the GOOS and GOARCH environment variables. For example, to build for Linux:
$ make GOOS=linux build
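GOARCH can be combined the same way; for instance, an amd64 macOS build (a sketch, assuming a suitable cross-compilation toolchain and target RocksDB libraries are available, since DKV uses CGo):
$ make GOOS=darwin GOARCH=amd64 build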
Once DKV is built, the <PROJECT_ROOT>/bin folder should contain the following binaries:
- `dkvsrv` - DKV server program
- `dkvctl` - DKV client program
A single DKV instance can be launched using the following command:
$ ./bin/dkvsrv \
-db-folder <folder_name> \
-listen-addr <host:port> \
-db-engine <rocksdb|badger>
Keys can then be set and fetched using the `dkvctl` client:
$ ./bin/dkvctl -dkvAddr <host:port> -set <key> <value>
$ ./bin/dkvctl -dkvAddr <host:port> -get <key>
Example session:
$ ./bin/dkvsrv -db-folder /tmp/db -listen-addr 127.0.0.1:8080 -db-engine rocksdb
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -set foo bar
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -get foo
bar
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -set hello world
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -get hello
world
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -del foo
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -iter "*"
hello => world
This launch configuration synchronously replicates changes to the DKV keyspace across multiple instances spread over independently failing regions or availability zones. Such configurations are typically deployed over WANs to ensure better read & write availability in the face of individual cluster failures and disasters.
Under the hood, we use Nexus to replicate keyspace mutations across multiple DKV instances using the Raft consensus protocol. Currently, the `put` API automatically replicates changes when the request is handled by a DKV instance started in the special distributed mode (see below). However, the `get` and `multiget` APIs targeting such an instance serve the data either from its own local store or from the current Raft leader, depending on the requested consistency level.
Assuming you have 3 availability zones, run the following command once in each zone to set up these instances for synchronous replication.
$ ./bin/dkvsrv \
-db-folder <folder_path> \
-listen-addr <host:port> \
-role master \
-nexus-node-url http://<host:port> \
-nexus-cluster-url <cluster_url>
All 3 DKV instances form a database cluster, each listening on separate ports for Nexus & client communications. The `-nexus-node-url` flag is optional when the instances run on separate nodes. The value for the `-nexus-cluster-url` flag in the above command can be constructed using the example setup below:
| NexusNodeId | Hostname | NexusPort |
|---|---|---|
| 1 | dkv.az1 | 9020 |
| 2 | dkv.az2 | 9020 |
| 3 | dkv.az3 | 9020 |
Then the value for `-nexus-cluster-url` must be:
"http://dkv.az1:9020,http://dkv.az2:9020,http://dkv.az3:9020"
Note that the same value must be used in each of the 3 commands used to launch the DKV cluster.
Subsequently, the `dkvctl` utility can be used to perform keyspace mutations against any one of the DKV instances; these mutations are then automatically replicated to the other 2 instances.
Example session on a local machine:
Launch Node 1:
$ ./bin/dkvsrv \
-db-folder /tmp/dkvsrv/n1 \
-listen-addr 127.0.0.1:9081 \
-role master \
-nexus-node-url http://127.0.0.1:9021 \
-nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023"
Launch Node 2:
$ ./bin/dkvsrv \
-db-folder /tmp/dkvsrv/n2 \
-listen-addr 127.0.0.1:9082 \
-role master \
-nexus-node-url http://127.0.0.1:9022 \
-nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023"
Launch Node 3:
$ ./bin/dkvsrv \
-db-folder /tmp/dkvsrv/n3 \
-listen-addr 127.0.0.1:9083 \
-role master \
-nexus-node-url http://127.0.0.1:9023 \
-nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023"
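At this point, a mutation issued against any one instance should become readable from the others once it replicates through Raft, for example:
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9081 -set foo bar
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9083 -get foo
bar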
Launch Node 4, not yet part of the cluster:
$ ./bin/dkvsrv \
-db-folder /tmp/dkvsrv/n4 \
-listen-addr 127.0.0.1:9084 \
-role master \
-nexus-node-url http://127.0.0.1:9024 \
-nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023" \
-nexusJoin
Add this node to the existing 3 node cluster:
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9081 -addNode "http://127.0.0.1:9024"
List members of the cluster from any member:
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9082 -listNodes
Connecting to DKV service at 127.0.0.1:9082...DONE
Current DKV cluster members:
6ab6bfdf9f29bd1a => http://127.0.0.1:9024 (leader)
619f1e32973e3e7a => http://127.0.0.1:9022
785bb0a54cd7f8d5 => http://127.0.0.1:9023
c63380dd0a493345 => http://127.0.0.1:9021
Remove a node from the cluster:
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9082 -removeNode http://127.0.0.1:9021
Connecting to DKV service at 127.0.0.1:9082...DONE
Confirm node removal from the cluster:
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9082 -listNodes
Connecting to DKV service at 127.0.0.1:9082...DONE
Current DKV cluster members:
6ab6bfdf9f29bd1a => http://127.0.0.1:9024 (leader)
619f1e32973e3e7a => http://127.0.0.1:9022
785bb0a54cd7f8d5 => http://127.0.0.1:9023
This launch configuration allows DKV instances to be started either as a master node or as a slave node. All mutations are permitted only on the master node, while one or more slave nodes asynchronously replicate the changes received from the master and make them available for reads. In other words, no keyspace mutations are permitted on the slave nodes, except via the replication stream received from the master node.
The built-in replication mechanism guarantees sequential consistency for reads executed from the slave nodes. Moreover, all slave nodes eventually converge to an identical state, a property often referred to as strong eventual consistency.
Such a configuration is typically deployed for read-heavy workloads, where the number of reads far exceeds the number of writes.
First launch the DKV master node using the RocksDB engine with this command:
$ ./bin/dkvsrv \
-db-folder <folder_name> \
-listen-addr <host:port> \
-role master
Then launch the DKV slave node using either the RocksDB or Badger engine with this command:
$ ./bin/dkvsrv \
-db-folder <folder_name> \
-listen-addr <host:port> \
-db-engine <rocksdb|badger> \
-role slave \
-repl-master-addr <dkv_master_listen_addr>
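For example, assuming a master listening on 127.0.0.1:8080 and a slave on 127.0.0.1:8081 (illustrative addresses), a write against the master becomes readable on the slave after the next replication poll:
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -set hello world
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8081 -get hello
world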
Subsequently, any mutations performed on the master node's keyspace using `dkvctl` are automatically applied to the slave node's keyspace. By default, a slave node polls its master for changes once every 5 seconds. This can be changed via the `replPollInterval` flag while launching the slave node.
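For instance, a slave could be launched with a faster 1-second poll like so (a sketch, assuming the flag spelling above and a Go-style duration value):
$ ./bin/dkvsrv \
    -db-folder <folder_name> \
    -listen-addr <host:port> \
    -db-engine badger \
    -role slave \
    -repl-master-addr <dkv_master_listen_addr> \
    -replPollInterval 1s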
Note that only the rocksdb engine is supported on the DKV master node, while slave nodes can be launched with either the rocksdb or badger storage engine.
For slave nodes using the Badger storage engine, we also support an in-memory mode in which the entire dataset is stored in RAM, without any writes to disk whatsoever. This mode is enabled with the `-diskless` option during launch, as shown here:
$ ./bin/dkvsrv \
-diskless \
-listen-addr <host:port> \
-db-engine badger \
-role slave \
-repl-master-addr <dkv_master_listen_addr>
This mode may provide better read performance and is also useful for cache-like deployments with relaxed durability requirements.
Detailed documentation on specific features, design principles, data guarantees, etc. can be found in the dkv Wiki.
To run the DKV test suite, use this command:
$ make test
To build distribution packages for Linux or macOS, run one of:
$ make GOOS=linux dist
$ make GOOS=darwin dist
DKV is under active development. Consider joining the dkv-interest Google group for updates, design discussions, roadmap, etc. during these initial stages of the project.