## Running dkv
Follow these instructions to launch a DKV container using the included Dockerfile:

```
$ curl -fsSL https://raw.githubusercontent.com/flipkart-incubator/dkv/master/Dockerfile | docker build -t dkv/dkv-deb9-amd64 -f - .
$ docker run -it dkv/dkv-deb9-amd64:latest dkvsrv --help
```
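The same image can run a standalone server. As a minimal sketch (the port mapping and flag values here are illustrative assumptions, not part of the Dockerfile), binding the listener to 0.0.0.0 makes the mapped port reachable from the host:

```
$ docker run -it -p 8080:8080 dkv/dkv-deb9-amd64:latest \
    dkvsrv -db-folder /tmp/db -listen-addr 0.0.0.0:8080 -db-engine rocksdb
```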
### Building from source

DKV has the following build dependencies:

- Go version 1.13+
- RocksDB v6.5.3 as a storage engine
- GoRocksDB for the CGo bindings to RocksDB
- Badger v1.6 as a storage engine
- Nexus for synchronous replication over Raft consensus
On macOS, the dependencies can be installed as follows:

- Ensure Homebrew is installed
- `brew install gcc49`
- `brew install zstd`
- Install RocksDB v6.5.3:

```
$ wget https://raw.githubusercontent.com/Homebrew/homebrew-core/08d9fffc81b18935fac33af0e69cb277eae93da3/Formula/rocksdb.rb
$ brew style --fix rocksdb.rb
$ brew install -s rocksdb.rb
```
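If the install succeeded, Homebrew should report the pinned RocksDB version:

```
$ brew list --versions rocksdb
```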
Once the dependencies are in place, fetch and build DKV:

```
$ mkdir -p ${GOPATH}/src/github.com/flipkart-incubator
$ cd ${GOPATH}/src/github.com/flipkart-incubator
$ git clone https://github.com/flipkart-incubator/dkv
$ cd dkv
$ make build
```
If you want to build for another platform, set the GOOS and GOARCH environment variables. For example, build for Linux as follows:

```
$ make GOOS=linux build
```
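A macOS target would instead set `GOOS=darwin` (a sketch following Go's platform naming). Note that DKV links RocksDB through CGo, so a cross build additionally requires a C/C++ toolchain for the target platform:

```
$ make GOOS=darwin GOARCH=amd64 build
```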
Once DKV is built, the `<PROJECT_ROOT>/bin` folder should contain the following binaries:

- `dkvsrv` - DKV server program
- `dkvctl` - DKV client program
- `dkvbench` - DKV benchmarking program
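A quick smoke test after the build is to print the server's usage text, the same check used in the Docker example above:

```
$ ./bin/dkvsrv --help
```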
### Launching the DKV server in standalone mode

A single DKV instance can be launched using the following command:
```
$ ./bin/dkvsrv \
    -db-folder <folder_name> \
    -listen-addr <host:port> \
    -db-engine <rocksdb|badger>
```

Its keyspace can then be accessed using the `dkvctl` client:

```
$ ./bin/dkvctl -dkvAddr <host:port> -set <key> <value>
$ ./bin/dkvctl -dkvAddr <host:port> -get <key>
```
Example session:

```
$ ./bin/dkvsrv -db-folder /tmp/db -listen-addr 127.0.0.1:8080 -db-engine rocksdb
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -set foo bar
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -get foo
bar
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -set hello world
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -get hello
world
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -del foo
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -iter "*"
hello => world
```
### Launching the DKV server for synchronous replication

This launch configuration synchronously replicates changes to the DKV keyspace across multiple instances spread over independently failing regions or availability zones. Such configurations are typically deployed over WANs so as to ensure better read and write availability in the face of individual cluster failures and disasters.

Under the hood, we use Nexus to replicate keyspace mutations across multiple DKV instances using the Raft consensus protocol. Currently, the `put` API automatically replicates changes when the request is handled by a DKV instance started in the special distributed mode (see below). However, the `get` and `multiget` APIs targeting such an instance serve data from its own local store. Such calls may therefore not reflect the latest changes to the keyspace, and hence are not linearizable. In the future, these APIs will be enhanced to support linearizability.
Assuming you have 3 availability zones, run the following command once in each zone in order to set up these instances for synchronous replication. The `-nexus-node-url` flag is optional when the instances run on separate nodes.

```
$ ./bin/dkvsrv \
    -db-folder <folder_path> \
    -listen-addr <host:port> \
    -role master \
    -nexus-node-url http://<host:port> \
    -nexus-cluster-url <cluster_url>
```
All these 3 DKV instances form a database cluster, each listening on separate ports for Nexus and client communications. One can now construct the value for the `-nexus-cluster-url` parameter in the above command using this example setup:

| NexusNodeId | Hostname | NexusPort |
|-------------|----------|-----------|
| 1           | dkv.az1  | 9020      |
| 2           | dkv.az2  | 9020      |
| 3           | dkv.az3  | 9020      |
Then the value for `-nexus-cluster-url` must be:

```
"http://dkv.az1:9020,http://dkv.az2:9020,http://dkv.az3:9020"
```

Note that the same value must be used in each of the 3 commands used to launch the DKV cluster.
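For instance, the launch command on `dkv.az1` in the setup above would look like the following; the database folder and client port are placeholders chosen for illustration:

```
$ ./bin/dkvsrv \
    -db-folder /tmp/dkvsrv/az1 \
    -listen-addr dkv.az1:8080 \
    -role master \
    -nexus-node-url http://dkv.az1:9020 \
    -nexus-cluster-url "http://dkv.az1:9020,http://dkv.az2:9020,http://dkv.az3:9020"
```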
Subsequently, the `dkvctl` utility can be used to perform keyspace mutations against any one of the DKV instances, which are then automatically replicated to the other 2 instances.
Example session on a local machine:

Launch Node 1:

```
$ ./bin/dkvsrv \
    -db-folder /tmp/dkvsrv/n1 \
    -listen-addr 127.0.0.1:9081 \
    -role master \
    -nexus-node-url http://127.0.0.1:9021 \
    -nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023"
```
Launch Node 2:

```
$ ./bin/dkvsrv \
    -db-folder /tmp/dkvsrv/n2 \
    -listen-addr 127.0.0.1:9082 \
    -role master \
    -nexus-node-url http://127.0.0.1:9022 \
    -nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023"
```
Launch Node 3:

```
$ ./bin/dkvsrv \
    -db-folder /tmp/dkvsrv/n3 \
    -listen-addr 127.0.0.1:9083 \
    -role master \
    -nexus-node-url http://127.0.0.1:9023 \
    -nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023"
```
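With all 3 nodes up, a mutation sent to any node should become visible on the others. A quick check (key and value are arbitrary), bearing in mind that reads are served from each node's local store and may briefly trail the latest write:

```
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9081 -set foo bar
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9082 -get foo
bar
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9083 -get foo
bar
```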
Add a fourth node to the above 3 node cluster:

```
$ ./bin/dkvsrv \
    -db-folder /tmp/dkvsrv/n4 \
    -listen-addr 127.0.0.1:9084 \
    -role master \
    -nexus-node-url http://127.0.0.1:9024 \
    -nexus-cluster-url "http://127.0.0.1:9021,http://127.0.0.1:9022,http://127.0.0.1:9023" \
    -nexusJoin
```
Then add this node to the existing 3 node cluster:

```
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9081 -addNode "http://127.0.0.1:9024"
```
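Once the join completes, the new node should serve the replicated keyspace as well, assuming the `foo` key from the earlier check:

```
$ ./bin/dkvctl -dkvAddr 127.0.0.1:9084 -get foo
bar
```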
### Launching the DKV server for asynchronous replication

This launch configuration allows DKV instances to be started either as a master node or as a slave node. All mutations are permitted only on the master node, while one or more slave nodes asynchronously replicate the changes received from the master and make them available for reads. In other words, no keyspace mutations are permitted on the slave nodes, except via the replication stream received from the master node.

The built-in replication mechanism guarantees sequential consistency for reads executed on the slave nodes. Moreover, all slave nodes will eventually converge to an identical state, which is often referred to as strong eventual consistency.

Such a configuration is typically deployed for applications where reads far outnumber writes.
First, launch the DKV master node using the RocksDB engine with this command:

```
$ ./bin/dkvsrv \
    -db-folder <folder_name> \
    -listen-addr <host:port> \
    -role master
```
Then launch the DKV slave node using either the RocksDB or the Badger engine with this command:

```
$ ./bin/dkvsrv \
    -db-folder <folder_name> \
    -listen-addr <host:port> \
    -db-engine <rocksdb|badger> \
    -role slave \
    -repl-master-addr <dkv_master_listen_addr>
```
Subsequently, any mutations performed on the master node's keyspace using `dkvctl` will be applied automatically onto the slave node's keyspace. By default, a given slave node polls its master node for changes once every 5 seconds. This can be changed through the `replPollInterval` flag while launching the slave node.

Note that only the rocksdb engine is supported on the DKV master node, while the slave node can be launched with either the rocksdb or the badger storage engine.
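Putting the two launch commands together, a minimal local master/slave session might look like this (each dkvsrv runs in its own terminal; addresses and folders are placeholders, and the sleep merely waits out the default 5 second poll):

```
$ ./bin/dkvsrv -db-folder /tmp/dkvsrv/master -listen-addr 127.0.0.1:8080 -role master
$ ./bin/dkvsrv -db-folder /tmp/dkvsrv/slave -listen-addr 127.0.0.1:8081 \
    -db-engine badger -role slave -repl-master-addr 127.0.0.1:8080
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8080 -set foo bar
$ sleep 5   # allow at least one replication poll to elapse
$ ./bin/dkvctl -dkvAddr 127.0.0.1:8081 -get foo
bar
```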
For slave nodes using the Badger storage engine, we also support an in-memory mode where the entire dataset is stored in RAM without any writes to disk whatsoever. This can be achieved by using the `-diskless` option during launch, as shown here:
```
$ ./bin/dkvsrv \
    -diskless \
    -listen-addr <host:port> \
    -db-engine badger \
    -role slave \
    -repl-master-addr <dkv_master_listen_addr>
```
This mode may provide better read performance and is also useful for cache-like deployments with relaxed durability requirements.
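The bundled `dkvbench` program can help quantify the difference for a given workload; its supported options can be listed with:

```
$ ./bin/dkvbench --help
```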