-
I just started experimenting with Openraft and first off, it's a really nice project, thanks a lot for this! Currently, I am investigating and implementing an embeddable SQLite that replicates itself via openraft, and so far everything is working very nicely. But I came across unexpectedly low throughput when doing my first, very simple benchmarks under conditions that could be considered production-ready (real network, persistence to disk, ...). The implementation achieves very poor write performance, and after a lot of simplifying and reducing variables that might impact the result, I ended up measuring the latency of each call my client implementation makes. All operations, even inserts into the database or modifying the Raft logs on disk, are done in < 10 ms with my test values. Most of the network requests show a latency of 0-10 ms, even when many logs are inserted at the same time, but from time to time the latency jumps up to 40 ms. So, either I have screwed up badly somewhere and got something totally wrong, or is it possible that something is blocking internally? Does anyone have an idea what the reason could be? Unfortunately, the project is in a very ugly state right now since I am just evaluating and playing around with different network implementations, so it's not public yet.

Edit: It seems that the increasing latency over time is a problem with my KV store. I am using redb here. I can probably fix that, and it has nothing to do with openraft. However, I still cannot get rid of the 40 ms spikes.

Edit 2: I just noticed the `#[allow(clippy::blocks_in_conditions)]` on `impl RaftNetwork<TypeConfig> for NetworkConnection` in an example. Can this have something to do with it, or is this an old annotation?
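For reference, a minimal sketch of the kind of latency probe described above (not the author's code; the wrapped call, the label, and the 20 ms threshold are placeholders), using `std::time::Instant` around an arbitrary async call so spikes can be attributed to one layer at a time:

```rust
use std::future::Future;
use std::time::Instant;

/// Wrap any async call (e.g. the per-write request the client sends)
/// and log outliers above a chosen threshold.
async fn timed<F, T>(label: &str, fut: F) -> T
where
    F: Future<Output = T>,
{
    let start = Instant::now();
    let out = fut.await;
    let elapsed = start.elapsed();
    if elapsed.as_millis() >= 20 {
        eprintln!("{label}: took {elapsed:?}");
    }
    out
}

#[tokio::main]
async fn main() {
    // Stand-in for a real write request.
    let _ = timed("write", async {
        tokio::time::sleep(std::time::Duration::from_millis(25)).await;
        42
    })
    .await;
}
```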
-
Have you tested the performance when using only the mem store? E.g.: https://github.com/datafuselabs/openraft/tree/main/examples/raft-kv-memstore-network-v2
-
Ahh okay, got it. Is the …

The fix will be pretty easy; I can basically just re-use my approach with … Regarding the performance issues I had with …: when you take a look at Facebook's documentation for rocksdb, they mention that it's crash-safe with default settings. However, the Rust wrapper's default for the WAL sync is set to `false`. When I wait for the sync with rocksdb, it is actually even ~40% slower during batch insertions in my tests compared to redb. Both of them give me very low throughput for the Raft when I wait for the sync to disk each time, but this at least explains the huge difference I got in the beginning.

To make it crash-safe, I am syncing the WAL with rocksdb like this:

```rust
let mut opts = WriteOptions::default();
opts.set_sync(true);
self.db
    .write_opt(batch, &opts)
    .map_err(|err| StorageIOError::write_logs(&err))?;
```

I then end up at ~780 puts/s with rocksdb (in this case with 16 concurrent writers) compared to ~15k with the same settings when sync is off (the default). This brings me to another question (sorry ^^): the likelihood of a full crash is rather low, and to make it safe, I would need to sacrifice a huge amount of throughput.
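One possible middle ground between per-write fsync and no sync at all, sketched here under the assumption that the `rocksdb` crate's `flush_wal` is used (this is not the author's code, and the grouping strategy is a placeholder): write batches without per-write sync and issue a single explicit WAL sync for a whole group of appended entries, so one fsync is amortized over many writes.

```rust
use rocksdb::{WriteBatch, WriteOptions, DB};

/// Sketch: append several batches without per-write fsync, then sync the WAL
/// once for the whole group so a single fsync covers many log entries.
fn append_logs_grouped(db: &DB, batches: Vec<WriteBatch>) -> Result<(), rocksdb::Error> {
    let mut opts = WriteOptions::default();
    opts.set_sync(false); // no fsync per batch

    for batch in batches {
        db.write_opt(batch, &opts)?;
    }

    // One explicit, synchronous WAL flush for everything written above.
    db.flush_wal(true)
}
```

With this approach, durability is only guaranteed up to the last synchronous WAL flush, so the throughput gain depends on how many entries can be grouped per sync and how much loss window is acceptable.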
-
Just FYI, if you are interested in it: I open-sourced a very first, not-yet-ready version of the project, thanks to …
I am using `openraft-0.9.13`. I tested on 2 different machines, both Linux:

- `5.14.0-427.24.1.el9_4.x86_64`
- `6.9.8-200.fc40.x86_64`

I added more `Instant` checks in a few places for additional debugging. I was using `reqwest` with connection pooling before, which was a huge improvement over single HTTP calls from my very first testing to really understand what openraft is doing and how. In the end, the spikes were actually coming from `reqwest`, most probably from an internal lock for the connection pool, because it never happened without sending heavy loads. I ended up writing a lower-level WebSocket impl with the `fastwebsockets` crate for the Raft i…
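For context, a minimal sketch of the reused, pooled `reqwest::Client` pattern mentioned above (not the author's implementation; the endpoint, payload, and idle timeout are placeholders):

```rust
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    // Build one client up front and reuse it for every RPC so connections
    // are pooled, instead of creating a new client (and connection) per call.
    let client = reqwest::Client::builder()
        .pool_idle_timeout(Duration::from_secs(30))
        .build()?;

    for i in 0..3 {
        let resp = client
            .post("http://127.0.0.1:8080/raft/append") // placeholder endpoint
            .body(format!("payload {i}"))
            .send()
            .await?;
        println!("status: {}", resp.status());
    }
    Ok(())
}
```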