Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add raft example #153

Open
wants to merge 33 commits into
base: main
Choose a base branch
from
Open

Conversation

Elbehery
Copy link
Member

@Elbehery Elbehery commented Feb 5, 2024

This PR add standalone raft example

resolves #2

This PR supersedes #67

The refactor commits are cherrypicked from https://github.com/Elbehery/raft/pull/1/commits

@Elbehery
Copy link
Member Author

Elbehery commented Feb 5, 2024

cc @mhagger @ahrtr

entries []raftpb.Entry
}

// MinimalEtcdVersion returns minimal etcd able to interpret entries from WAL log,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't review this in detail, but a couple of high-level comments:

  • A raft example should probably have no knowledge about etcd.
  • Please contain the example in a Go module, so that it's not part of the raft package module, and does not get imported by users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it is will WIP .. i am working on it now .. i cherrypick the commit from https://github.com/Elbehery/raft/pull/1/commits as suggested by @ahrtr in #2 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding the go module now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pav-kv added the go mod here 9e38675

@Elbehery Elbehery changed the title Add raft example WIP: Add raft example Feb 5, 2024
// See the License for the specific language governing permissions and
// limitations under the License.

package verify
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verify package is used by etcd to inject additional runtime validation during tests. Basically etcd specific asserts.

Don't think you will need it for raftexample. You can just remove it and all it's references.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// See the License for the specific language governing permissions and
// limitations under the License.

package wal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version file should not be used as there is no need of WAL versioning for raftexample.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Elbehery
Copy link
Member Author

Elbehery commented Feb 5, 2024

@serathius @pav-kv @ahrtr

i have added the wal and its deps from etcd repo, i know that this should be standalone without any deps.

however, the raft example requires wal. I need suggestions how to proceed

same issue exist with rafthttp.Transport.

@pav-kv
Copy link
Contributor

pav-kv commented Feb 5, 2024

@Elbehery Do you intend the example to be a fully-functional server with a fully-functional storage, and run each node in a different process?

My sense is that, for example purposes, you can cut corners, and make Storage and transport simple and in-memory. For example, you can use MemoryStorage from this repo. For implementing transport between nodes, you can take some inspiration from tests. All nodes can be in the same process, and transport can shuffle messages from one in-memory node to another.

I.e. things like wal are probably not needed.

Also, what kind of state-machine/commands do you want to support? I think it can also be super-simple, like a single-value register with a put and/or conditional-put command. Doesn't need to be a fully-fledged key-value store.

I think a much smaller example can be started up from taking code snippets in the README, and putting them together to work. It can be done gradually, for instance you don't need to support config changes in the first version.

@Elbehery
Copy link
Member Author

Elbehery commented Feb 6, 2024

@Elbehery Do you intend the example to be a fully-functional server with a fully-functional storage, and run each node in a different process?

My sense is that, for example purposes, you can cut corners, and make Storage and transport simple and in-memory. For example, you can use MemoryStorage from this repo. For implementing transport between nodes, you can take some inspiration from tests. All nodes can be in the same process, and transport can shuffle messages from one in-memory node to another.

Thanks @pav-kv for your comment

I actually also agree to keep it simple since it is an example. However, I think the example should at least have each raft node in a separate process iiuc

wdyt @ahrtr @serathius ?

This is taken straight from

    github.com/etcd-io/etcd:client/pkg/fileutil/lock_flock.go

Signed-off-by: Michael Haggerty <[email protected]>
Signed-off-by: Michael Haggerty <[email protected]>
This interface defines what `kvstore` needs for saving and loading
snapshots.

Signed-off-by: Michael Haggerty <[email protected]>
This interface defines what `httpKVAPI` needs as the backing key-value
store.

Signed-off-by: Michael Haggerty <[email protected]>
Initialize the snapshot storage in `newRaftNode()` rather than in
`startRaft()`. This is a step towards getting rid of the
`snapshotStorageReady` channel.

Signed-off-by: Michael Haggerty <[email protected]>
This is another step towards getting rid of `snapshotStorageReady`.

Signed-off-by: Michael Haggerty <[email protected]>
Rename `newRaftNode()` to `startRaftNode()`, and change it to replay
the WAL synchronously, before returning. This doesn't change anything,
because the callers were all waiting for `snapshotStorageReady` before
proceeding anyway.

Signed-off-by: Michael Haggerty <[email protected]>
Use a local variable (in `startRaftNode()`) instead.

Signed-off-by: Michael Haggerty <[email protected]>
This is more consistent with the types used elsewhere for IDs.

Signed-off-by: Michael Haggerty <[email protected]>
This is the sort of thing that the caller should be able to inject,
and it's not part of the raft algorithm itself. This change makes it
clearer what a user of the raft algorithm must supply to it.

Signed-off-by: Michael Haggerty <[email protected]>
Extract two new methods, for clarity and to reduce code duplication.

Signed-off-by: Michael Haggerty <[email protected]>
Change `newKVStore()` to set up the initial state, but not start the
`readCommits()` goroutine anymore. Instead, change the callers to call
`processCommits()` (a renamed version of `readCommits()`) themselves
(in a goroutine).

The main benefit of this change is that the `kvstore` can be
instantiated before calling `startRaftNode()`, meaning that
`startRaftNode()` can be passed `kvs.getSnapshot` as an argument.
Previously, it had to be passed an anonymous function that refers to a
variable (`kvs`) that has not yet been initialized. This asynchrony
was confusing and unnecessary.

Signed-off-by: Michael Haggerty <[email protected]>
It wasn't doing anything useful.

Signed-off-by: Michael Haggerty <[email protected]>
Add a new interface, `FSM`, which represents a finite state machine
that can be driven by raft.

Also add `kvfsm`, which is an `FSM` view of a `kvstore`. Treating
`kvstore` and `kvfsm` as separate types means that the public
interface of `kvstore` is not contaminated with internal methods that
are only there to support raft and should not be invoked by the user.

Signed-off-by: Michael Haggerty <[email protected]>
Make `cluster` hold an array of `peer`s, rather than one array for
each peer attribute.

Continue storing the names separately, since this `[]string` has to be
passed to `startRaftNode()`.

Signed-off-by: Michael Haggerty <[email protected]>
The tests can be made cleverer if each `peer` in a `cluster` can be
instantiated with an arbitrary finite state machine. We'll use this to
make some tests able to operate more like normal clients of
`raftNode`s.

Signed-off-by: Michael Haggerty <[email protected]>
mhagger and others added 14 commits February 21, 2024 14:16
Handle such errors at the caller, instead.

Signed-off-by: Michael Haggerty <[email protected]>
This is a step towards moving `ProcessCommits()` out of this interface
and into `raftNode`.

Signed-off-by: Michael Haggerty <[email protected]>
Instead, make `LoadAndApplySnapshot()` part of the `FSM` interface,
and have the callers of `newKVStore()` invoke that method after
calling `newKVStore()`. This is a step towards moving this
functionality entirely out of `FSM`.

Signed-off-by: Michael Haggerty <[email protected]>
Now that the `FSM` interface is more capable, there is no reason that
this method, which has more to do with raft anyway, can't be
implemented in `raftNode`. This makes it easier to implement new FSMs
without having to reimplement this method.

This means that `newRaftNode()` has to return a pointer to the
`raftNode`, so make that change, too. This change will make other
future changes simpler, too.

Signed-off-by: Michael Haggerty <[email protected]>
Make this a standard part of starting up a node.

Signed-off-by: Michael Haggerty <[email protected]>
This is the last use of `kvstore.snapshotStorage`, so remove that
field as well.

Signed-off-by: Michael Haggerty <[email protected]>
The old method, monitoring the `errorC` channel, is not great because
the error only pops out of that channel once. Which of the pieces of
code that are waiting on that channel reads the error? Nobody knows!

Instead, provide a `Done()` method and an `Err()` method that
interested parties can use to determine when the node finishes and
what error it returned.

The callers haven't been changed yet.

Signed-off-by: Michael Haggerty <[email protected]>
Use the `done` channel instead of the `errorC` channel for monitoring
for the completion of the raft node. If the channel is closed, don't
log the error, since the caller of `ProcessCommits()` is already
taking care of that.

Since this means that we're not killing the server by aborting the
program, we now have to call `srv.Close()` explicitly.

Signed-off-by: Michael Haggerty <[email protected]>
Instead of driving the channels within the test, user a cleverer FSM
and call `ProcessCommits()` to manage the channels.

The old version of the test only looked at the first data item in each
`commit`, even though multiple data items can be sent in a single
`commit`. It also only looked at the first 100 commits, even though
there is a multiplication factor because each node initiates 100
proposals. The new version checks all data items, and checks that the
total number of commits is as expected (namely, 100 for each node, or
300 total).

Signed-off-by: Michael Haggerty <[email protected]>
Keep `commitC` and `errorC` internal to `raftNode`: don't return them
from `newRaftNode()`, and don't require them as arguments to
`(*raftNode).ProcessCommits()`.

(Some tests still need them, but they can access the members directly
on the `raftNode` instance.)

Signed-off-by: Michael Haggerty <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a self-contained raft example
5 participants