Conversation
Currently the lifecycle of a document is a little tricky to follow. We
have an xstate state machine in `DocHandle`, but the events which modify
that state machine are emitted by various asynchronous processes which
have a reference to the document scattered throughout the system. This
makes it hard to answer questions like "why is this document
unavailable" because we don't have a single place in the codebase where
we have all the information used to make that decision at the same time.
This commit is an attempt to reorganise state management to make these
kinds of questions easier to answer. Roughly speaking it looks like
this:
* Replace the `CollectionSynchronizer` and `DocSynchronizer` classes
with a single `DocumentPhasor` class, which manages the lifecycle
of a single document.
* Move all the state logic out of `DocHandle`. `DocHandle`s are now
just a place for users to register interest in a document and get
notified of changes. All actual state changes are routed through
the `DocumentPhasor`
The "DocumentPhasor" state machine manages a set of `Phase`s, each of
which represents different stages of the document lifecycle ("loading",
"requesting", "ready", "unavailable"). Modifications to the phasor are
achieved by passing events to the `DocumentPhasor.handleEvent` method,
which returns an object describing what has changed.
msakrejda
left a comment
There was a problem hiding this comment.
I didn't have time to look at the full thing this morning, but I made it through most of DocumentPhasor. In general, I really like this approach: I think it achieves the goals you laid out in the overview. I left some nitpicks and a bunch of questions, and I think I need to read it over another couple of times to fully understand it, but this is a great direction.
| unload() { | ||
| this.#machine.send({ type: UNLOAD }) | ||
| } | ||
| unload() {} | ||
|
|
||
| /** Called by the repo to reuse an unloaded handle. */ | ||
| reload() { | ||
| this.#machine.send({ type: RELOAD }) | ||
| } | ||
| reload() {} |
There was a problem hiding this comment.
If these are no-ops now, should they be deprecated?
|
|
||
| /** Called by the repo when the document is deleted. */ | ||
| /** Called by the repo when the document is deleted. | ||
| * @deprecated Use Repo#delete instead |
There was a problem hiding this comment.
Should we be deprecating this? Since you can update a document via its handle, it seems reasonable to want to delete it via its handle, too, no? No strong feelings, just wondering.
| // | ||
| // ### Connections | ||
| // | ||
| // Each `DocumentPhasor` manages it's own set of connected peers. The caller |
There was a problem hiding this comment.
| // Each `DocumentPhasor` manages it's own set of connected peers. The caller | |
| // Each `DocumentPhasor` manages its own set of connected peers. The caller |
| // sync state and a share policy before it can be used to send messages (or not, | ||
| // as the case maybe). Callers should provide these via the |
There was a problem hiding this comment.
... (or not, as the case maybe).
What does this mean?
| // Phase transitions always happen after applying the new event and as a result of the | ||
| // `Phase.transition` method, which makes it easy to understand why transitions occur. |
There was a problem hiding this comment.
I find the wording here confusing. Is this equivalent to
| // Phase transitions always happen after applying the new event and as a result of the | |
| // `Phase.transition` method, which makes it easy to understand why transitions occur. | |
| // Phase transitions always happen as a result of the `Phase.transition` method, which may | |
| // happen after applying a new event. This makes it easy to understand why transitions occur. |
? And localChange never generates a phase transition?
| this.#outboundEphemeralMessages.push(event.message) | ||
| break | ||
| } | ||
| case "reload": { |
There was a problem hiding this comment.
This event does not affect what peers we know about, right?
| } | ||
| break | ||
| } | ||
| case "peer_removed": |
There was a problem hiding this comment.
On peer_added, we tell our current phase about the peer. Do we need a corresponding relay to the current phase here?
| const exhaustivenessCheck: never = event | ||
| throw new Error(`Unhandled event type: ${exhaustivenessCheck}`) |
There was a problem hiding this comment.
Do we want event or event.type here?
| let stateChange = null | ||
| if (before.phase?.name !== this.#currentPhase.name) { | ||
| stateChange = { | ||
| before: before.phase?.name || "loading", // TODO: introduce a "starting" phase? |
There was a problem hiding this comment.
No strong feelings on a "starting" phase yet, but I'm a fan of sentinel values like that in general...
| break | ||
| default: | ||
| const exhaustiveCheck: never = transition | ||
| throw new Error(`Unhandled transition: ${exhaustiveCheck}`) |
There was a problem hiding this comment.
Similar to above, should this be transition.to?
msakrejda
left a comment
There was a problem hiding this comment.
I took another look at this, focusing on the phases. I still think it broadly makes sense, though the interface between DocumentPhasor and the phases seems pretty intricate. Maybe that's necessary complexity, but I wonder if we can avoid some of it. E.g., loadRunning seems like a quirk.
I also feel like I'm not quite clear on the sync protocol: are there docs on that?
| case "not_requesting_due_to_sharepolicy": | ||
| if (shouldShare) { | ||
| // This would happen if the sharePolicy changed since the last time we checked | ||
| // somehow |
There was a problem hiding this comment.
Does this mean that sharePolicyChanged above should not be empty? Or is that not relevant here?
| state: "awaiting_send", | ||
| }) | ||
| return this.generateMessage({ | ||
| docId: this.#documentId, |
There was a problem hiding this comment.
this.generateMessage doesn't take a docId?
| // | ||
| // Initially a phasor is in a "network not ready" state. This means that it will not mark | ||
| // documents as unavailable until the "network_ready" event has been received. Callers | ||
| // should dispatch this event as soon as the network subsystem says it is ready. |
There was a problem hiding this comment.
I don't think this refactor should change the state machine, but in the long term, it would be nice to support the network going in and out of the "ready" state. Have you considered how that fits in here?
Currently the lifecycle of a document is a little tricky to follow. We have an xstate state machine in
DocHandle, but the events which modify that state machine are emitted by various asynchronous processes which have a reference to the document scattered throughout the system. This makes it hard to answer questions like "why is this document unavailable" because we don't have a single place in the codebase where we have all the information used to make that decision at the same time.This commit is an attempt to reorganise state management to make these kinds of questions easier to answer. Roughly speaking it looks like this:
CollectionSynchronizerandDocSynchronizerclasses with a singleDocumentPhasorclass, which manages the lifecycle of a single document.DocHandle.DocHandles are now just a place for users to register interest in a document and get notified of changes. All actual state changes are routed through theDocumentPhasorThe "DocumentPhasor" state machine manages a set of
Phases, each of which represents different stages of the document lifecycle ("loading", "requesting", "ready", "unavailable"). Modifications to the phasor are achieved by passing events to theDocumentPhasor.handleEventmethod, which returns an object describing what has changed.This PR is still in-progress because
a) I think the state machine logic is pretty verbose and can probably be made more succinct
b) I need to write up a few things which are not covered by tests relating to the metrics we use for monitoring the sync server
c) I need to go through in detail looking for things which are not covered by tests in the original implementation as this is a large change.