Skip to content

Conversation

@msakrejda
Copy link
Collaborator

This aims to move the awareness/presence functionality out of the
useLocalAwareness/useRemoteAwareness React hooks and make it available
as a core part of automerge-repo. The hooks can then be rewritten to
be simple wrappers around the lower-level code, and users of other
frameworks (or no frameworks at all) can benefit as well.

This approach introduces an Awareness EventEmitter class that adds
some structure around ephemeral-message events on a doc handle. It
also allows checking state by peerId or by caller-supplied userId.

It does not attempt to address the existing issue that userIds can be
spoofed.

@msakrejda msakrejda requested review from alexjg, chee and pvh November 15, 2025 01:15
@msakrejda
Copy link
Collaborator Author

This is still a draft, but I'd love some feedback on:

  • is this approach sensible overall?
  • is this a good building block for support in other frameworks (e.g. automerge-repo-solid-primitives)?

@OrionReed
Copy link
Contributor

OrionReed commented Nov 15, 2025

This is good. In fact, this afternoon I spent a couple hours implementing almost the exact same design with the same structure, heartbeat system, and events just with slightly different names. So I take that as a good sign.

The only difference with the thing I tried is that I added a 'direct message' system which orders peer ID pairs, hashes them, then bs58check encodes them to get a unique message channel between 2 peers (as handshakes are pairwise and a non-trivial amount of data, sending these to everyone felt like something worth avoiding)

I wonder what your thoughts are there. But as for the overall API design I like it. I'll pull this into my experiment and see how it feels. I think for React this is good, @chee can speak more to solid-js.

I'm also personally in favor of 'presence' over 'awareness' and I think it is a bit more widely used.

@msakrejda
Copy link
Collaborator Author

@OrionReed excellent, thanks for the feedback, and glad we've arrived at more or less the same design. No preference on "presence" versus "awareness" as terminology, but it probably makes sense to go with common usage.

The DM idea also makes sense. Is that still going through handle ephemeral messages? That still broadcasts it to all peers and they just ignore everything? DocSynchronizer doesn't seem to have a public way to send a message to a single peer. Or should we add that to support this use case?

@OrionReed
Copy link
Contributor

@msakrejda my approach was to make a new doc that only 2 peers share (a little hacky, making a new automerge URL based on the pair of peer IDs)

I think a better approach probably would be to use the same ephemeral messaging as the rest of the system, and leave more direct connections to a future performance optimisation (Keyhive is also a consideration here, so feels reasonable to punt)

I'm using your system right now, happy to chat tomorrow or Monday and let you know how it went and anything I learn along the way.

@OrionReed
Copy link
Contributor

  • Seems like dispose() should broadcast a goodbye?
  • Should there be a heartbeat timeout to remove peers if they've been gone a long time? I could see the case for leaving them forever if certain apps want that. In my current experiment I want a mostly 'live' peer list. perhaps this should be configurable?

@pvh
Copy link
Member

pvh commented Nov 16, 2025

@msakrejda my approach was to make a new doc that only 2 peers share (a little hacky, making a new automerge URL based on the pair of peer IDs)

I think a better approach probably would be to use the same ephemeral messaging as the rest of the system, and leave more direct connections to a future performance optimisation (Keyhive is also a consideration here, so feels reasonable to punt)

Just saying that while I think this is a worthwhile problem to talk about / work on, but I don't think we should try to solve that problem at the library level as part of this patch. I'd say that user-land solutions are generally preferable to library features as long as they have the right performance / reliability characteristics and don't involve jumping through too many hoops.

@OrionReed
Copy link
Contributor

I think its important to have some easy way to broadcast a goodbye when pages close. As folks will likely want control over behaviour for unmounting, changing tabs, and closing windows.

Usually this is done with events like visibilitychange and pagehide among others. Not entirely sure what this should look like yet but I think there should be a good answer before shipping.

@msakrejda
Copy link
Collaborator Author

Just saying that while I think this is a worthwhile problem to talk about / work on, but I don't think we should try to solve that problem at the library level as part of this patch.

Sure, makes sense.

I'd say that user-land solutions are generally preferable to library features as long as they have the right performance / reliability characteristics and don't involve jumping through too many hoops.

I guess it's "jumping through too many hoops" that's at issue here, but I think if we start with the core features, we can layer that on later without too much damage to the basic API.

@msakrejda
Copy link
Collaborator Author

  • Seems like dispose() should broadcast a goodbye?

Yeah, good point. Will update.

  • Should there be a heartbeat timeout to remove peers if they've been gone a long time?

We may want to leave that as a higher-level concern, but I could also see taking an idle_timeout option to prune these. I'll explore that.

@msakrejda
Copy link
Collaborator Author

I think its important to have some easy way to broadcast a goodbye when pages close. As folks will likely want control over behaviour for unmounting, changing tabs, and closing windows.

Usually this is done with events like visibilitychange and pagehide among others. Not entirely sure what this should look like yet but I think there should be a good answer before shipping.

Do you mean something automagic? As annoying as it would be, that feels like an application-level concern to me: the app should register for the relevant events and call dispose(). No?

@OrionReed
Copy link
Contributor

Do you mean something automagic? As annoying as it would be, that feels like an application-level concern to me.

Agreed for sure, I don't think anything needs to be done in this primitive, more of a pin to make sure we support this need.

We may want to leave that as a higher-level concern, but I could also see taking an idle_timeout option to prune these. I'll explore that.

I could see either being good. If the former it should be easy to broadcast a goodbye without needing to know the event structure, perhaps just a .goodbye() though I'm quite open to an idle_timeout, as a handy utility that many people will want which is still opt-in

@OrionReed
Copy link
Contributor

My takes after a few days of use:

  • call it 'Presence' as a slightly more idiomatic name.
  • consider private elements instead of TS private
  • add some docstrings (including mention of events used to say goodbye when closing the page on the .dispose() method)
  • add the missing goodbye broadcast on dispose
  • personally, I think an idle_timeout is worthwhile (and I think will save at least a few people from doing it wrong in React). I'm not sure if the default should be a timeout (say, 3x the heartbeat rate) or null, but I lean towards a default timeout and let people disable it if they really want to. I think that's what users will expect.

Aside from that this all looks nice!

Copy link
Collaborator Author

@msakrejda msakrejda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I need to flesh out the tests and doc strings a bit, but this is kind of what I was thinking. I think I addressed all feedback. Thoughts?

useLocalAwareness,
useRemoteAwareness,
} from "@automerge/react"
import { useDocument, usePresence } from "@automerge/react"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I combined the two hooks into one, since they need to communicate (local awareness needs to broadcast when a new peer is seen). The current code does that with a shared global, but that has a bunch of shortcomings: e.g., it could lead to unnecessary messages if working with presence on two unrelated handles (or even repos). We could use React Context instead, but I don't think that's better.

const rootDocUrl = `${document.location.hash.substring(1)}`
const handle = isValidAutomergeUrl(rootDocUrl)
? repo.find(rootDocUrl)
? await repo.find(rootDocUrl)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bug since #402, I believe 😓 . I plan to add Playwright tests in a follow-up PR to make that kind of thing easier to track down in the future.

const [, setState] = useState(0)
const increment = useCallback(() => setState(value => value + 1), [setState])
return increment
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know Peter had some concerns with this pattern in an earlier contribution, but I still think it's the best way to handle forcing hooks to re-render when mutable objects are updated.

Comment on lines 116 to 119
if (opts?.skipAutoInit) {
return
}
this.initialize()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a really hard time getting React to play nice with Presence's non-pure constructor, especially when in Strict mode. I ended up moving initialization out to a separate method. I don't love that this complicates the interface, but otherwise, Presence initialization in React needs to happen in a useEffect, so we'd need to call the constructor there. Since useEffect happens after a render, that means the initial usePresence render would return undefined, screwing up types and making the interface more complicated.

If this is totally unpalatable, we can instead use a wrapper to initially delegate to a dummy Presence stub and then to a real Presence once initialized, but that complicates the React integration quite a bit (plus any other system with similar constraints). Alternately, we could drop the skipAutoInit option and always require a manual call to initialize. I don't have strong feelings on that. I like that it avoids side effects in the constructor, but it is a more verbose API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if rather than having this skipAutoInit argument, we instead have a static method which returns the Presence instance, as well as a function to initialize it. Something like this:

class Presence<..>{
  // ...
  public static createUnstarted(..): { start: () => void, presence: Presence}
}

Then this would be used in e.g. the usePresence hook like this:

export function usePresence(
  ..
): UsePresenceResult{
  //..
  const {start, presence} = useMemo(() => {
    return Presence.createUnstarted(..)
  }, [..])

  useEffect(() => {
    start()
    presence.on("heartbeat", invalidate)
    presence.on("state", invalidate)
    presence.on("goodbye", invalidate)

constructor(
handle: DocHandle<unknown>,
readonly userId: UserId,
readonly deviceId: DeviceId,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per Peter's suggestion, I'm adding a deviceId: a userId can have many deviceIds, and a deviceId can map to many peerIds (especially over time). Should this be optional?


constructor(
handle: DocHandle<unknown>,
readonly userId: UserId,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OrionReed I forget, does readonly have the same issues as private in TS? Should I handle this differently?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way you're doing it here makes sense

Comment on lines +242 to +243
// TODO: We currently need to wait for the peer to be ready, but waiting
// some arbitrary amount of time is brittle
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

useLocalAwareness has a similar timeout. I initially tried to leave it out, but peers would miss the initial announcement. I think it's worth tracking that down and fixing it if possible, but I'd like to avoid doing that in this PR.

const heartbeatMs = this.#opts.heartbeatMs ?? DEFAULT_HEARTBEAT_INTERVAL_MS
this.#heartbeatInterval = setInterval(() => {
this.sendHeartbeat()
this.#peers.prune()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also where I prune peers to avoid a separate interval. We could do on-access pruning instead, but this seems simpler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off the top of my head, this seems fragile. If heartbeats are 'in phase' with the receiver interval, then a tiny amount of network delay could cause a periodic flicker at the heartbeat interval. I think a separate interval is needed, or some multiplier to prune every 2 intervals for example, which could be a simple boolean or integer which is updated/used in the interval

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I follow: even if these are in phase, we expect the ttl to be at least a couple of heartbeat periods (I used your 3× suggestion for the default). You could of course set these to be identical, but that seems like a bad idea (I suppose we could even reject a ttl lower than two heartbeat periods or something like that).

However, I realized this is a bad idea since no heartbeats may be sent if there is constant activity (see broadcastLocalState). I'll update the code to set up a separate interval.

Copy link
Contributor

@OrionReed OrionReed Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I didn't catch the TTL, all good!

* A summary of the latest Presence information for the set of peers who have
* reported a Presence status to us.
*/
export class PeerPresenceView<State> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm exposing the current peer Presence state through this class, since (especially with users and devices) some things would be awkward to do with just a raw data structure.

}
}

class PeerPresenceInfo<State> extends EventEmitter<PresenceEvents> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is used internally by Presence to manage the state that's exposed to users through PeerPresenceView above.

import { EventEmitter } from "eventemitter3"

type UserId = unknown
type DeviceId = unknown
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left these as unknown for now, and the way we're using them, I think that's fine, but should these be string instead?

@msakrejda
Copy link
Collaborator Author

After sleeping on this for a bit, I wonder if maybe DocHandle should own its Presence (and expose it as a property). Right now, even though we support channels for different types of presence, we fully replace the presence state received from peers when we get updaets. This means if there are two different parts of the code using two distinct Presence instances with distinct state, they'll be sending conflicting state updates to their peers. In fact, I think Patchwork will run into this.

Thoughts?

Copy link
Contributor

@alexjg alexjg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this seems good to me. I've left a few comments on things which I think would mildy improve the API, but the general design feels well factored.

One concern I do have is that this is a specific idea of what presence is, and it doesn't actually need (as far as I can see) to be bundled in with automerge-repo. I wonder if it would make more sense as a separate package (@automerge/automerge-repo-presence for example). The biggest reason to do that is that if we want to change the awareness code in a backwards incompatible way if it is bundled in then we will have to roll a major version bump for all of automerge-repo. It seems to me though that we would reasonably expect the pace of change for a new presence implementation to be faster than automerge-repo.

* @param opts - see {@link PresenceOpts}
* @returns
*/
constructor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be worth putting all the arguments in the opts object here instead of individual constructor arguments for two reasons:

  • It makes is easier for us to be forwards compatible when adding new arguments
  • When reading calls to the constructor it's quite hard to distinguish between the userId and deviceId arguments as they are both the same type and semantically similar, I think having effectively a named argument would help there

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was on the fence and then the argument list grew and I forgot to get off the fence. Updated as suggested. It's much nicer now, thanks.

Comment on lines 116 to 119
if (opts?.skipAutoInit) {
return
}
this.initialize()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if rather than having this skipAutoInit argument, we instead have a static method which returns the Presence instance, as well as a function to initialize it. Something like this:

class Presence<..>{
  // ...
  public static createUnstarted(..): { start: () => void, presence: Presence}
}

Then this would be used in e.g. the usePresence hook like this:

export function usePresence(
  ..
): UsePresenceResult{
  //..
  const {start, presence} = useMemo(() => {
    return Presence.createUnstarted(..)
  }, [..])

  useEffect(() => {
    start()
    presence.on("heartbeat", invalidate)
    presence.on("state", invalidate)
    presence.on("goodbye", invalidate)

@msakrejda
Copy link
Collaborator Author

@alexjg thanks for the feedback!

One concern I do have is that this is a specific idea of what presence is, and it doesn't actually need (as far as I can see) to be bundled in with automerge-repo.

What about the point I raised above? With the current design, at least, having two distinct Presence instances attached to a single DocHandle is a recipe for trouble. I'm not sure if there is a design that lets that behave sensibly as long as the presence communication is all happening over ephemeral-message.

But I suppose we could still pull the implementation out and have DocHandle accept a PresenceClass option that conforms to a certain API? It could still be responsible for instantiating Presence for a handle instance.

Heartbeats are stopped when there is "write activity" to the presence,
which could lead to very inconsistent peer pruning.

In passing, make start/stop for heartbeats more robust and handle
intervals the same way for both pruning and heartbeats.
@msakrejda
Copy link
Collaborator Author

I addressed some of the feedback. I also decided to drop the automatic initialization (for now: I'm not pushing for that as a final mechanism, it just makes it easier to explore some API options), and renamed initialize / dispose to start / stop and made sure they are idempotent. I think this is easier to reason about.

I also attempted to implement my idea above re: DocHandle owning Presence to avoid conflicts, but I ran into an issue: then DocHandle either needs to know about all the things the Presence constructor needs: either in its constructor (and anywhere that's called, like Repo#getHandle() and all its callers) or a getPresence() method. The former is clearly not a good idea. The latter means that users could pass a different userId each time they call getPresence() even though the intent is to have that return a persistent Presence object that lives as a member property of the DocHandle. Do we allow those changing parameters? Ignore it? Reject it?

The other thing I'm considering is moving most of those parameters from the Presence constructor to the start() method. That still maintains the dubious feature of changing userId and deviceId in the middle of a session, but otherwise seems generally simpler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants