Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP Presence #2

Closed
wants to merge 17 commits into from
Closed

WIP Presence #2

wants to merge 17 commits into from

Conversation

curran
Copy link
Member

@curran curran commented Apr 10, 2019

@curran curran mentioned this pull request Apr 10, 2019
@curran curran changed the title Presence 2 WIP Presence Apr 10, 2019
@houshuang
Copy link

houshuang commented Apr 10, 2019

I'd love to discuss the basic definition of presence:

{
    u: '123', // user ID
    c: 8, // number of changes made by this user
    s: [ // list of selections
        [ 1, 1 ], // collapsed selection
        [ 5, 7 ], // forward selection
        [ 9, 4 ] // backward selection
    ]
}
  1. Is it important to distinguish between backward and forward selections? I don't even know if the OS distinguishes between those after they've been made (ie. is the interaction different depending on if a selection was made backwards or forwards?). I would be in favor of canonicalizing (is that a word), unless there's a reason not to. In that case, I would also argue for making the final argument length (0 for cursor, 1...length-current for selection).

  2. What is c used for?

  3. For me, it's important to also support sub-types. I don't think ShareDB/ot-json0 has an internal representation of which subtype different leaves have, rather it's the operations that are sub-typed (am I correct?)... Thus, I submit subtype as part of the presence op, and store that, so that when that path needs to be updated, I know which subtype to delegate to.

My idea was to let subtypes define/manage their own formats for presence... For example, Quill uses the c and the s as above, but these don't really represent paths in document... Of course if you did have two Quill selections in the same document, at text, you could represent it as

s: [[ 'text', 3, 4 ], [ 'text', 5, 20 ]], but this is kind of implying that you can index into the document like this, which isn't the case, and also has us calling the Quill transformCursor twice (which probably isn't expensive, but still)... I would prefer something like:

s: [{t: 'rich-text', p: 'text', cursor: { s: [ 3, 4], [5, 20] }]

ot-json0 would see that it needs to call the subtype rich-text, would see if there was already a presence for that path (if not it would generate a blank presence using the subtype), call the subtype's transform cursor with the content of cursor, and return the presence, with the content of cursor replaces by the return value...

(We could imagine other subtypes in the future, or even implementing custom presence types for certain fields - for example I might have a drawing stored as an array of pixels, and want to store presence as a an x,y location).

It also neatly separates out presence at the leaf from cursor/within leaf... For example, I might have a form with many questions (of different types), and be able to put an avatar next to the question they're currently active in... Thus being able to tell at a glance that you are at the question ['text'], or even ['question', 3], and that 3 is not an index into a string that needs to be transformed, would be helpful to me. Maybe this is connected to how I use/think of JSON structures though.

  1. When it comes to presence, two issues.

4a) First, it would be useful to be able to attach user data (let's say full name, and URL to avatar). Of course, this could be furnished externally and just referenced with a uid, but depending on the case, sometimes it's really better to have this sent over the presence-channel... However, it really shouldn't be sent every time there is an update. I was thinking about storing it and only sending it when a new person joins / when someone requests a resend of all presence (using the r flag).

4b) Presence should really be stored at the document level. If we always use ot-json0, we can store it at the "root level", and it will work, but what if we want to have a document that is just an ot-text0, but still attach presence? Or just ot-rich-text? Silly to add this code to all the ottypes... Since this data doesn't require any transformation (just adding when join and removing when leave), it might work just out of the box, but need to make sure that ottypes don't overwrite this... That goes hand in hand with my point on (3) to separate out the general presence from the ottype-specific presence, and just run the transformPresence code on the ottype/document-specific presence...

We could also look at implementing the presence-announcing at a connection level, but I don't think that makes sense - you don't necessarily want to automatically add presence to a document just because you subscribe to it.

@curran
Copy link
Member Author

curran commented Apr 10, 2019

My responses inline:

{
    u: '123', // user ID
    c: 8, // number of changes made by this user
    s: [ // list of selections
        [ 1, 1 ], // collapsed selection
        [ 5, 7 ], // forward selection
        [ 9, 4 ] // backward selection
    ]
}

Is it important to distinguish between backward and forward selections?

I'm not sure. I copied the basic definition from https://github.com/Teamwork/ot-rich-text (by @gkusiba) without fully understanding it. My gut feeling is that, the answer depends on the downstream UI library you'll be integrating with. The user should be able to click at a location and drag either forward or backward, but I agree with you that as far as presence is concerned, it should be doable to convert all backwards selections to forward ones at the time they are set as presence.

What is c used for?

I honestly have no idea at all. It's from @gkusiba

For me, it's important to also support sub-types. I don't think ShareDB/ot-json0 has an internal representation of which subtype different leaves have, rather it's the operations that are sub-typed (am I correct?)... Thus, I submit subtype as part of the presence op, and store that, so that when that path needs to be updated, I know which subtype to delegate to.

Yes, I discovered a bit more today about how this works with json0. There's sort of an "outer op", which contains the path, and an "inner op" whose structure is dependent on the subtype. So far my implementation is hardcoded to detect only text0 subtypes, but I do want to make it generic, such that the subtypes can handle their own presence.

This is a challenging area to think about, because I think the only reasonable solution would be to have the json0 presence data structure be able to contain "inner presence" for certain paths. In my implementation so far, the json0 presence is convoluted with parts of presence that should in theory belong to text0.

My idea was to let subtypes define/manage their own formats for presence...

I would like to work more on this, so that text0 defines its own kind of "inner presence", independently from json0, then the json0 presence type could be made generic and able to contain, for example, the rich-text-ot presence that you are using. I definitely do want to address that case, which your implementation already does. I would like to add tests that cover this case (e.g. that pull in the rich text type), then work to make them pass. Is that something you'd be interested in working on?

s: [[ 'text', 3, 4 ], [ 'text', 5, 20 ]]
s: [{t: 'rich-text', p: 'text', cursor: { s: [ 3, 4], [5, 20] }]

It's a great idea. I think we need to design some unit tests to verify the best design.

it would be useful to be able to attach user data (let's say full name, and URL to avatar). Of course, this could be furnished externally and just referenced with a uid, but depending on the case, sometimes it's really better to have this sent over the presence-channel...

IMO the better approach is that this can be furnished externally. Downstream UI code can be responsible. All the presence feature needs to do is provide the uids.

Presence should really be stored at the document level. If we always use ot-json0, we can store it at the "root level", and it will work, but what if we want to have a document that is just an ot-text0, but still attach presence? Or just ot-rich-text? Silly to add this code to all the ottypes...

Yes, I'm not sure. I think all the ottypes will need to have the presence interface functions defined, in order to use presence with them. Maybe there could be some sort of a default stub, in the case the type doesn't have presence functions defined. Not sure what this would look like.

We could also look at implementing the presence-announcing at a connection level, but I don't think that makes sense - you don't necessarily want to automatically add presence to a document just because you subscribe to it.

It's an option, definitely. I think @gkusiba has thought deeply about this. Have you seen this work of his? https://github.com/SyncOT/SyncOT Quite interesting.

@gkubisa
Copy link

gkubisa commented Apr 10, 2019

{
    u: '123', // user ID
    c: 8, // number of changes made by this user
    s: [ // list of selections
        [ 1, 1 ], // collapsed selection
        [ 5, 7 ], // forward selection
        [ 9, 4 ] // backward selection
    ]
}

Is it important to distinguish between backward and forward selections?

ot-rich-text is specifically designed for rich text editing where the direction of the selection does matter. It affects the UX for the user making the selection and determines at which end of the selection a caret should be displayed.

What is c used for?

When a user makes a change to an ot-rich-text document, she increments the c value. When other users receive a presence from that user with the changed c value, they know that she's just made a change.

Ideally it would not be needed as each operation has an associated src property, which says which user submitted it. Unfortunately, ShareDB does not expose that value to the client code, hence the c workaround in the ot-rich-text presence.

We could also look at implementing the presence-announcing at a connection level, but I don't think that makes sense - you don't necessarily want to automatically add presence to a document just because you subscribe to it.

It's an option, definitely. I think @gkusiba has thought deeply about this. Have you seen this work of his? https://github.com/SyncOT/SyncOT Quite interesting.

A bit of background: SyncOT grew out of my frustration with some core design decisions underpinning ShareDB when applied to my specific use case - rich text document editing. Both libraries offer somewhat different features and compromises, so pick the one which works better for you. (At the time of writing, SyncOT is still WIP.)

SyncOT will have separate global and document presence.

The global presence will tell you who is currently online (userId and sessionId), where they are (locationId), when the presence was last updated (lastModified) and what's the optional, arbitrary data associated with this presence (data). The global presence is transient and automatically disappears when a user disconnects.

The document presence will contain a sessionId and any necessary document-specific data, eg selections in the case of a rich text document. The document presence is persistent and stored together with snapshots and operations. To keep performance at an acceptable level, SyncOT will have a greatly optimized operation and snapshot storage system.

Having both global and document presence results in a significantly simpler and more robust system. It will also make integration of presence and undo/redo trivial.

So, overall I think that "announcing" global presence at the connection level is a good idea, however, more thought would be needed to decide how and if it fits with current design of ShareDB.

Why ot-rich-text supports multiple selections?

The main reason is table selections... when you select a column, you end up with one selection per cell and those selections can't be coalesced in the ot-rich-text data model. Additionally, it allows encoding multiple selections supported by Firefox.

General thoughts about json0 presence

Delegating presence to sub-types seems to be the most obvious and generally useful feature of json0 presence. Anything else risks being use case specific and might be best done outside the core type, eg define a new type which wraps json0 and adds presence specific to working with an array of pixels, a DOM tree, another custom data model, etc.

@houshuang
Copy link

Thanks for all your thoughts. I'll comment a bit more later, for now, just wanted to quickly show what I've been working on: https://youtu.be/BhCI_KDYWfk

This is showing presence across rich text (quill), text area and text input, and a boolean field (where it's just keeping track of the path).

For the text area/text input UI, I'm using https://github.com/convergencelabs/html-text-collab-ext. There are a few tiny issues with text input (when you're out of bounds) which I'm fixing.

I haven't implemented transformCursor yet, so for now it's just sending presence on every keystroke, which works (and the library above actually does some transformation of selections internally I think), but I will improve this.

You see how I model the presence - basically each user can only be in one place at one time. Very simplified but for our purposes OK.

On the one hand, we need this functionality in our product yesterday, so I'm moving forwards with testing this and integrating it, while also interested in contributing to a more well-thought out and stable approach, and then integrating that into our product when it becomes available.

@curran
Copy link
Member Author

curran commented Apr 11, 2019

@houshuang Based on your suggestion, I revised the high level structure for presence here (just in README so far) and I'd like to get your feedback. What you think of the following proposed structure for json0 presence:

{
  user: '123', // User ID.
  changes: 8,  // Number of changes made by this user (for change detection).
  presence: [  // List of sub-presence objects, per OT type.
    {
      type: 'rich-text',      // The OT type for this presence object.
      path: ['some', 'path'], // The path of this presence object.
      subPresence: {          // The type-specific presence object at this path.
        u: '123', c: 8,       // An example of an ot-rich-text presence object.
        s: [ [ 1, 1 ], [ 5, 7 ], [ 9, 4 ] ]
      }
    },
    {
      type: 'text0',          // An example of a text0 presence object.
      path: ['some', 'other', 'path'],
      subPresence: {
        u: '123', c: 8,
        s: [ [ 1, 1 ], [ 5, 7 ], [ 9, 4 ] ]
      }
    }
  ]
}

?

@curran
Copy link
Member Author

curran commented Apr 11, 2019

I wonder if it might make sense to represent the presence using the structure of the paths, rather than as an array, like this:

{
  user: '123', // User ID.
  changes: 8,  // Number of changes made by this user (for change detection).
  presence: {  // Tree of sub-presence objects, per OT type.
    some: { 
      path: { 
        type: 'rich-text',      // The OT type for this presence object.
        subPresence: {          // The type-specific presence object at this path.
          u: '123', c: 8,       // An example of an ot-rich-text presence object.
          s: [ [ 1, 1 ], [ 5, 7 ], [ 9, 4 ] ]
        }
      },
      other: {
        path: { 
          type: 'text0',          // An example of a text0 presence object.
          subPresence: { 
            u: '123', c: 8,
            s: [ [ 1, 1 ], [ 5, 7 ], [ 9, 4 ] ]
          }
        }
      }
    }
  }
}

Before doing the implementation, it's not clear to me which approach would be better.

An interesting question: do we really need multiple entries? Originally I was considering the case that a user has two text editors open at a time, e.g. in a split pane, each at a different path. But conceptually, the user can't be two places at once. So maybe my original idea of multiple entries is overly complex.

@curran
Copy link
Member Author

curran commented Apr 11, 2019

Extrapolating from this implementation by @houshuang in https://github.com/ottypes/json0/pull/25/files#diff-875c48e604c73336a76d5b063e16069cR184

    presence = {
      ...subtypes[opType].transformPresence(presence, op, isOwn),
      p: op[0].p,
      t: op[0].t
    };

, the presence structure here would look like this I think:

{
  p: ['some', 'path'], // The path of this presence object.
  t: 'ot-rich-text',   // The OT type for this presence object.

  u: '123',            // Fields derived from ot-rich-text (u, c, s).
  c: 8,
  s: [ [ 1, 1 ], [ 5, 7 ], [ 9, 4 ] ],
}

This is nice and simple, and has the "one place at one time" assumption built-in. Generally, this approach restricts presence from other types such that they must avoid the field names p and t. Also, it's not clear which OT type is responsible for the u field (json0 or the subtype).

Perhaps a better approach would be to store the "sub-presence" within an object, like this:

{
  p: ['some', 'path'], // The path of this presence object.
  t: 'ot-rich-text',   // The OT type for this presence object.
  s: {                 // Sub-presence object, specific to the type.
    u: '123',            
    c: 8,
    s: [ [ 1, 1 ], [ 5, 7 ], [ 9, 4 ] ]
  }
}

Note that the u property is contained in the sub-presence, and not at the top level. I don't have a clear view of the tradeoff landscape yet, but my first impression is that it makes sense to allow the sub-presence to manage u.

@houshuang
Copy link

First of all, thinking about writing the code to process these, I think the first approach is better - when receiving an object, you always look at the presences and look up the paths to set, you don't have a path in mind and look up the presence there (I think?)...

I still don't quite get how the c: variable works, I need to look more at the code. In general, it would be really useful to write up a little narrative of how this data is processed, because even though I've been working with it a bit, I am still fuzzy on some areas, like which functions in the ottype are run on the server and which on the client, etc. I can try to write up my understanding, and have others add to it.

I still believe it would be a really nice option to store more userinfo in the presence. In our case, the app is frontend only, the only backend is ShareDB, and new users might join at a later time. Currently we'll be storing the user info (name, avatar etc) ,as part of the shareDB document, but it would have been nice to keep it as part of the presence.

In your example above, you are duplicating u and c in the subtypes. I understand that's probably to have the transformPresence functions work regardless of whether the subtype presence is the only presence, or it's embedded in an ot-json0... I was thinking of always keeping the presence (which I called cursor) in a separate document, and have that passed to transformPresence, whether a subtype called directly, or called from ot-json0... But I didn't quite wrap this work up.

I'm still a bit concerned about the concept of multiple presences. The problem is that it requires very precise tracking of when someone leaves a given field, which can often be quite difficult, and for some cases (multiple-choice question), the definition of leaving is beginning to edit somewhere else... Whereas it's always easy to detect when the user reports presence from another path... So for my use case, this is a better solution - but I completely understand the need for more complexity, like the case of a spreadsheet with multiple columns selected, or even just supporting Firefox multi-select...

@houshuang
Copy link

Note that I wrote the above before reading your last post (written contemporaneously) :)

@curran
Copy link
Member Author

curran commented Apr 11, 2019

Looks like we're both eager to find a solution!

Regarding c, my gut feeling is to leave it out until the need for it is clear. In a way I can understand it's usefulness in change detection, but probably change detection could happen in other ways as well, so I'm not convinced it's critical.

I suppose if you want to implement a feature like "Make the user avatar blink when they make a change", you'd need c.

@houshuang
Copy link

{
  p: ['some', 'path'], // The path of this presence object.
  t: 'ot-rich-text',   // The OT type for this presence object.
  s: {                 // Sub-presence object, specific to the type.
    u: '123',            
    c: 8,
    s: [ [ 1, 1 ], [ 5, 7 ], [ 9, 4 ] ]
  }
}

This is pretty close to how I do it in my latest implementation:

{ 
  cursor: { 
    c: 0, 
    s: [[53, 53]] 
  }, 
  p: ['example'], 
  t: 'rich-text'
  u: 'Ananda' 
};

I don't see the point of handing `u` to the subtype - it would be the same `u` for all presences in a single presence object, right? Unless the subtype needs access to the `u` to determine how to transform, but I think it just needs the `isOwn` property passed along? 

@curran
Copy link
Member Author

curran commented Apr 11, 2019

One thing to consider regarding u is that the ot-rich-text considers a presence object without it invalid. Here's the bit of code that checks for it: https://github.com/Teamwork/ot-rich-text/blob/master/src/Operation.js#L474

@gkubisa
Copy link

gkubisa commented Apr 11, 2019

@curran

Regarding c, my gut feeling is to leave it out until the need for it is clear. In a way I can understand it's usefulness in change detection, but probably change detection could happen in other ways as well, so I'm not convinced it's critical.

In many cases it likely is not needed and yes, it should happen in a different way - in my opinion by modifying ShareDB to report the src in op events, even if src is remote. Obviously this change would not be backward compatible.

I suppose if you want to implement a feature like "Make the user avatar blink when they make a change", you'd need c.

That's exactly what I use c for. :-)

@curran
Copy link
Member Author

curran commented Apr 11, 2019

Aha! Very cool. It's a great feature idea.

I had an idea to make sounds (like the ones computers make in movies) when users perform operations. Each user would have their own unique pitch or sound quality that you could immediately identify. This could be implemented with c as well.

@houshuang
Copy link

I wrote up a little document of my understanding of the "dance" between different entities with presence... This is basically a restating of the original @Teamwork sharedb PR, but I found it useful to try to restate my understanding in my own words, it might also expose any misunderstanding I have. Perhaps it can be a useful document for the community once we clarify it. (Then we can also post it somewhere else, put it on GDocs for now): https://docs.google.com/document/d/1lxBa74_sfkhaLg0s7S_8yIcboaNZvE6TNHtUYnhAlNc/edit#

@gkubisa
Copy link

gkubisa commented Apr 11, 2019

@houshuang

I don't see the point of handing u to the subtype - it would be the same u for all presences in a single presence object, right? Unless the subtype needs access to the u to determine how to transform, but I think it just needs the isOwn property passed along?

I'd avoid making any assumptions about the structure of subtypes to maximize flexibility, just like json0 does not make any assumptions about the structure of the subtype snapshots and operations.

To be honest, both u and c in ot-rich-text are workarounds. c is for change detection, as explained earlier in this thread. u would not be needed, if ShareDB had connection level presence capable of mapping src to userId, and any other properties for that matter, like avatar URL, full name, etc. (SyncOT obviously won't have those limitations. ;-) )

@curran
Copy link
Member Author

curran commented Apr 12, 2019

This PR has too many incidental refactoring-style changes, which I think complicate things.

Closing in favor of ottypes#31, which is purely additive, with no extraneous refactorings.

@curran curran closed this Apr 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants