Possible inconsistent state between federated room and remote room if a cloud notification is retried #13079

Closed
danxuliu opened this issue Aug 22, 2024 · 0 comments · Fixed by #13422 · May be fixed by #13163
Labels: 1. to develop, bug, feature: api 🛠️, feature: federation 🌐



If the cloud notification that the host server sends to the federated server to update a room property fails for some reason, it is retried later through a background job. However, if the same property was modified again in the meantime and that newer notification was sent successfully, then the background job, when it resends the previous value, overwrites the current value in the federated room.
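To make the failure sequence concrete, here is a minimal sketch in Python. It is a toy model, not the Talk implementation: `apply_notification`, `retry_queue` and the dictionaries standing in for rooms are all hypothetical names made up for this illustration.

```python
# Toy model (hypothetical names): the federated room applies every
# notification exactly as received, with no ordering check.
host_room = {"name": "Original"}
federated_room = {"name": "Original"}
retry_queue = []  # stands in for the background job's pending notifications


def apply_notification(room, prop, value):
    """Apply a single-property update to the federated room."""
    room[prop] = value


# 1. The host renames the room; the notification fails and is queued for retry.
host_room["name"] = "First"
retry_queue.append(("name", "First"))

# 2. The host renames the room again; this notification is delivered at once.
host_room["name"] = "Second"
apply_notification(federated_room, "name", "Second")

# 3. The background job resends the queued (older) notification.
for prop, value in retry_queue:
    apply_notification(federated_room, prop, value)

print(host_room["name"])       # Second
print(federated_room["name"])  # First
```

The federated room ends up with the stale name while the host keeps the newer one, and nothing corrects it until the next sync.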

Although syncing the room properties when joining the room mitigates the problem, it does not fully solve it.

A possible solution would be to add a property to rooms that keeps track of how many times the room was modified, so that if a cloud notification is resent but the federated room is already in a newer state, the notification is ignored.

An unsigned 32-bit integer can have 4294967296 values. If a property was changed every second, and thus the counter was increased by 1 every second, that would make 4294967296/(60*60*24*365) ≈ 136 years, so... I guess that an integer should be enough, even if it is signed and we start at 0 ;-)
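The estimate can be checked with a few lines of Python. The signed case is included as well, assuming the counter starts at 0 and only the non-negative half of the range is used; leap years are ignored, as in the figure above.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # 31,536,000 (leap years ignored)

unsigned_values = 2 ** 32             # 4294967296
signed_non_negative_values = 2 ** 31  # 2147483648, counting up from 0

# One increment per second, so the number of values equals the seconds available.
years_unsigned = unsigned_values / SECONDS_PER_YEAR
years_signed = signed_non_negative_values / SECONDS_PER_YEAR

print(f"unsigned: {years_unsigned:.1f} years")  # unsigned: 136.2 years
print(f"signed:   {years_signed:.1f} years")    # signed:   68.1 years
```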

This would also require sending the full state in each notification. Right now each notification carries only a single property modification, so some state could be lost if a cloud notification is ignored.
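A rough sketch of that idea in Python, assuming every notification carries the modification counter together with the full (federation-relevant) state. The class `FederatedRoom`, its `version` field and `apply_full_state` are hypothetical names for illustration and do not correspond to existing Talk code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FederatedRoom:
    """Hypothetical federated-side copy of a room with a modification counter."""
    version: int = 0
    state: Dict[str, Any] = field(default_factory=dict)

    def apply_full_state(self, version: int, state: Dict[str, Any]) -> bool:
        """Apply a full-state notification unless it is not newer than what we have.

        Returns True if the state was applied, False if it was ignored.
        """
        if version <= self.version:
            return False  # stale retry: the room is already in a newer state
        self.version = version
        self.state = dict(state)
        return True


room = FederatedRoom()

# Version 1 (rename to "First") failed to deliver and waits for a retry.
# Version 2 (description change) is delivered and carries the full state,
# so it already includes the rename from version 1.
newer_applied = room.apply_full_state(2, {"name": "First", "description": "Hello"})

# The background job now retries version 1; it is ignored.
retry_applied = room.apply_full_state(1, {"name": "First", "description": ""})

print(newer_applied, retry_applied, room.state)
# True False {'name': 'First', 'description': 'Hello'}
```

Because version 2 carried the whole state, ignoring the retried version 1 loses nothing. With single-property deltas the rename would never have reached the federated room.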

A mixed approach between the current behaviour (sending a delta without a modification version) and the fixed behaviour (sending the full state with a modification version) would be to send a delta with a modification version. If the federated server receives a cloud notification with a modification version higher than its current version + 1, it would explicitly request the full state from the remote server and set it. But I am not sure whether there could be some race condition with that approach.
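The mixed approach could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `FederatedRoom`, `apply_delta` and the `fetch_full_state` callback (standing in for an explicit request to the remote server) are hypothetical. The sketch is single-threaded, so it says nothing about the possible race condition mentioned above.

```python
from typing import Any, Callable, Dict, Tuple


class FederatedRoom:
    """Hypothetical federated-side room that receives versioned deltas."""

    def __init__(self, fetch_full_state: Callable[[], Tuple[int, Dict[str, Any]]]):
        self.version = 0
        self.state: Dict[str, Any] = {}
        # Stand-in for explicitly requesting the full state from the remote server.
        self._fetch_full_state = fetch_full_state

    def apply_delta(self, version: int, prop: str, value: Any) -> str:
        if version <= self.version:
            return "ignored"  # stale retry, the room is already newer
        if version == self.version + 1:
            self.state[prop] = value  # the expected next modification
            self.version = version
            return "applied"
        # Gap detected: at least one delta is missing, so ask for the full state.
        remote_version, remote_state = self._fetch_full_state()
        if remote_version > self.version:
            self.version = remote_version
            self.state = dict(remote_state)
        return "resynced"


# Simulated host: three modifications have happened so far.
host_version = 3
host_state = {"name": "Third", "description": "Hello"}


def fetch_from_host() -> Tuple[int, Dict[str, Any]]:
    return host_version, dict(host_state)


room = FederatedRoom(fetch_from_host)
r1 = room.apply_delta(1, "name", "First")         # next in sequence
r2 = room.apply_delta(3, "name", "Third")         # version 2 never arrived
r3 = room.apply_delta(2, "description", "Hello")  # late retry of version 2

print(r1, r2, r3)                # applied resynced ignored
print(room.version, room.state)  # 3 {'name': 'Third', 'description': 'Hello'}
```

The description change from version 2 reaches the federated room through the resync, so the late retry can be dropped safely.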

In any case, note that full state does not necessarily mean all the room properties; it could be limited to only those actually needed in federated rooms (as is done when joining the room).
