-
Notifications
You must be signed in to change notification settings - Fork 226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix join being denied after being invited over federation #18075
base: develop
Are you sure you want to change the base?
Fix join being denied after being invited over federation #18075
Conversation
Reproduction test for element-hq/synapse#18075
Reproduction test for element-hq/synapse#18075
# You can only join the room if you are invited or are already in the room. | ||
if not (caller_in_room or caller_invited): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is equivalent to before but I found this easier to grok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the same via De Morgan's theorem.
This reverts commit a32c1ba.
This is just normal for how someone finds out about an invite over federation See #18075 (comment)
synapse/handlers/federation_event.py
Outdated
# After persistence we always need to notify replication there may | ||
# be new data. | ||
self._notifier.notify_replication() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed as this already happens in _notify_persisted_event
-> on_new_room_events
-> notify_new_room_events
-> notify_replication
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is one difference that this happens for every event (backfilled or not) 🤔
(vs we only call _notify_persisted_event
-> notify_replication
for non-backfilled events)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've restored this call for now but I've left a TODO
to fill in the "why" we have to do this.
We never notify clients about backfilled events but it's important to let all the workers know about any new event (backfilled or not) because TODO
Doesn't work very well because we need a lot of interaction with another homeserver
Fix join being denied after being invited over federation. This probably also happens for rejecting an invite or trying to accept/deny someones knock request. Basically, any out-of-band membership transition where we first get the membership as an
outlier
and then rely on federation filling us in to de-outlier it.I initially thought this might be a Synapse consistency issue (see issues labeled with
Z-Read-After-Write
) but it seems to be an event auth logic problem. Workers probably do increase the number of possible race condition scenarios that make this visible though (replication and cache invalidation lag).Note
The fix hasn't been implemented yet (just an investigation)
Probably fixes matrix-org/synapse#15012 (#15012)
Problems:
event_auth
logic even though we expose them in/sync
.What happened before?
I wrote a Complement test that stresses this exact scenario and reproduces the problem: matrix-org/complement#754
We have
hs1
andhs2
running in monolith mode (no workers):@charlie1:hs2
is invited and joins the room:hs1
invites@charlie1:hs2
to a room which we receive onhs2
asPUT /_matrix/federation/v1/invite/{roomId}/{eventId}
(on_invite_request(...)
) and the invite membership is persisted as an outlier. Theroom_memberships
andlocal_current_membership
database tables are also updated which means they are visible down/sync
at this point.@charlie1:hs2
decides to join because it saw the invite down/sync
. Becausehs2
is not yet in the room, this happens as a remote joinmake_join
/send_join
which comes back with all of the auth events needed to auth successfully and now@charlie1:hs2
is successfully joined to the room.@charlie2:hs2
is invited and and tries to join the room:hs1
invites@charlie2:hs2
to the room which we receive onhs2
asPUT /_matrix/federation/v1/invite/{roomId}/{eventId}
(on_invite_request(...)
) and the invite membership is persisted as an outlier. Theroom_memberships
andlocal_current_membership
database tables are also updated which means they are visible down/sync
at this point.hs2
is already participating in the room, we also see the invite come over federation in a transaction and we start processing it (not done yet, see below)@charlie2:hs2
decides to join because it saw the invite down/sync
. Becausehs2
, is already in the room, this happens as a local join but we deny the event because ourevent_auth
logic thinks that we have no membership in the room ❌ (expected to be able to join because we saw the invite down `/sync)@charlie2:hs2
invite event from and de-outlier it.Logs for
hs2
:Dev notes
Other unrelated but semi-related races:
send_join
races with local users sending messages #17720Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)