From 1ced6fb0b39bd18bd154d220ee80fd4e97b2459d Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Fri, 22 Nov 2024 00:19:16 +0000 Subject: [PATCH 1/6] MSC4231: Backwards compatibility for media captions --- ...1-media-caption-backwards-compatibility.md | 58 +++++++++++++++++++ 1 file changed, 58 insertions(+) create mode 100644 proposals/4231-media-caption-backwards-compatibility.md diff --git a/proposals/4231-media-caption-backwards-compatibility.md b/proposals/4231-media-caption-backwards-compatibility.md new file mode 100644 index 00000000000..09b0fbac8e4 --- /dev/null +++ b/proposals/4231-media-caption-backwards-compatibility.md @@ -0,0 +1,58 @@ +# MSC4231: Backwards compatibility for media captions + +## Problem + +MSC2530 introduced the ability to use the `body` field on file transfers as a caption. This merged and was shipped +in Matrix 1.10, and we're now seeing more clients sending captions in the wild. + +Unfortunately, any client which is not "caption-aware" (i.e. has yet to implement MSC2530 or Matrix 1.10) does not know +to display the `body` field as a caption - and so these messages effectively get silently dropped, fragmenting Matrix +as a communication medium. Given captions typically contain as much important information as any other message, this +can result in bad communication failures, and a very negative perception of Matrix's reliability. + +We should have specified a means of backwards compatibility to avoid breaking communication between newer and older +clients during the window in which we wait for clients to upgrade to Matrix 1.10. + +## Proposal + +Clients should send a separate `m.room.message` event after the captioned media, including the caption as the body. + +The content block of this mesage also includes an `m.caption_fallback: true` field, so that caption-aware clients do not +display this event, instead displaying the media event's `body` field as a caption per MSC2530. 
+ +However, caption-unaware clients will display the event and so avoid discarding the contents of the caption. + +## Potential issues + +It's a bit ugly and redundant to duplicate the caption in the fallback event as well as the media event. However, it's +way worse to drop messages. + +The fact that caption fallback events will be visible to some clients and invisible to others might highlight unread +state/count problems. However, given we need to handle invisible events already, it's not making the problem worse - +and in fact by making it more obvious, might help fix any remaining issues in implementations. + +## Alternatives + +Captions should be provided by extensible events. However, until extensible events are fully rolled out, we're stuck +with fixing up the situation with MSC2530. + +Alternatively, we could ignore the issue and go around upgrading as many clients as possible to speak MSC2530. However, +this feels like incredibly bad practice, given we have a trivial way to provide backwards compatibility, and in +practice we shouldn't be forcing clients to upgrade in order to avoid losing messages when we could have avoided it in +the first place. + +## Security considerations + +The caption in the fallback may not match the caption in the media event, causing confusion between caption-aware and +caption-unaware clients. + +Sending two events (media + caption) in quick succession will make event-sending rate limits kick in more rapidly. In +practice this feels unlikely to be a problem. + +## Unstable prefix + +`m.caption_fallback` would be `org.matrix.msc4231.caption_fallback` until this merges. + +## Dependencies + +None, given MSC2530 has already merged. 
From fe9ef9ea07a4372314e3765aece243e0f55e741a Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Fri, 22 Nov 2024 00:20:49 +0000 Subject: [PATCH 2/6] typoe --- proposals/4231-media-caption-backwards-compatibility.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4231-media-caption-backwards-compatibility.md b/proposals/4231-media-caption-backwards-compatibility.md index 09b0fbac8e4..48ef03bd4be 100644 --- a/proposals/4231-media-caption-backwards-compatibility.md +++ b/proposals/4231-media-caption-backwards-compatibility.md @@ -17,7 +17,7 @@ clients during the window in which we wait for clients to upgrade to Matrix 1.10 Clients should send a separate `m.room.message` event after the captioned media, including the caption as the body. -The content block of this mesage also includes an `m.caption_fallback: true` field, so that caption-aware clients do not +The content block of this message also includes an `m.caption_fallback: true` field, so that caption-aware clients do not display this event, instead displaying the media event's `body` field as a caption per MSC2530. However, caption-unaware clients will display the event and so avoid discarding the contents of the caption. From 0b77189f78ec0e6a64750c8d141c25391950f077 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Fri, 22 Nov 2024 00:26:49 +0000 Subject: [PATCH 3/6] links --- ...1-media-caption-backwards-compatibility.md | 32 +++++++++++-------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/proposals/4231-media-caption-backwards-compatibility.md b/proposals/4231-media-caption-backwards-compatibility.md index 48ef03bd4be..2c66b3d37f6 100644 --- a/proposals/4231-media-caption-backwards-compatibility.md +++ b/proposals/4231-media-caption-backwards-compatibility.md @@ -2,13 +2,15 @@ ## Problem -MSC2530 introduced the ability to use the `body` field on file transfers as a caption. 
This merged and was shipped -in Matrix 1.10, and we're now seeing more clients sending captions in the wild. +[MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530) introduced the ability to use the `body` field +on file transfers as a caption. This merged and was shipped in Matrix 1.10, and we're now seeing more clients sending +captions in the wild. -Unfortunately, any client which is not "caption-aware" (i.e. has yet to implement MSC2530 or Matrix 1.10) does not know -to display the `body` field as a caption - and so these messages effectively get silently dropped, fragmenting Matrix -as a communication medium. Given captions typically contain as much important information as any other message, this -can result in bad communication failures, and a very negative perception of Matrix's reliability. +Unfortunately, any client which is not "caption-aware" (i.e. has yet to implement +[MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530) or Matrix 1.10) does not know to display the +`body` field as a caption - and so these messages effectively get silently dropped, fragmenting Matrix as a +communication medium. Given captions typically contain as much important information as any other message, this can +result in bad communication failures, and a very negative perception of Matrix's reliability. We should have specified a means of backwards compatibility to avoid breaking communication between newer and older clients during the window in which we wait for clients to upgrade to Matrix 1.10. @@ -17,8 +19,9 @@ clients during the window in which we wait for clients to upgrade to Matrix 1.10 Clients should send a separate `m.room.message` event after the captioned media, including the caption as the body. -The content block of this message also includes an `m.caption_fallback: true` field, so that caption-aware clients do not -display this event, instead displaying the media event's `body` field as a caption per MSC2530. 
+The content block of this message also includes an `m.caption_fallback: true` field, so that caption-aware clients do +not display this event, instead displaying the media event's `body` field as a caption per +[MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530). However, caption-unaware clients will display the event and so avoid discarding the contents of the caption. @@ -34,12 +37,13 @@ and in fact by making it more obvious, might help fix any remaining issues in im ## Alternatives Captions should be provided by extensible events. However, until extensible events are fully rolled out, we're stuck -with fixing up the situation with MSC2530. +with fixing up the situation with [MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530), and this is +a problem which is playing out right now on the public network. -Alternatively, we could ignore the issue and go around upgrading as many clients as possible to speak MSC2530. However, -this feels like incredibly bad practice, given we have a trivial way to provide backwards compatibility, and in -practice we shouldn't be forcing clients to upgrade in order to avoid losing messages when we could have avoided it in -the first place. +Alternatively, we could ignore the issue and go around upgrading as many clients as possible to speak +[MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530). However, this feels like incredibly bad +practice, given we have a trivial way to provide backwards compatibility, and in practice we shouldn't be forcing +clients to upgrade in order to avoid losing messages when we could have avoided it in the first place. ## Security considerations @@ -55,4 +59,4 @@ practice this feels unlikely to be a problem. ## Dependencies -None, given MSC2530 has already merged. +None, given [MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530) has already merged. 
From e7dae34b3ba1eb1e591b57f0045fd4a8254eadf6 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Sun, 24 Nov 2024 20:11:20 +0000 Subject: [PATCH 4/6] write up complications around edits & redactions --- ...1-media-caption-backwards-compatibility.md | 66 +++++++++++++++++-- 1 file changed, 61 insertions(+), 5 deletions(-) diff --git a/proposals/4231-media-caption-backwards-compatibility.md b/proposals/4231-media-caption-backwards-compatibility.md index 2c66b3d37f6..24f6eae63e2 100644 --- a/proposals/4231-media-caption-backwards-compatibility.md +++ b/proposals/4231-media-caption-backwards-compatibility.md @@ -17,13 +17,54 @@ clients during the window in which we wait for clients to upgrade to Matrix 1.10 ## Proposal -Clients should send a separate `m.room.message` event after the captioned media, including the caption as the body. +Clients should send a separate `m.room.message` event after the captioned media, including the caption as the body, +and replying to the media event. This is referred to as a caption fallback event. -The content block of this message also includes an `m.caption_fallback: true` field, so that caption-aware clients do -not display this event, instead displaying the media event's `body` field as a caption per +The content block of the caption fallback event includes an `m.caption_fallback: true` field, so that caption-aware +clients do not display this event, instead displaying the media event's `body` field as a caption per [MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530). -However, caption-unaware clients will display the event and so avoid discarding the contents of the caption. +However, caption-unaware clients will display the event as a reply to the media and so avoid discarding the contents of +the caption, while associating it visually with the original media via the reply. 
+ +If a user on a caption-aware client edits their caption, their client should update both the media event and the caption +fallback with the edit. + +If a user on a caption-aware client redacts their media, their client should redact its caption fallback too. + +If a user on a caption-unaware client edits or redacts a caption fallback sent on a caption-aware client, then the +fallback will drift out of sync with the caption on the media event - see Outstanding Issues below. + +The event contains an `m.relates_to` field of type `m.caption_fallback` in order to associate the fallback to the media +event, and so make it easy to locate when a caption-aware client applies edits or redactions. This also stops clients +trying to start threads from the caption fallback, as the server will reject the invalid thread. The end result looks +like this: + +```json + "type": "m.room.message", + "content": { + "body": "Caption text", + "msgtype": "m.text", + "m.relates_to": { + "event_id": "$(some image event)", + "rel_type": "m.caption_fallback", + "m.in_reply_to": { + "event_id": "$OYKwuL..." + }, + } + }, + ``` + +If non-caption-aware users reply to a caption fallback, then caption-aware clients should display the media event +as the event being replied to. + +## Outstanding issues + +If a user on a caption-unaware client edits a caption fallback sent on a caption-aware client, then this change +will not be visible to caption-aware clients, causing inconsistent history between caption-aware and unaware clients. + +If a user on a caption-unaware client redacts a caption fallback sent on a caption-aware client, then the caption in +the media event won't be redacted, potentially leaking the redacted content. 
## Potential issues @@ -45,10 +86,25 @@ Alternatively, we could ignore the issue and go around upgrading as many clients practice, given we have a trivial way to provide backwards compatibility, and in practice we shouldn't be forcing clients to upgrade in order to avoid losing messages when we could have avoided it in the first place. +This has ended up combining both [MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530) and +[MSC2529](https://github.com/matrix-org/matrix-spec-proposals/pull/2529). There's a world where the fallback event could +be the primary source of truth for the caption, and meanwhile the field on the media event be the 'fallback' for the +convenience of bridges. + +Alternatively, we could change to sending captions entirely as relations, as in +[MSC2529](https://github.com/matrix-org/matrix-spec-proposals/pull/2529), and require bridges to wait for the caption +event (if flagged on the media event) before they send on the media event. This would avoid needing a dedicated +caption fallback event - as the caption would have its own event anyway. It would also avoid the risk of edits +and redactions getting out of sync between the media event and the caption fallback. **This feels like it might +be a preferable approach, given the outstanding issues above**. It does however travel in the opposite direction to +extensible events (where the caption would be a mixin on the media event). + ## Security considerations The caption in the fallback may not match the caption in the media event, causing confusion between caption-aware and -caption-unaware clients. +caption-unaware clients. From a trust & safety perspective, the caption in the fallback might contain abusive content +not visible to human moderators because their caption-aware clients hide the fallback (and vice versa, for +caption-unaware clients). Sending two events (media + caption) in quick succession will make event-sending rate limits kick in more rapidly. 
In practice this feels unlikely to be a problem. From 42b02301ea343b92624a5117e4bd003470cff62b Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Sun, 24 Nov 2024 20:11:55 +0000 Subject: [PATCH 5/6] md fix --- proposals/4231-media-caption-backwards-compatibility.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4231-media-caption-backwards-compatibility.md b/proposals/4231-media-caption-backwards-compatibility.md index 24f6eae63e2..d58fb9bdea5 100644 --- a/proposals/4231-media-caption-backwards-compatibility.md +++ b/proposals/4231-media-caption-backwards-compatibility.md @@ -53,7 +53,7 @@ like this: }, } }, - ``` +``` If non-caption-aware users reply to a caption fallback, then caption-aware clients should display the media event as the event being replied to. From 6098e06df2c3e3ffd48701be39eb59b363d8519c Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Sun, 24 Nov 2024 20:12:56 +0000 Subject: [PATCH 6/6] add manu concer --- proposals/4231-media-caption-backwards-compatibility.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/proposals/4231-media-caption-backwards-compatibility.md b/proposals/4231-media-caption-backwards-compatibility.md index d58fb9bdea5..8ffb1c9cd74 100644 --- a/proposals/4231-media-caption-backwards-compatibility.md +++ b/proposals/4231-media-caption-backwards-compatibility.md @@ -66,6 +66,9 @@ will not be visible to caption-aware clients, causing inconsistent history betwe If a user on a caption-unaware client redacts a caption fallback sent on a caption-aware client, then the caption in the media event won't be redacted, potentially leaking the redacted content. +Clients or bridges that are caption-aware but not MSC4231-aware will display or transport the text content +twice, displaying double content to the user. + ## Potential issues It's a bit ugly and redundant to duplicate the caption in the fallback event as well as the media event. However, it's