[prosemirror-view] Proposal: Add transformCopiedHTML callback #1447

Open
ashu12chi opened this issue Feb 19, 2024 · 7 comments

Comments

@ashu12chi

Hi,

This is a duplicate of this discussion.

We have custom styles that we want to append to the HTML content as a <style> tag when it is copied from the editor. We also want to add a <meta> tag indicating that the content currently in the clipboard came from our editor component, along with other metadata.

We want a callback similar to transformPastedHTML, with the following signature:

transformCopiedHTML?: (this: P, html: HTMLDivElement, view: EditorView) => HTMLDivElement

This callback can be added here; it would let us manipulate the copied HTML just before it is added to the clipboard, without having to deal with a custom ClipboardSerializer.

@marijnh
Member

marijnh commented Feb 19, 2024

What is the problem with using a custom ClipboardSerializer?

@ashu12chi
Author

There are several reasons why we don't want to use a custom ClipboardSerializer:

  1. Our use case is simple: it only requires manipulating the final generated HTML, and we do not want to interfere with ProseMirror's default clipboard serialization.
  2. ProseMirror performs further operations on the obtained document fragment (ref), and it is this final HTML that we want to transform before it is passed to the clipboard.

Additionally, the use case we want to handle should be fairly common. For example, when content is copied from MS Office apps such as MS Word or MS Excel, they also add <style> and <meta> tags.

@marijnh
Member

marijnh commented Feb 20, 2024

Our use case is simple: it only requires manipulating the final generated HTML, and we do not want to interfere with ProseMirror's default clipboard serialization.

You can call the default serializer, and then do your own transformations on the result.

ProseMirror performs further operations on the obtained document fragment (ref), and it is this final HTML that we want to transform before it is passed to the clipboard.

That just adds an attribute, one that you shouldn't be interfering with anyway.

Additionally, the use case we want to handle should be fairly common. For example, when content is copied from MS Office apps such as MS Word or MS Excel, they also add <style> and <meta> tags.

If you copy from another app, I don't see how ProseMirror's copying behavior is even involved.

@apaar97

apaar97 commented Feb 21, 2024

Hi @marijnh

Additionally, the use case we want to handle should be fairly common. For example, when content is copied from MS Office apps such as MS Word or MS Excel, they also add <style> and <meta> tags.

If you copy from another app, I don't see how ProseMirror's copying behavior is even involved.

I want to clarify the exact problem we are trying to solve here. Users often work with different tools and copy-paste content from one kind of source to another. A few categories of such tools:

a) MS Office native apps, e.g. Excel, Word
b) Browser: a ProseMirror-based editor in a web app
c) Mail clients, e.g. Gmail, Outlook

A major requirement in such cases is to preserve the content formatting accurately when copy-pasting across these tool categories.
From our experience, and to the best of my understanding, this is a hard problem, since browser HTML is not always compatible with mail clients. In addition, ProseMirror imposes a strict schema to enforce a structured document (for a good reason).
At the same time, it is really tough to explain to end users why the content formatting was lost or changed.

To solve this problem, we manipulate the content HTML via various regex rules so that it becomes compatible with the target platform regardless of the source it was copied from, and it just works for the end user.

This boils down to taking control over two main workflows:

  1. Pasting into the ProseMirror editor
  2. Copying from the ProseMirror editor

For the first, we can use transformPastedHTML to apply appropriate transformations before handing the HTML over to ProseMirror for parsing and rendering. This takes care of many specific cases in which unrecognized tags/attributes would otherwise just be stripped (a simple example is converting div to p), so we are able to mostly retain the source formatting.

For the second, there is no equivalent method, so we wanted a similar callback that lets the content be manipulated before it is copied to the clipboard and made compatible with the target mail client (e.g. MS Outlook).
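
As a rough illustration of the paste-side step described above, the sketch below converts div tags to p tags with two regex replacements. divsToParagraphs is a hypothetical helper name, not part of ProseMirror; only the transformPastedHTML prop, which receives and returns an HTML string, comes from prosemirror-view. The rule is deliberately naive (nested divs become nested paragraphs) and is not the rule set the commenters actually use.

```typescript
// Hypothetical helper: rewrite <div ...> / </div> tags as <p ...> / </p>.
// Attributes on the opening tag are carried over unchanged.
function divsToParagraphs(html: string): string {
  return html
    .replace(/<div(\s[^>]*)?>/gi, "<p$1>")
    .replace(/<\/div>/gi, "</p>")
}

// Assumed wiring (not executed here):
//   new EditorView(place, { state, transformPastedHTML: html => divsToParagraphs(html) })
```

Because the paste hook works on a string, rules like this can be unit-tested without a browser; the copy side discussed in this thread works on DOM nodes instead.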

We have tested internally that the code changes in this PR work for our use case. It would be great if you could review it and let us know if there are any concerns with our approach.


You can call the default serializer, and then do your own transformations on the result.

Can you please clarify how this would be implemented, and whether it would be cheaper than just having a callback invoked on this line? To reiterate, we don't want to modify the default serializer; we only want to run transformations on the final HTML before it is returned by this method.

Thank you in advance for your time!

@marijnh
Member

marijnh commented Feb 21, 2024

Something like this...

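import {DOMSerializer} from "prosemirror-model"
import {EditorView} from "prosemirror-view"

// `state` and `schema` are assumed to be defined elsewhere.
new EditorView(document.querySelector(".full"), {
  state,
  clipboardSerializer: {
    serializeFragment(fragment, options, target) {
      // Run the default serializer first, then post-process its output.
      let result = DOMSerializer.fromSchema(schema).serializeFragment(fragment, options, target)
      // Example transformation: append "!" to every paragraph.
      for (let p of Array.from(result.querySelectorAll("p")))
        p.appendChild(document.createTextNode("!"))
      return result
    }
  } as DOMSerializer
})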
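
The pattern in the snippet above (delegate to the default serializer, then transform its result) can be factored into a small reusable wrapper. The following is only a sketch: FragmentSerializer is a structural stand-in for the one DOMSerializer method used here, and withPostProcess is a hypothetical helper name; neither is part of ProseMirror.

```typescript
// Structural stand-in for the part of prosemirror-model's DOMSerializer used
// here. F is the fragment type, N the DOM node type the serializer returns.
interface FragmentSerializer<F, N> {
  serializeFragment(fragment: F, options?: object, target?: N): N
}

// Hypothetical helper: delegate to `base`, then let `post` mutate the result
// in place before it is returned to the caller.
function withPostProcess<F, N>(
  base: FragmentSerializer<F, N>,
  post: (dom: N) => void
): FragmentSerializer<F, N> {
  return {
    serializeFragment(fragment, options, target) {
      const result = base.serializeFragment(fragment, options, target)
      post(result)
      return result
    }
  }
}
```

In an editor, base would presumably be DOMSerializer.fromSchema(schema), post could append a <style> or <meta> element to the returned fragment, and the wrapped object would still need the `as DOMSerializer` cast when passed as clipboardSerializer. Appending rather than prepending is probably the safer choice, since ProseMirror later adds an attribute to the serialized content.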

@ashu12chi
Author

Something like this...

I tried this approach, but the result we get here is a document fragment, not the HTML that we want to manipulate. You can verify this in this demo: if you copy the text in the editor and check the browser console for the value of result, it is a document fragment. See the image below:
image

The callback we are suggesting would let us manipulate the final HTML generated by ProseMirror, for example:

<p data-pm-slice="0 0 []">Here is an example of callout usage....!</p>
<div class="callout info">
  <div class="content">
    <p>For your <strong>information!</strong>! </p>
  </div>
</div>
<p>Go to `doc.ts` to change type/content!</p>

In this PR we added the callback after the HTML is generated (ref). The change is quite straightforward:

wrap = view.someProp("transformCopiedHTML", f => f(wrap!, view)) || wrap

Can you please point out any concerns with adding this callback?

Thank you in advance for your time!
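
One point that may be worth checking in the one-line change above is how it behaves when several plugins register the callback. EditorView.someProp is documented to return as soon as its function argument yields a truthy value, so a function that returns the transformed element would run only the first registered callback. The sketch below models this with a simplified stand-in for someProp and with plain strings instead of DOM elements; every name in it is an illustrative assumption, not ProseMirror code.

```typescript
// Simplified stand-in for EditorView.someProp: call f on each registered
// prop value in order and return the first truthy result.
function someProp<V, R>(values: V[], f: (value: V) => R): R | undefined {
  for (const value of values) {
    const result = f(value)
    if (result) return result
  }
  return undefined
}

type Transform = (html: string) => string
const addMeta: Transform = html => "<meta>" + html
const addStyle: Transform = html => html + "<style></style>"

// Shape of the proposed line: f returns the transformed value, so the
// iteration stops after the first callback.
function firstOnly(html: string, callbacks: Transform[]): string {
  return someProp(callbacks, f => f(html)) || html
}

// Chaining variant: f returns a falsy value, so every callback runs and
// each one receives the previous callback's output.
function chained(html: string, callbacks: Transform[]): string {
  someProp(callbacks, f => { html = f(html); return false })
  return html
}
```

Whether first-wins or chaining is the intended behavior is a design decision for the PR; the sketch only shows that the two forms differ.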

@marijnh
Member

marijnh commented Mar 5, 2024

I tried this approach, but the result we get here is a document fragment, not the HTML

The callback in your PR also gets passed DOM elements. I'm really not sure why you keep rejecting the approach I'm proposing.
