Feature: transformTags-like hook that's processed at element closing tag #685

Closed
f0x52 opened this issue Dec 9, 2024 · 4 comments

@f0x52
Contributor

f0x52 commented Dec 9, 2024

The problem to solve

I'm using sanitize-html to transform links into a plaintext format that still conveys the necessary information.
Something like turning <a href="https://example.com">Example Link</a> into Example Link (https://example.com).

transformTags isn't sufficient because it runs during the onopentag handler, at which point the (textual) content of the tag hasn't been processed yet. With discard as the disallowedTagsMode, the link in href is lost. The only modification transformTags could make is to overwrite the text that later gets processed in ontext with the link, but then the original link text is lost.
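To make the limitation concrete, here is a minimal sketch of the one workaround transformTags allows. The helper name `linkToHrefText` is hypothetical; the `(tagName, attribs)` arguments and the `{ tagName, attribs, text }` return shape are what I understand a transformTags function to receive and return. Note that the callback never sees the element's text content.

```javascript
// Hypothetical helper, meant to be passed as `transformTags: { a: linkToHrefText }`.
// It only receives the tag name and attributes, because it is called when the
// opening tag is parsed, before any text inside the element has been seen.
function linkToHrefText(tagName, attribs) {
  return {
    tagName,
    attribs,
    // Setting `text` overwrites the element's text with the href,
    // so the original link text ("Example Link") is no longer available.
    text: attribs.href
  };
}
```

There is no way to build `Example Link (https://example.com)` from inside this function, since the link text simply isn't an input.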

Proposed solution

A function that receives the same argument as exclusiveFilter (the frame, with its attributes and text) and returns a value that replaces the element's contents.
For my specific use case it's enough to overwrite the whole <a> element with a plain-text string. That would be easier to implement than a function that allows overwriting the tag name, attributes, etc. after they've already been written to the result string.

sanitizeHtml('<a href="https://example.com">Example Link</a>', {
	replaceTags: {
		'a': (frame) => {
			return `${frame.text} (${frame.attribs.href})`;
		}
	}
})

Alternatives

Use exclusiveFilter to store each link's href and text content in an external object, together with the frame's tagPosition. Return true from the filter to discard the anchor tag entirely, then splice the new text content into the output at the stored index after sanitizing has finished.
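A rough sketch of that alternative, under some assumptions: `createLinkRecorder` and `spliceReplacements` are hypothetical helper names, and `frame.tagPosition` is assumed to be the offset in the output string at which the discarded element would have started. It is not the wrapper's actual implementation.

```javascript
// Builds an exclusiveFilter callback plus the list it records into.
function createLinkRecorder() {
  const replacements = [];
  const exclusiveFilter = (frame) => {
    if (frame.tag !== 'a') {
      return false; // keep every element that is not an anchor
    }
    const href = frame.attribs && frame.attribs.href;
    replacements.push({
      // Assumed: offset in the output where the anchor would have been written.
      index: frame.tagPosition,
      text: href ? `${frame.text} (${href})` : frame.text
    });
    return true; // discard the anchor and everything inside it
  };
  return { exclusiveFilter, replacements };
}

// Inserts each recorded text into the sanitized output at its stored offset.
function spliceReplacements(output, replacements) {
  // Work from the highest offset down so earlier offsets stay valid.
  const ordered = [...replacements].sort((a, b) => b.index - a.index);
  let result = output;
  for (const { index, text } of ordered) {
    result = result.slice(0, index) + text + result.slice(index);
  }
  return result;
}
```

The intended use is to pass `exclusiveFilter` in the sanitize-html options, then call `spliceReplacements` on the returned string. The sketch does not escape the inserted text and does not handle an anchor nested inside another element that is itself discarded later, so both would need attention in real use.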

@boutell
Member

boutell commented Dec 10, 2024

This does look useful. This might be a better use case for cheerio though, have you checked out that module?

@f0x52
Contributor Author

f0x52 commented Dec 10, 2024 via email

@f0x52
Contributor Author

f0x52 commented Dec 20, 2024

I ended up creating a wrapper around sanitize-html that uses exclusiveFilter to implement this functionality, which works quite nicely: https://www.npmjs.com/package/sanitize-replace-html / https://git.pixie.town/f0x/sanitize-replace-html/src/branch/main/src/index.ts

Feel free to close this issue if there's no need to add this functionality directly in sanitize-html.

@boutell
Member

boutell commented Jan 2, 2025

Nice, looks like we don't need another feature in core.

@boutell boutell closed this as completed Jan 2, 2025