Commit 21c690a

TrustedTypes: Add boilerplate for the xxxHtmlUnsafe() methods (mdn#40420)
* TrustedTypes: Add boilerplate for the xxxHtmlUnsafe() methods
* Add hidden example for discussion
* Update files/en-us/web/api/document/parsehtmlunsafe_static/index.md
* minor tweaks
* Element.setHTMLUnsafe() update to recommended
* Element.setHTMLUnsafe() consistency fix
* ShadowRoot.setHTMLUnsafe() update to recommended
* Document.parseHTMLUnsafe() update to recommended
* Apply easy suggestions from code review
  Co-authored-by: wbamberg <[email protected]>
* Restructure to put use case reasons all in one place
* Update index.md
* Apply suggestions from code review
  Co-authored-by: wbamberg <[email protected]>
* Apply suggestions from code review
  Co-authored-by: wbamberg <[email protected]>

---------

Co-authored-by: wbamberg <[email protected]>
1 parent b61ec8f commit 21c690a

3 files changed: +291 -66 lines changed


files/en-us/web/api/document/parsehtmlunsafe_static/index.md

Lines changed: 29 additions & 12 deletions
@@ -8,9 +8,17 @@ browser-compat: api.Document.parseHTMLUnsafe_static
88

99
{{APIRef("DOM")}}
1010

11-
The **`parseHTMLUnsafe()`** static method of the {{domxref("Document")}} object is used to parse an HTML input, optionally filtering unwanted HTML elements and attributes, in order to create a new {{domxref("Document")}} instance.
11+
> [!WARNING]
12+
> This method parses its input as HTML, writing the result into the DOM.
13+
> APIs like this are known as [injection sinks](/en-US/docs/Web/API/Trusted_Types_API#concepts_and_usage), and are potentially a vector for [cross-site-scripting (XSS)](/en-US/docs/Web/Security/Attacks/XSS) attacks, if the input originally came from an attacker.
14+
>
15+
> You can mitigate this risk by always passing `TrustedHTML` objects instead of strings and [enforcing trusted types](/en-US/docs/Web/API/Trusted_Types_API#using_a_csp_to_enforce_trusted_types).
16+
> See [Security considerations](#security_considerations) for more information.
1217
13-
Unlike with {{domxref("Document.parseHTML_static", "Document.parseHTML()")}}, XSS-unsafe HTML entities are not guaranteed to be removed.
18+
> [!NOTE]
19+
> {{domxref("Document/parseHTML_static", "Document.parseHTML()")}} should almost always be used instead of this method — on browsers where it is supported — as it always removes XSS-unsafe HTML entities.
20+
21+
The **`parseHTMLUnsafe()`** static method of the {{domxref("Document")}} object is used to parse HTML input, optionally filtering unwanted HTML elements and attributes, in order to create a new {{domxref("Document")}} instance.
1422

1523
## Syntax
1624

@@ -22,13 +30,15 @@ Document.parseHTMLUnsafe(input, options)
2230
### Parameters
2331

2432
- `input`
25-
- : A string or {{domxref("TrustedHTML")}} instance defining HTML to be parsed.
33+
- : A {{domxref("TrustedHTML")}} instance or string defining HTML to be parsed.
2634
- `options` {{optional_inline}}
2735
- : An options object with the following optional parameters:
2836
- `sanitizer` {{optional_inline}}
2937
- : A {{domxref("Sanitizer")}} or {{domxref("SanitizerConfig")}} object which defines what elements of the input will be allowed or removed.
30-
Note that generally a `"Sanitizer` is expected than the to be more efficient than a `SanitizerConfig` if the configuration is to reused.
38+
This can also be a string with the value `"default"`, which applies a `Sanitizer` with the default (XSS-safe) configuration.
3139
If not specified, no sanitizer is used.
40+
41+
Note that generally a `Sanitizer` is expected to be more efficient than a `SanitizerConfig` if the configuration is to be reused.
3242

3343
### Return value
3444

@@ -49,18 +59,25 @@ A {{domxref("Document")}}.
4959
The **`parseHTMLUnsafe()`** static method can be used to create a new {{domxref("Document")}} instance, optionally filtering out unwanted elements and attributes.
5060
The resulting `Document` will have a [content type](/en-US/docs/Web/API/Document/contentType) of "text/html", a [character set](/en-US/docs/Web/API/Document/characterSet) of UTF-8, and a URL of "about:blank".
5161

52-
The suffix "Unsafe" in the method name indicates that, while the method does allow the input string to be filtered of unwanted HTML entities, it does not enforce the sanitization or removal of potentially unsafe XSS-relevant input.
53-
If no sanitizer configuration is specified in the `options.sanitizer` parameter, `parseHTMLUnsafe()` is used without any sanitization.
54-
Note that {{htmlelement("script")}} elements are not evaluated during parsing.
55-
5662
The input HTML may include [declarative shadow roots](/en-US/docs/Web/HTML/Reference/Elements/template#declarative_shadow_dom).
5763
If the string of HTML defines more than one [declarative shadow root](/en-US/docs/Web/HTML/Reference/Elements/template#declarative_shadow_dom) in a particular shadow host then only the first {{domxref("ShadowRoot")}} is created — subsequent declarations are parsed as {{htmlelement("template")}} elements within that shadow root.
5864

59-
`parseHTMLUnsafe()` should be instead of {{domxref("Document.parseHTML_static", "Document.parseHTML()")}} when parsing potentially unsafe strings of HTML that for whatever reason need to contain XSS-unsafe elements or attributes.
60-
If the HTML to be parsed doesn't need to contain unsafe HTML entities, then you should use `Document.parseHTML()`.
65+
`parseHTMLUnsafe()` doesn't perform any sanitization by default.
66+
If no sanitizer is passed as a parameter, all HTML entities in the input will be injected.
67+
68+
### Security considerations
69+
70+
The suffix "Unsafe" in the method name indicates that it does not enforce removal of all XSS-unsafe HTML entities (unlike {{domxref("Document/parseHTML_static", "Document.parseHTML()")}}).
71+
While it can do so if used with an appropriate sanitizer, it doesn't have to use an effective sanitizer, or any sanitizer at all!
72+
The method is therefore a possible vector for [Cross-site-scripting (XSS)](/en-US/docs/Web/Security/Attacks/XSS) attacks, where potentially unsafe strings provided by a user are injected into the DOM without first being sanitized.
73+
74+
You should mitigate this risk by always passing {{domxref("TrustedHTML")}} objects instead of strings, and [enforcing trusted types](/en-US/docs/Web/API/Trusted_Types_API#using_a_csp_to_enforce_trusted_types) using the [`require-trusted-types-for`](/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy/require-trusted-types-for) CSP directive.
75+
This ensures that the input is passed through a transformation function, which has the chance to [sanitize](/en-US/docs/Web/Security/Attacks/XSS#sanitization) the input to remove potentially dangerous markup (such as {{htmlelement("script")}} elements and event handler attributes), before it is injected.
76+
77+
Using `TrustedHTML` makes it possible to audit and check that sanitization code is effective in just a few places, rather than scattered across all your injection sinks.
78+
You should not need to pass a sanitizer to the method when using `TrustedHTML`.
6179

62-
Note that since this method does not necessarily sanitize input strings of XSS-unsafe entities, input strings should also be validated using the [Trusted Types API](/en-US/docs/Web/API/Trusted_Types_API).
63-
If the method is used with both a trusted types and a sanitizer, the HTML input will be passed through the trusted type transformation function before it is sanitized.
80+
If for any reason you can't use `TrustedHTML` (or even better, `Document.parseHTML()`), then the next safest option is to use `parseHTMLUnsafe()` with the XSS-safe default {{domxref("Sanitizer")}}.
6481

6582
## Specifications
6683

files/en-us/web/api/element/sethtmlunsafe/index.md

Lines changed: 131 additions & 24 deletions
@@ -8,9 +8,17 @@ browser-compat: api.Element.setHTMLUnsafe
88

99
{{APIRef("DOM")}}
1010

11-
The **`setHTMLUnsafe()`** method of the {{domxref("Element")}} interface is used to parse a string of HTML into a {{domxref("DocumentFragment")}}, optionally filtering out unwanted elements and attributes, and those that don't belong in the context, and then using it to replace the element's subtree in the DOM.
11+
> [!WARNING]
12+
> This method parses its input as HTML, writing the result into the DOM.
13+
> APIs like this are known as [injection sinks](/en-US/docs/Web/API/Trusted_Types_API#concepts_and_usage), and are potentially a vector for [cross-site-scripting (XSS)](/en-US/docs/Web/Security/Attacks/XSS) attacks, if the input originally came from an attacker.
14+
>
15+
> You can mitigate this risk by always passing `TrustedHTML` objects instead of strings and [enforcing trusted types](/en-US/docs/Web/API/Trusted_Types_API#using_a_csp_to_enforce_trusted_types).
16+
> See [Security considerations](#security_considerations) for more information.
1217
13-
Unlike with {{domxref("Element.setHTML()")}}, XSS-unsafe HTML entities are not guaranteed to be removed.
18+
> [!NOTE]
19+
> {{domxref("Element.setHTML()")}} should almost always be used instead of this method — on browsers where it is supported — as it always removes XSS-unsafe HTML entities.
20+
21+
The **`setHTMLUnsafe()`** method of the {{domxref("Element")}} interface is used to parse HTML input into a {{domxref("DocumentFragment")}}, optionally filtering out unwanted elements and attributes, and those that don't belong in the context, and then using it to replace the element's subtree in the DOM.
1422

1523
## Syntax
1624

@@ -22,14 +30,16 @@ setHTMLUnsafe(input, options)
2230
### Parameters
2331

2432
- `input`
25-
- : A string or {{domxref("TrustedHTML")}} instance defining HTML to be parsed.
33+
- : A {{domxref("TrustedHTML")}} instance or string defining HTML to be parsed.
2634
- `options` {{optional_inline}}
2735
- : An options object with the following optional parameters:
2836
- `sanitizer` {{optional_inline}}
29-
- : A {{domxref("Sanitizer")}} or {{domxref("SanitizerConfig")}} object which defines what elements of the input will be allowed or removed.
30-
Note that generally a `"Sanitizer` is expected than the to be more efficient than a `SanitizerConfig` if the configuration is to reused.
37+
- : A {{domxref("Sanitizer")}} or {{domxref("SanitizerConfig")}} object that defines what elements of the input will be allowed or removed.
38+
This can also be a string with the value `"default"`, which applies a `Sanitizer` with the default (XSS-safe) configuration.
3139
If not specified, no sanitizer is used.
3240

41+
Note that generally a `Sanitizer` is expected to be more efficient than a `SanitizerConfig` if the configuration is to be reused.
42+
3343
### Return value
3444

3545
None (`undefined`).
@@ -46,56 +56,153 @@ None (`undefined`).
4656

4757
## Description
4858

49-
The **`setHTMLUnsafe()`** method is used to parse a string of HTML into a {{domxref("DocumentFragment")}}, optionally filtering out unwanted elements and attributes, and those that don't belong in the context, and then using it to replace the element's subtree in the DOM.
50-
51-
The suffix "Unsafe" in the method name indicates that while the method does allow the input string to be filtered of unwanted HTML entities, it does not enforce the sanitization or removal of potentially unsafe XSS-relevant input, such as {{htmlelement("script")}} elements, and script or event handler content attributes.
52-
If no sanitizer configuration is specified in the `options.sanitizer` parameter, `setHTMLUnsafe()` is used without any sanitization.
59+
The **`setHTMLUnsafe()`** method is used to parse an HTML input into a {{domxref("DocumentFragment")}}, optionally sanitizing it of unwanted elements and attributes, and discarding elements that the HTML specification doesn't allow in the target element (such as {{htmlelement("li")}} inside a {{htmlelement("div")}}).
60+
The `DocumentFragment` is then used to replace the element's subtree in the DOM.
5361

54-
The input HTML may include [declarative shadow roots](/en-US/docs/Web/HTML/Reference/Elements/template#declarative_shadow_dom).
62+
Unlike with {{domxref("Element.innerHTML")}}, [declarative shadow roots](/en-US/docs/Web/HTML/Reference/Elements/template#declarative_shadow_dom) in the input will be parsed into the DOM.
5563
If the string of HTML defines more than one [declarative shadow root](/en-US/docs/Web/HTML/Reference/Elements/template#declarative_shadow_dom) in a particular shadow host then only the first {{domxref("ShadowRoot")}} is created — subsequent declarations are parsed as `<template>` elements within that shadow root.
5664

57-
Like `setHTML()`, `setHTMLUnsafe()` may be used instead of {{domxref("Element.innerHTML")}} in order to parse strings of HTML that may contain declarative shadow roots.
58-
`setHTMLUnsafe()` should be instead of {{domxref("Element.setHTML()")}} when parsing potentially unsafe strings of HTML that for whatever reason need to contain XSS-unsafe elements or attributes.
59-
If strings to be injected don't need to contain unsafe HTML entities, then you should always use {{domxref("Element.setHTML()")}}.
65+
`setHTMLUnsafe()` doesn't perform any sanitization by default.
66+
If no sanitizer is passed as a parameter, all HTML entities in the input will be injected.
67+
It is therefore potentially even less safe than {{domxref("Element.innerHTML")}}, which disables {{htmlelement("script")}} execution when parsing.
68+
69+
### Security considerations
70+
71+
The suffix "Unsafe" in the method name indicates that it does not enforce removal of all XSS-unsafe HTML entities (unlike {{domxref("Element.setHTML()")}}).
72+
While it can do so if used with an appropriate sanitizer, it doesn't have to use an effective sanitizer, or any sanitizer at all!
73+
The method is therefore a possible vector for [Cross-site-scripting (XSS)](/en-US/docs/Web/Security/Attacks/XSS) attacks, where potentially unsafe strings provided by a user are injected into the DOM without first being sanitized.
74+
75+
You should mitigate this risk by always passing {{domxref("TrustedHTML")}} objects instead of strings, and [enforcing trusted types](/en-US/docs/Web/API/Trusted_Types_API#using_a_csp_to_enforce_trusted_types) using the [`require-trusted-types-for`](/en-US/docs/Web/HTTP/Reference/Headers/Content-Security-Policy/require-trusted-types-for) CSP directive.
76+
This ensures that the input is passed through a transformation function, which has the chance to [sanitize](/en-US/docs/Web/Security/Attacks/XSS#sanitization) the input to remove potentially dangerous markup (such as {{htmlelement("script")}} elements and event handler attributes), before it is injected.
77+
78+
Using `TrustedHTML` makes it possible to audit and check that sanitization code is effective in just a few places, rather than scattered across all your injection sinks.
79+
You should not have to pass a sanitizer to the method when using `TrustedHTML`.
80+
81+
If for any reason you can't use `TrustedHTML` (or even better, `setHTML()`) then the next safest option is to use `setHTMLUnsafe()` with the XSS-safe default {{domxref("Sanitizer")}}.
82+
83+
### When should `setHTMLUnsafe()` be used?
84+
85+
`setHTMLUnsafe()` should almost never be used if {{domxref("Element.setHTML()")}} is available, because there are very few (if any) cases where user-provided HTML input should need to include XSS-unsafe elements.
86+
Not only is `setHTML()` safe, but it avoids having to consider trusted types.
87+
88+
Using `setHTMLUnsafe()` might be appropriate if:
89+
90+
- You can't use `setHTML()` or trusted types (for whatever reason) and you want to have the safest possible filtering.
91+
In this case you might use `setHTMLUnsafe()` with the default {{domxref("Sanitizer")}} to filter all XSS-unsafe elements.
92+
- You can't use `setHTML()` and the input might contain declarative shadow roots, so you can't use {{domxref("Element.innerHTML")}}.
93+
- You have an edge case where you have to allow HTML input that includes a known set of unsafe HTML entities.
6094

61-
Note that since this method does not necessarily sanitize input strings of XSS-unsafe entities, input strings should also be validated using the [Trusted Types API](/en-US/docs/Web/API/Trusted_Types_API).
62-
If the method is used with both a trusted types and a sanitizer, the input string will be passed through the trusted transformation function before it is sanitized.
95+
You can't use `setHTML()` in this case, because it strips all unsafe entities.
96+
You could use `setHTMLUnsafe()` without a sanitizer or `innerHTML`, but that would allow all unsafe entities.
97+
98+
A better option here is to call `setHTMLUnsafe()` with a sanitizer that allows just those dangerous elements and attributes we actually need.
99+
While this is still unsafe, it is safer than allowing all of them.
100+
101+
For the last point, consider a situation where your code relies on being able to use unsafe `onclick` handlers.
102+
The following code shows the effect of the different methods and sanitizers for this case.
103+
104+
```js
105+
const target = document.querySelector("#target");
106+
107+
const input = "<img src=x onclick=alert('onclick') onerror=alert('onerror')>";
108+
109+
// Safe - removes all XSS-unsafe entities.
110+
target.setHTML(input);
111+
112+
// Removes no event handler attributes
113+
target.setHTMLUnsafe(input);
114+
target.innerHTML = input;
115+
116+
// Safe - removes all XSS-unsafe entities.
117+
const configSafe = new Sanitizer();
118+
target.setHTMLUnsafe(input, { sanitizer: configSafe });
119+
120+
// Removes all XSS-unsafe entities except `onclick`
121+
const configLessSafe = new Sanitizer();
122+
configLessSafe.allowAttribute("onclick");
123+
target.setHTMLUnsafe(input, { sanitizer: configLessSafe });
124+
```
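
The choice between these methods can also be made at runtime. The following is a minimal sketch, not part of the commit: `injectHTML` and its `env` parameter are hypothetical names, and it assumes that the presence of a global `Sanitizer` constructor is a reasonable signal that the `sanitizer` option of `setHTMLUnsafe()` is honored rather than silently ignored. It prefers `setHTML()`, falls back to `setHTMLUnsafe()` with the `"default"` sanitizer, and otherwise refuses to inject.

```js
// Hypothetical helper: pick the safest available injection method.
// `env` defaults to the global object and exists only so that the
// feature checks can be exercised outside a browser.
function injectHTML(target, html, env = globalThis) {
  // Preferred: setHTML() always removes XSS-unsafe markup.
  if (typeof target.setHTML === "function") {
    target.setHTML(html);
    return "setHTML";
  }
  // Fallback: setHTMLUnsafe() with the XSS-safe default sanitizer.
  // Assumption: a global Sanitizer constructor indicates that the
  // `sanitizer` option is supported rather than silently ignored.
  if (
    typeof target.setHTMLUnsafe === "function" &&
    typeof env.Sanitizer === "function"
  ) {
    target.setHTMLUnsafe(html, { sanitizer: "default" });
    return "setHTMLUnsafe+default";
  }
  // Neither safe path is available: refuse rather than inject raw HTML.
  throw new Error("No sanitizing HTML injection method available");
}
```

In a page, a typical call might be `injectHTML(document.querySelector("#target"), input)`. Throwing in the last branch is a deliberate design choice in this sketch: it avoids quietly degrading to an unsanitized `setHTMLUnsafe()` or `innerHTML` call.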
63125

64126
## Examples
65127

66-
### Basic usage
128+
### setHTMLUnsafe() with Trusted Types
67129

68-
This example shows some of the ways you can use `setHTMLUnsafe()` to inject a string of HTML.
130+
To mitigate the risk of XSS, we'll first create a `TrustedHTML` object from the string containing the HTML, and then pass that object to `setHTMLUnsafe()`.
131+
Since trusted types are not yet supported on all browsers, we define the [trusted types tinyfill](/en-US/docs/Web/API/Trusted_Types_API#trusted_types_tinyfill).
132+
This acts as a transparent replacement for the trusted types JavaScript API:
133+
134+
```js
135+
if (typeof trustedTypes === "undefined")
136+
trustedTypes = { createPolicy: (n, rules) => rules };
137+
```
138+
139+
Next we create a {{domxref("TrustedTypePolicy")}} that defines a {{domxref("TrustedTypePolicy/createHTML", "createHTML()")}} for transforming an input string into {{domxref("TrustedHTML")}} instances.
140+
Commonly, implementations of `createHTML()` use a library such as [DOMPurify](https://github.com/cure53/DOMPurify) to sanitize the input, as shown below:
141+
142+
```js
143+
const policy = trustedTypes.createPolicy("my-policy", {
144+
createHTML: (input) => DOMPurify.sanitize(input),
145+
});
146+
```
147+
148+
Then we use this `policy` object to create a `TrustedHTML` object from the potentially unsafe input string:
149+
150+
```js
151+
// The potentially malicious string
152+
const untrustedString = "abc <script>alert(1)<" + "/script> def";
153+
// Create a TrustedHTML instance using the policy
154+
const trustedHTML = policy.createHTML(untrustedString);
155+
```
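
For experimenting without loading DOMPurify, the same steps can be sketched with a policy that escapes markup instead of sanitizing it. This is an illustrative assumption rather than what the commit recommends: `escapeHTML` and the policy name `escape-policy` are hypothetical, and escaping renders the input as text instead of preserving safe markup.

```js
// Use the native Trusted Types API when present, else a tinyfill
// whose createPolicy() returns the rules object unchanged.
const tt = globalThis.trustedTypes ?? {
  createPolicy: (name, rules) => rules,
};

// Hypothetical transformation: escape HTML metacharacters so the
// input is treated as text. Not a substitute for a real sanitizer.
function escapeHTML(input) {
  return input
    .replaceAll("&", "&amp;")
    .replaceAll("<", "&lt;")
    .replaceAll(">", "&gt;")
    .replaceAll('"', "&quot;")
    .replaceAll("'", "&#39;");
}

const escapePolicy = tt.createPolicy("escape-policy", {
  createHTML: (input) => escapeHTML(input),
});

const untrusted = "abc <script>alert(1)<" + "/script> def";
// With the tinyfill this is a plain string; with native Trusted Types
// it is a TrustedHTML object, so String() normalizes both cases.
const escaped = String(escapePolicy.createHTML(untrusted));
console.log(escaped);
```

Reading the tinyfill from `globalThis` rather than assigning to an undeclared `trustedTypes` variable keeps this sketch valid in strict mode and in modules.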
156+
157+
Now that we have `trustedHTML`, the code below shows how you can use it with `setHTMLUnsafe()`.
158+
The input has been through the transformation function, so we don't pass a sanitizer to the method.
69159

70160
```js
71-
// Define unsanitized string of HTML
72-
const unsanitizedString = "abc <script>alert(1)<" + "/script> def";
73161
// Get the target Element with id "target"
74162
const target = document.getElementById("target");
75163

76-
// setHTML() with no sanitizer
77-
target.setHTMLUnsafe(unsanitizedString);
164+
// setHTMLUnsafe() with no sanitizer
165+
target.setHTMLUnsafe(trustedHTML);
166+
```
167+
168+
### Using setHTMLUnsafe() without Trusted Types
169+
170+
This example demonstrates the case where we aren't using trusted types, so we'll be passing sanitizer arguments.
171+
172+
The code creates an untrusted string and shows a number of ways a sanitizer can be passed to the method.
173+
174+
```js
175+
// The potentially malicious string
176+
const untrustedString = "abc <script>alert(1)<" + "/script> def";
177+
178+
// Get the target Element with id "target"
179+
const target = document.getElementById("target");
78180

79181
// Define custom Sanitizer and use in setHTMLUnsafe()
80182
// This allows only elements: div, p, button, script
81183
const sanitizer1 = new Sanitizer({
82184
elements: ["div", "p", "button", "script"],
83185
});
84-
target.setHTML(unsanitizedString, { sanitizer: sanitizer1 });
186+
target.setHTMLUnsafe(untrustedString, { sanitizer: sanitizer1 });
85187

86188
// Define custom SanitizerConfig within setHTMLUnsafe()
87189
// Removes the <script> element but allows other potentially unsafe entities.
88-
target.setHTMLUnsafe(unsanitizedString, {
190+
target.setHTMLUnsafe(untrustedString, {
89191
sanitizer: { removeElements: ["script"] },
90192
});
91193
```
92194

93195
### `setHTMLUnsafe()` live example
94196

95197
This example provides a "live" demonstration of the method when called with different sanitizers.
96-
The code defines buttons that you can click to inject a string of HTML that is not sanitized, and that uses and a custom sanitizer, respectively.
198+
The code defines buttons that you can click to inject a string of HTML.
199+
One button injects the HTML without sanitizing it at all, and the second uses a custom sanitizer that allows `<script>` elements but not other unsafe items.
97200
The original string and injected HTML are logged so you can inspect the results in each case.
98201

202+
> [!NOTE]
203+
> Because we want to show how the sanitizer argument is used, the following code injects a string rather than a trusted type.
204+
> You should not do this in production code.
205+
99206
#### HTML
100207

101208
The HTML defines two {{htmlelement("button")}} elements for calling the method with different sanitizers, another button to reset the example, and a {{htmlelement("div")}} element to inject the string into.
