Add content scripts section in specification #542

Open
wants to merge 7 commits into
base: main
from
97 changes: 93 additions & 4 deletions specification/index.bs
@@ -7,6 +7,7 @@ Group: WECG
URL: https://w3c.github.io/webextensions/specification/index.html
Editor: Mukul Purohit, Microsoft Corporation https://www.microsoft.com, [email protected]
Editor: Tomislav Jovanovic, Mozilla https://www.mozilla.org/, [email protected]
Editor: Oliver Dunk, Google https://www.google.com, [email protected]
Abstract: [Placeholder] Abstract.
Markup Shorthands: markdown yes
</pre>
@@ -27,11 +28,11 @@ An optional directory containing strings as defined in <a href="#localization">l

## Other files

An extension may also contain other files, such as those referenced in the <a href="#key-content_scripts">content_scripts</a> and <a href="#key-background">background</a> part of the <a href="#manifest">Manifest</a>.
An extension may also contain other files, such as those referenced in the [[#key-content_scripts]] and [[#key-background]] parts of the [=manifest=].

# Manifest

A WebExtension must have a manifest file at its root directory.
A WebExtension must have a <dfn>manifest</dfn> file at its root directory.

## Manifest file

@@ -112,7 +113,7 @@ This key may be present.

### Key `content_scripts`

This key may be present.
The <a href="#key-content_scripts">`content_scripts`</a> key is a [=list=] of items representing [=content scripts=] that should be registered.

### Key `content_security_policy`

@@ -154,6 +155,8 @@ Filenames beginning with an underscore (`_`) are reserved for use by user agent.

# Isolated worlds

<dfn>Worlds</dfn> are isolated JavaScript contexts with access to the same underlying DOM tree but their own set of wrappers around those DOM objects.

# Unavailable APIs

# The `browser` global
@@ -172,6 +175,12 @@ Issue(62): Specify localization handling.

# Match patterns

A <dfn>match pattern</dfn> is a pattern used to match URLs. Match patterns are case-insensitive.

# Globs

A <dfn>glob</dfn> can be any [=string=]. It can contain any number of wildcards: `*` matches zero or more characters and `?` matches exactly one character.
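
The wildcard rules above can be illustrated by translating a glob into a regular expression. This is only a minimal sketch: `globToRegExp` and `matchesGlob` are hypothetical helper names that do not appear in the specification, and the sketch assumes that a glob must match the whole input string and that "one character" means one UTF-16 code unit.

```typescript
// Hypothetical helper (not defined by the specification): turn a glob into a RegExp.
// Assumption: the glob has to match the whole input string.
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      source += "[\\s\\S]*"; // `*`: zero or more of any character
    } else if (ch === "?") {
      source += "[\\s\\S]"; // `?`: exactly one character (one UTF-16 code unit here)
    } else {
      // Every other character is matched literally, so escape regex metacharacters.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp("^" + source + "$");
}

// Hypothetical helper: test an input string against a glob.
function matchesGlob(glob: string, input: string): boolean {
  return globToRegExp(glob).test(input);
}
```

Under these assumptions, `matchesGlob("a?c", "abc")` holds while `matchesGlob("a?c", "ac")` does not, because `?` cannot match an empty string.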

# Concepts

## Uniqueness of extension IDs
@@ -190,7 +199,57 @@ Issue(62): Specify localization handling.

## Content scripts

### Isolated worlds
<dfn>Content scripts</dfn> represent a set of JS and CSS files that should be injected into matching pages loaded by the user agent. They are injected using the steps in [[#inject-a-content-script]].
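
For orientation, one `content_scripts` entry might look like the sketch below, written as a typed TypeScript object rather than raw JSON. The type names (`ContentScriptEntry`, `RunAt`, `World`), the file names, and the `example.com` patterns are illustrative assumptions. Only the key names, defaults, and allowed values are taken from the key descriptions in this section.

```typescript
// Hypothetical type names; the keys mirror the ones described in this section.
type RunAt = "document_start" | "document_end" | "document_idle";
type World = "MAIN" | "ISOLATED";

interface ContentScriptEntry {
  matches: string[]; // required
  exclude_matches?: string[];
  js?: string[];
  css?: string[];
  all_frames?: boolean; // defaults to false
  match_about_blank?: boolean; // defaults to false
  match_origin_as_fallback?: boolean; // defaults to false
  run_at?: RunAt;
  include_globs?: string[];
  exclude_globs?: string[];
  world?: World; // defaults to "ISOLATED"
}

// Illustrative manifest fragment; file names and patterns are made up.
const exampleManifestFragment: { content_scripts: ContentScriptEntry[] } = {
  content_scripts: [
    {
      matches: ["https://example.com/*"],
      exclude_matches: ["https://example.com/private/*"],
      js: ["content.js"],
      css: ["content.css"],
      all_frames: true,
      run_at: "document_idle",
      world: "ISOLATED",
    },
  ],
};
```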

### Key `matches`

A [=list=] of [=match patterns=] that are used to decide which pages the user agent injects the content script into. This key is required.

### Key `exclude_matches`

A [=list=] of [=match patterns=] that should be used to exclude URLs from where the content script runs.

### Key `js`

A [=list=] of file paths, relative to the extension's package, that should be injected as scripts.

### Key `css`

A [=list=] of file paths, relative to the extension's package, that should be injected as stylesheets.

### Key `all_frames`

If `all_frames` is `true`, the content script must be injected into any subframes that match the other matching criteria for the content script. If `false`, content scripts will only be injected into top-level documents. Defaults to `false`.

### Key `match_about_blank`

If this is true, use the URL of the parent frame when matching a child frame whose document URL has the `about` [=scheme=]. See also [[#determine-the-url-for-content-script-matching]]. Defaults to `false`.
Member commented:

Chrome currently matches all documents with an about scheme as described in the current copy (source link care of @oliverdunk), but Firefox appears to explicitly check against about:blank and about:srcdoc (source). @xeenon, how does Safari handle this key?

Any changes here should be reflected in the "Determine the URL for content script matching" algorithm below.


Note: In Firefox, setting `match_about_blank` to `true` also allows injection into top-level `about:blank` pages.
Member commented:

We should add a non-normative label to this block.


### Key `match_origin_as_fallback`

If this is true, use fallbacks as described in [[#determine-the-url-for-content-script-matching]].

No path is available when the URL to match against falls back to an origin. Therefore, when set, the user agent must not allow [[#key-matches]] to contain entries with a path other than `/*`.

Defaults to `false`.

### Key `run_at`

Specifies when the content script should be injected. Valid values are `document_start`, `document_end` and `document_idle`.
Member commented:

Consider adding a WebIDL definition for these values.


### Key `include_globs`

A list of [=globs=] that a page should match. A page matches if the URL matches both the [[#key-matches]] field and the [[#key-include_globs]] field.
Member commented:

We should probably revise "a page should match" to a more generic concept like "potential injection context." Words are hard. @oliverdunk suggested maybe "document"?


### Key `exclude_globs`

A list of [=globs=] that should be used to exclude URLs from where the content script runs.

### Key `world`

The [=world=] any JavaScript scripts should be injected into. Defaults to `ISOLATED`. Valid values are `MAIN` and `ISOLATED`.
Member commented:

Consider adding a WebIDL definition for these values.


## Extension pages

@@ -203,3 +262,33 @@ Issue(62): Specify localization handling.
## Current behavior of cookie partitioning

# Version number handling

# Algorithms

## Determine the URL for content script matching

To determine the URL to use for a document when injecting a content script:

1. Let |url| be the document's URL.
1. If the document is within a child frame:
1. If the [=scheme=] of the document's URL is `about`, and `match_about_blank` or `match_origin_as_fallback` is set to true:
1. Set |url| to a URL based on the origin of the parent frame.
Member commented:

The phrase "URL based on the origin of the parent frame" feels a little … awkward? Definitely not required, but maybe we should have a definition or abstract algorithm that describes this and link out to it from here.

1. If the [=scheme=] of the document's URL is `data` and `match_origin_as_fallback` is set to true:
1. Set |url| to be a URL based on the origin of the parent frame.
1. If the [=scheme=] of the document's URL is `filesystem` or `blob` and `match_origin_as_fallback` is set to true:
1. Set |url| to be a URL based on the origin of the frame which created the URL.
1. Return |url|.
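
The steps above could be sketched roughly as follows. This is not a definitive implementation, and several things are assumptions. The list nesting is not visible in this view, so the sketch assumes the `about` and `data` steps apply only to child frames (they refer to a parent frame) and the `filesystem`/`blob` step applies to any frame. "A URL based on the origin" is modelled as the origin followed by `/`. All type and function names (`DocumentInfo`, `FallbackOptions`, `determineUrlForMatching`, and so on) are hypothetical.

```typescript
// Hypothetical description of a document; none of these names come from the specification.
interface DocumentInfo {
  url: string;
  isChildFrame: boolean;
  parentOrigin?: string; // origin of the parent frame, if any
  creatorOrigin?: string; // origin of the frame that created a blob:/filesystem: URL
}

interface FallbackOptions {
  match_about_blank?: boolean;
  match_origin_as_fallback?: boolean;
}

// Extract the lower-cased scheme without relying on a URL parser.
function schemeOf(url: string): string {
  const colon = url.indexOf(":");
  return colon === -1 ? "" : url.slice(0, colon).toLowerCase();
}

// Assumption: "a URL based on the origin" is the origin plus a root path.
function urlFromOrigin(origin: string): string {
  return origin + "/";
}

function determineUrlForMatching(doc: DocumentInfo, opts: FallbackOptions): string {
  let url = doc.url;
  const scheme = schemeOf(doc.url);
  const aboutBlank = opts.match_about_blank === true;
  const originFallback = opts.match_origin_as_fallback === true;

  // Assumed nesting: the `about` and `data` cases only apply within a child frame.
  if (doc.isChildFrame && doc.parentOrigin !== undefined) {
    if (scheme === "about" && (aboutBlank || originFallback)) {
      url = urlFromOrigin(doc.parentOrigin);
    }
    if (scheme === "data" && originFallback) {
      url = urlFromOrigin(doc.parentOrigin);
    }
  }

  // Assumed to apply to any frame: fall back to the origin of the creating frame.
  if ((scheme === "filesystem" || scheme === "blob") && originFallback && doc.creatorOrigin !== undefined) {
    url = urlFromOrigin(doc.creatorOrigin);
  }

  return url;
}
```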

## Inject a content script

Issue: If the same extension specifies the same script twice, what should happen? ([bug](https://crbug.com/324096753))

To determine if a content script should be injected in a frame:

1. Let |url| be the result of running [[#determine-the-url-for-content-script-matching]].
1. If the extension does not have access to the origin, return.
1. If |url| is not matched by a match pattern in `matches`, return.
1. If `include_globs` is present and |url| is not matched by any pattern, return.
Member commented:

I think this might be a bit easier for the reader to parse.

Suggested change
- 1. If `include_globs` is present and |url| is not matched by any pattern, return.
+ 1. If `include_globs` is present and |url| is not matched by any glob pattern, return.

1. If |url| matches an entry in `exclude_matches` or `exclude_globs`, return.
Comment on lines +290 to +292
Member commented:

No change needed.

On my first pass through this I assumed that we could simplify the language here by combining steps 2 and 3 (similar to how step 4 is written), but @oliverdunk pointed out that `include_globs` behaves as I expected. He also pointed out that the user scripts behavior of `includeGlobs` intentionally diverges from content scripts. The proposal notes that globs are:

  // Implemented as disjunction: runs in documents whose URL matches
  // "matches" or "includeGlobs", and not "excludeMatches" nor "excludeGlobs".

1. If this is a child frame, and `all_frames` is not `true`, return.
1. Otherwise, inject the content script. This should be done based on the `run_at` setting.
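
The decision steps above can be sketched as a single predicate. This is a minimal illustration under stated assumptions, not the specified behavior. The `Matchers` callbacks are hypothetical stand-ins, since this section does not define how host access, match patterns, or globs are evaluated, and `url` is assumed to be the result of the URL-determination algorithm. Note that `matches` and `include_globs` are both required (a conjunction), as the steps describe.

```typescript
// Hypothetical shapes; only the key names come from the specification text.
interface ContentScript {
  matches: string[];
  exclude_matches?: string[];
  include_globs?: string[];
  exclude_globs?: string[];
  all_frames?: boolean;
}

// Hypothetical callbacks: how access, match patterns, and globs are evaluated
// is outside this sketch.
interface Matchers {
  hasAccess(url: string): boolean;
  matchesPattern(pattern: string, url: string): boolean;
  matchesGlob(glob: string, url: string): boolean;
}

function shouldInject(script: ContentScript, url: string, isChildFrame: boolean, m: Matchers): boolean {
  // The extension must have access to the origin.
  if (!m.hasAccess(url)) return false;

  // `matches` must contain at least one matching pattern.
  if (!script.matches.some((p) => m.matchesPattern(p, url))) return false;

  // If `include_globs` is present, at least one glob must also match.
  if (script.include_globs !== undefined && !script.include_globs.some((g) => m.matchesGlob(g, url))) {
    return false;
  }

  // Any hit in `exclude_matches` or `exclude_globs` rules the document out.
  if ((script.exclude_matches || []).some((p) => m.matchesPattern(p, url))) return false;
  if ((script.exclude_globs || []).some((g) => m.matchesGlob(g, url))) return false;

  // Child frames are only eligible when `all_frames` is true.
  if (isChildFrame && script.all_frames !== true) return false;

  // Injection itself would then be scheduled according to `run_at`.
  return true;
}
```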