Stage: To be presented for advancement to Stage 1 at a future plenary.
Champions:
- Mark S. Miller (MM), Agoric, @erights
- Kris Kowal (KKL), Agoric, @kriskowal
- Caridy Patiño (CPO), Salesforce
Emeritus:
- Bradley Farias, GoDaddy
- Jean-Francois Paradis, Salesforce
Building upon Source phase imports
and ESM phase imports
which introduce ModuleSource and import source for integration with Wasm
and sending code to other workers, this proposal extends the [ModuleSource]
constructor to accept a hook for obtaining a module source for each of the
module's static and dynamic imports. This allows user code to choose whether
an import is satisfied by the host module system, replaced by a mock, or denied
outright.
This proposal picks up the remaining pieces of the previous proposal for First-class
ModuleandModule Sourcefrom the HardenedJSCompartmentproposal.
We extend the ModuleSource constructor to accept an optional handler.
interface ModuleSource {
constructor(source: string | ModuleSource, handler?: ModuleHandler);
}The ModuleSource constructor eagerly captures the handler and the
importHook and importMetaHook properties of the handler in internal slots.
The implementation will thereafter pass the handler as the receiver
object to any invocation of the importHook and importMetaHook, so the hooks
may consult other properties of the handler.
Aside, the handler is necessary for capturing a base specifier for resolving a relative import specifier, and allows 262 to avoid unnecssary specificity about resolution algorithms.
Because the identity of a ModuleSource is a key in a realm's module map for
purposes of denoting a corresponding module instance, we introduce the ability
to construct a ModuleSource from the precompiled text and host data of
another module source, but producing a distinct identity for purposes of
multiple instantiation.
type ModuleHandler = {
importHook?: ImportHook,
importMetaHook?: ImportMetaHook,
// ...
[name: string | symbol | number]: unknown,
};The importHook is a function that returns a promise for another
ModuleSource for an import specifier.
When the implementation instantiates a ModuleSource in a particular execution
context, the corresponding module instance must obtain a module source for
every static import statement and its transitive static imports.
After the module has executed, the implementation consults the importHook for
each dynamic import called from the emitted module instance.
The import hook receives an import attributes object regardless of whether
the import statement or expression expressed any import attributes.
The import attributes must be presented with keys in shortlex order,
and consequently, for the case that dynamic import provides a JavaScript
object for the import attributes bag, the implementation creates a
normalized copy for the second positional argument to the importHook.
type ImportHook = (
this: ModuleHandler,
specifier: ImportSpecifier,
attributes: Record<string, string>,
) => Promise<ModuleSource>;An import specifier is a string, lifted without alteration from the text of a module source.
type ImportSpecifier = string;For a module source that expresses an import.meta expression anywhere in its
source that is also constructed with an importMetaHook, the implementation
must invoke the importMetaHook immediately before executing the body of the module
so that the hook has an opportunity to populate it with additional properties.
type ImportMetaHook = (
this: ModuleHandler,
importMeta: ImportMeta,
) => void;
type ImportMeta = {
__proto__: null,
[name: string | symbol | number]: unknown,
};The Module Harmony group elected to avoid deferring the import meta hook to the first evaluation of an
import.metaexpression, but leaves the option open for a host hook, just not a user code hook.
We rely on the existing import mechanism: a dynamic import on a
ModuleSource in a particular execution context will get or create a module
instance for that source and drive that through execution and return a promise
for the corresponding module namespace.
In that process, the implementation will call importHook for each module
imported statically and dynamically in that module, giving it the import
specifier string and import attributes.
For example, importing a module that statically imports a module x,
import 'x';If the module map has no entry indicating that this module source has never
before imported "x", will induce the implementation to call the module
source's import hook, albeit the user code import hook, with the module
specifier "x" and an empty import attributes object, expecting it to produce
a promise for the ModuleSource of "x".
const source = await importHook('x', {});Thereafter, the implementation will never again consult the importHook
for the modules specifier "x" with no import attributes.
For another example, importing a module that imports a module "y" with
import attributes type: 'json' and extraneous: "very" will induce
the importHook for the identified ModuleSource once.
import 'y' with { type: 'json', extraneous: 'very' };The syntax of dynamic import does not itself reify a JavaScript object
for the import attributes.
But, for the purposes of calling an importHook, the implementation
must produce a canonicalized object with the keys of the import
attributes in shortlex order.
const source = await importHook('y', { extraneous: 'very', type: 'json' });The implementation also consults the importHook for:
- dynamic
importwith or without attributes, - static
import source, with or without attributes, and - dynamic
import.source, with or without attributes.
Dynamic import receives a reified attributes bag.
To provide consistent behavior, the implementation produces
a canonicalized copy of the given attributes bag for the importHook.
await import('z', { type: 'json', extraneous: 'very' });So the implementation then calls the importHook with the canonicalized import
attributes.
await importHook('z', { extraneous: 'very', type: 'json' });With an importHook, a test can isolate a specific module and mock one or more
of its dependencies.
In this example, we import a test fixture module and execute it in an
environment where its fs dependency is met by mock-fs and all other
imports fall through to the host.
import source fixture from 'fixture.js';
await import(new ModuleSource(fixture, {
importHook(specifier, attributes) {
if (specifier === 'fs') {
return import.source('mock-fs');
} else {
return import.source(specifier, attributes);
}
}
}));With a slight complication, the fixture can be imported in an environment where
its transitive dependencies experience the same distortion of fs to
mock-fs.
import source fixture from 'fixture.js';
const handler = {
importHook(specifier, attributes) {
if (specifier === 'fs') {
return import.source('mock-fs');
} else {
const source = await import.source(specifier, attributes);
return new ModuleSource(source, handler);
}
},
}
await import(new ModuleSource(fixture, handler));This example is admittedly reductive for illustration purposes.
The importHook is also responsible for resolving module-relative and
package-relative URLs.
In this example, we create a module source that transitively traps all imports
and loads source for them with fetch and treats all import specifiers as
URLs, resolving relative URLs to the location of the containing module source,
and provides import.meta.url.
const handlerPrototype = {
async importHook(importSpecifier) {
const url = new URL(importSpecifier, this.url).href;
const response = await fetch(url);
const text = await response.text();
return new ModuleSource(text, {
__proto__: handlerPrototype,
url,
});
},
importMetaHook(meta) {
meta.url = this.url;
},
};
const source = new ModuleSource(`
import 'example.js';
`, { __proto__, url: import.meta.url });
await import(source);With a slight complication, the hook can accommodate the type json import
attribute, or arbitrary types and arbitrary other import attributes.
const handlerPrototype = {
async importHook(importSpecifier, { type }) {
const url = new URL(importSpecifier, this.url).href;
const response = await fetch(url);
if (type === 'json') {
const json = await response.text();
// Virtual module sources (to be proposed) would obviate the need to
// embed the JSON in a JS module source.
return new ModuleSource(`export default ${json}`);
} else if (type === undefined) {
const text = await response.text();
return new ModuleSource(text, {
__proto__: handlerPrototype,
url,
});
}
},
importMetaHook(meta) {
meta.url = this.url;
},
};
const source = new ModuleSource(`
import 'example.js';
`, { __proto__, url: import.meta.url });
await import(source);This import hooks proposal creates a foundation for a new Global
proposal, which in turn provides a sufficient foundation
to implement Compartment proposal and
HardenedJS entirely in user code, without any
compromise in fidelity.
Within HardenedJS, we create a way to evaluate arbitrary guest code and control
its access to globals and modules, denying the guest access to any host modules
and powerful globals by default.
Consider the following potential guest code:
const indirectEval = eval;
indirectEval('import("node:fs")');Then, consider a host that naively attempts to isolate this code with just import hooks:
const source = new ModuleSource(`
const indirectEval = eval;
indirectEval('import("node:fs")');
`, {
importHook(specifier) {
// never reached
}
});We never reach the module source's importHook because indirect eval
uses the global execution environment associated with the eval function
by an internal slot of the function object itself.
So, we need a ModuleSource constructor that is associated with a
Global with an importHook.
And, consequently, we need each Global to have its own eval, Function,
AsyncFunction, ModuleSource, and Global constructors so code evaluated in
that global environment cannot escape.
Concretely, we propose that this scenario would capture all global and module access for guest code.
const log = [];
const globalThat = new Global({
importHook(specifier) {
log.push(`global ${specifier}`);
return new ModuleSource('');
},
});
globalThat.log = log;
const source = new globalThat.ModuleSource(`
import 'static-import';
import('dynamic-import');
eval('import("direct-eval-import")');
globalThis.eval('import("indirect-eval-import")');
new Function('return import("function-import")');
const AsyncFunction = (async () => {}).constructor;
new AsyncFunction('return import("async-function-import")');
const GeneratorFunction = (function*() {}).constructor;
new GeneratorFunction('return import("generator-function-import");')().next();
const AsyncGeneratorFunction = (async function*() {}).constructor;
new AsyncGeneratorFunction('return import("async-generator-function-import");')().next();
`, {
importHook(specifier) {
log.push(`local import ${specifier}`);
},
});
await import(source);
// log:
// local static-import
// local dynamic-import
// local direct-eval-import
// global indirect-eval-import
// global function-import
// global async-function-import
// global generator-function-import
// global async-generator-function-importThis proposal has no intersection semantics with Content Security Policy but rests on the earlier introduction of module sources and their execution model, which do.
The execution of a module source that was constructed by the host may
capture sufficient origin information that it is safe to execute without unsafe-eval
in a Content Security Policy.
A module source that was parsed from source text, where the text is not a
Trusted Type, will not have any attached origin information and would be unable
to execute without an unsafe-eval Content Security Policy.
We expect a future proposal to introduce import.now or import.sync as a
mechanism for synchronously growing the module graph and executing a module.
If so, ModuleSource handlers would need a corresponding synchronous import
hook that must not return a promise.
When a source module imports from a module specifier, you might not have the
source at hand to create the corresponding Module to be returned. If
importHook is synchronous, then you must have the source ready when the
importHook is invoked for each dependency.
Since the importHook is only triggered via the kicker (import(instance)),
going async there has no implications whatsoever.
In prior iterations of this, the user was responsible for looping thru the
dependencies, and preparing the instance before kicking the next phase.
That's no longer the case here, where the level of control on the different
phases is limited to the invocation of the importHook.
Yes, importHook can return a Module that was either import()ed already or
was returned by an importHook already.
The import.source behavior introduced in ESM source phase
imports lets an importHook defer to its parent module
loader to link an existing module or forward-reference to a module.
But, as yet, only ESM can link cycles.
Any import() statement inside a module source will result of a possible
importHook invocation on the Module, and the decision on whether or not to
call the importHook depends on whether or not the Module has already
invoked it for the specifier in question. So, a Module
most keep a map for every specifier and its corresponding Module to
guarantee the idempotency of those static and dynamic import statements.
User-defined and host-defined import-hooks will likely enforce stronger consistency between import behavior across module instances, but module instances enforce local consistency and some consistency in aggregate by induction of other modules.
Whether ModduleSource instances retain the original source may vary by host
and modules should reuse
HostHasSourceTextAvailable
in deciding whether to reveal their source, as they might with a toString
method.
The [module block][module-blocks] proposal may necessitate the retention of
text on certain hosts so that hosts can transmit use sources in their serial
representation of a module, as could be an extension to structuredClone.
The host-defined or virtual-host-defined import hook for a module is responsible for taking "import specifiers", the string expressed in both static and dynamic import statements, and resolving them relative to the logical or physical specifier for the module. Although it is possible to construct a host or virtual-host that only supports absolute or fully-qualified specifiers, nearly every module will have a referrer of one kind or another.
A module will typically have three distinct but often coincident data:
- This
referrerconcept, the basis for resolving import specifiers. - The
import.meta.url, the basis for locating resources adjacent to modules hosted on the web using URL math. Theimport.metaobject should not have aurlproperty unless it is useful for that purpose, and as such should not exist for modules located in bundles or archives, and should not be used in modules that will be bundled or archived, but will often be identical to thereferrerwhen it is present. - The origin of the source, the basis for a host deciding whether to allow the
module to be evaluated when a Content-Security-Policy is in place.
The origin will often be identical to the origin of
import.meta.url. The origin will exist in host data of the module source and must not be virtualizable, lest a virtual host be able to escape a same origin policy.
So, although these values are often related, they must be independent features.
Virtual host implementations cannot necessarily rely on the import.meta.url
to exist and virtual host implementations must be able to virtualize the
referrer.
The authors of this proposal vacilated between versions that made the referrer
explicit or implicit.
All of the authors shared a goal to make the Module and ModuleSource
features incrementally explainable.
Since it is possible to construct a trivial module system without the concept
of a referrer, a graduated tutorial introducing this feature (like this
explainer!) should be able to defer or formally ignore referral until it
becomes necessary to explain relative module specifier resolution.
To that end, we agreed that the referrer should be optional and appear later
in function signatures than other more essential parameters, if at all.
We relegated the referrer to at least be a property of the Module
constructor's options bag.
We then realized that the options bag was more analogous to the Proxy handler
object, such that the virtual host could choose to extend the handler with any
property of any type to carry the referrer, such that an import hook might
resolve a relative specifier using this.referrer (a string with the
module's own specifier), this.slot (a number representing the index into an
array of pre-computed resolutions for a bundle), or whatever other protocol the
virtual host chose to implement.
This was only possible because host-defined resolution behavior and virtual-host-defined resolution behavior do not need to communicate. Host-defined modules resolve import specifiers based on information in the referring module record's host data internal slot. At no point does a host-defined module need to consult the referrer of a virtual-host-defined module.
We additionally elected not to repeat what we believe to be a mistake in the
design of Proxy.
The Module constructor eagerly captures all the optional methods of the
module handler rather than accessing them afresh for each invocation.
So, we sacrifice some dynamism but still thread the handler as the target
object for all method invocations.
A consequence of this design is that any Module instance will retain a
handler, where an options bag would be immediately released to the GC
nursery.
And, if the handler carries a referrer, every Module instance needs a
unique handler instance, even though many module instances can share
the same hooks.
Earlier proposals introduced a separate Module or ModuleInstance constructor.
This would have made clear that the identity of a ModuleSource is meaningless
so that, if you wanted multiple instances of a module namespace from the same
module source, you could do so without reparsing or recompiling the source.
const source = new ModuleSource('export default {}');
const a = new Module(source);
const b = new Module(source);
(await import(a)) !== (await import(b));However, ESM Phase Imports made the identity of a
ModuleSource significant, without binding the source to a particular module
map (albeit a Realm or Realm of another Worker).
So, module sources are transferrable and can be imported in multiple realms,
but only singly-instaniated within one realm.
In practice, there is little to no motivation for multiple instances for a
single parsed source in a single realm.
So, we lose very little by consolidating Module and ModuleSource
in ModuleSource.
The handler becomes a non-transferrable component of the source that
overrides parent module loader behavior if it is imported into any realm of the
same Agent.
And, in turn, we have one less concept to litigate and one less name to choose for a fresh global constructor.
The ESM Phase Imports proposal settles the
issue of how to trigger import behavior.
But, for sake of completeness, an alternate design we discarded would
have put an import method on a Module instance.
This made sense when a Module execution was bound to the realm that
provided the Module intrinsic.
However, we elected to make ModuleSource identity significant
for the purposes of pass-invariant identity of import for a module source
through postMessage to another Worker in the HTML semantics, or
possibly to another realm of the same agent.
This means that a ModuleSource does not intrinsically identfy the module map
it will be used to populate.
The execution context where you use a dynamic import indicates the module map
the module source will populate.
We regret the dynamism of Proxy handlers, but also definitely need a
"handler" and not a mere, disposalbe options bag, because we need to use the
handler as a receiver for all subsequent calls to hooks.
We chose a way that has no precedent in the language.
Eagerly capturing all the hooks and the handler in ModuleSource
internal slots slots makes the hooks invariant over the lifetime
of the ModuleSource but also allows us to carry some state, like a referrer
module specifier, as state of the handler accessible like this.url.
The handler also allows 262 to evolve without concerning itself with
implementation- or host-specific module resolution algorithms.