Prototype support for async native functions #4237
Conversation
Force-pushed 38f7f58 to 9f8760e
```rust
// FIXME: figure out how to work with an async Context / Waker from the application (e.g. from Tokio)
let waker = std::task::Waker::noop();
let mut context = std::task::Context::from_waker(waker);
let result = f.poll(&mut context);
match result {
    std::task::Poll::Pending => {
        AsyncCallResult::Pending
    }
    std::task::Poll::Ready(result) => {
        *state = AsyncCallState::None;
        AsyncCallResult::Ready(result)
    }
}
```
Sorry for the delay! I've been repeatedly looking at this PR again and I wondered, couldn't we store a map of the "active" wakers in the `Context` itself, then try to poll all of them whenever the `Context` is trying to resume execution?
No worries - it does seem fiddly to figure out.

I agree that this current polling approach isn't good enough as-is. I think some notable issues are that:

- It doesn't feel great to be materializing a special waker + context when these are things that you would normally expect your async runtime (i.e. Tokio) to own
- There's probably a risk of futures polled this way not integrating with your async runtime properly. Tokio's sleep did seem to work for me when testing, but futures that depend on mio polling might not get handled properly - I'm not 100% sure.
- This approach effectively leads to a non-blocking busy loop on the CPU that will keep polling for async native functions to complete
In terms of:

> Couldn't we store a map of the "active" wakers in the Context itself, then try to poll all of them whenever the Context is trying to resume execution?

I think my instinct is that, ideally, this wouldn't involve any custom wakers at all.

The thing we have to poll currently is a boxed future which is saved in `NativeAsyncFunction::state`, based on an assumption that any particular function can only have one active call at a time, but maybe it would make sense to store this operation state in the `Context` instead. It's conceptually like an internal register state for the operation.

If we could see the active/pending future from an `async` context, outside of the sync parts of the VM implementation (such as in `run_async_with_budget`), then we could do an `.await` there and avoid the need for any ad-hoc waker/context for manually polling.
That sounds like it could work, and hopefully address all the issues above.
... assuming it's possible to improve how the async native function futures are polled, I'd be curious to know what you think about the general idea of Operations having a Pending status that allows Operations to take multiple VM cycles to execute?

I think I was concerned about the possibility that other Operations may use `call` as an implementation detail. I'm not entirely sure how practical it would be to audit the Operations that use `call` to ensure that they can all also support returning a Pending status and be resumed after the async function finishes. Or otherwise, I'm not sure if it's feasible to assume that those operations will never call a native async function.

Without something like stable coroutines in Rust, it seems like it might be a pain to have to manually support more-complex resumable operations that involve `call`s internally. Maybe if there are only a small handful of operations that could internally call an async native function then it would be manageable to create a `match`-based state machine manually to allow them to be resumable.
I've taken a pass at reworking this to avoid the need for manually polling futures. The boxed futures are now stashed in the `Context`, and `.await` is used from `Context::run_async_with_budget` to wait for any native async function to complete.

This should avoid async runtime compatibility concerns (from polling with a `task::Context` and `Waker` that wasn't managed by the current runtime) and should also avoid polling for native function completion in a busy loop.
Ref: b97ab83
This experiments with enabling support for async NativeFunctions that are only async from the pov of the host, and appear as synchronous from within JavaScript.

Instead of running the async functions as a Promise via enqueue_job, this works by allowing Operations to be executed over multiple VM cycles, so an Operation may start some async work in one step and then further steps can poll for completion of that work and finish the Operation.

In particular this works by allowing Call Operations to return an `OpStatus::Pending` value that indicates that the same Call operation needs to be executed repeatedly, until it returns an `OpStatus::Finished` status. In the case of a `Pending` status, the program counter is reset and anything that was taken off the stack is pushed back so the same Operation can be re-executed.

There is a new `NativeFunction::from_async_as_sync_with_captures()` that lets the host provide a (sync) closure that itself returns / spawns a boxed Future. This is tracked internally as an `Inner::AsyncFn`.

Whenever the function is `__call__`ed then (assuming the operation isn't already in a pending / running state) a new Future is spawned via the application's closure and the Operation enters a "pending" state. When a NativeFunction is pending then each `__call__` will `poll()` the spawned `Future` to see if the `async` function has a result.

This effectively stalls the VM at the same Opcode while still accounting for any cycle budget and periodically yielding to the application's async runtime while waiting for an async Call Operation to finish.

Limitations / Issues
====================

== Busy Loop Polling ==

Even though the implementation does yield back to the application's async runtime when waiting for a NativeFunction to complete, the implementation isn't ideal because it uses a noop task Context + Waker when polling NativeFunction Futures. This effectively relies on the VM polling the future in a busy loop, wasting CPU time.

A better solution could be to implement a shim Waker that would flag some state on the Boa engine Context, and then adapt the Future that's used to yield the VM to the executor so that it only becomes Ready once the async NativeFunction has signalled the waker. I.e. the Waker would act like a bridge/proxy between a spawned async NativeFunction and the Future/Task associated with the VM's async `run_async_with_budget`. This way I think the VM could remain async runtime agnostic but would be able to actually sleep while waiting for async functions instead of entering a busy yield loop.

== Requires PC rewind and reverting stack state ==

Ideally operations that may complete over multiple steps would maintain a state machine via private registers, whereby it would not be necessary to repeatedly rewind the program counter and re-push values to the stack so that the operation can be decoded and executed repeatedly from the beginning.

== Only adapts Call Operation ==

Currently only the Call Operation handles async NativeFunctions but there are other Call[XYZ] Operations that could be adapted too.

== Not compatible with composite Operations that `call()` ==

The ability to track pending async functions is implemented in terms of repeatedly executing an Opcode in the VM until it signals that it's not Pending. This currently relies on being able to reset and re-execute the Operation (such as reverting program counter and stack changes). There are lots of Operations that make use of JsObject::call() internally and they would currently trigger a panic if they called an async NativeFunction because they would not be able to "resolve()" the "Pending" status that would be returned by the `call()`. Ideally all Operations that use `__call__` or `__construct__` should be fully resumable in the same way that the Call Operation is now.

This would presumably be easier to achieve with Rust Coroutines if they were stable because it would otherwise be necessary to adapt composite Operations into a state machine, similar to what the compiler does for an async Future, so they can yield for async function calls and be resumed by the VM.
This avoids manually polling boxed futures within the `__call__` implementation for NativeAsyncFunctions. Instead, the boxed future is stashed under `Context::async_call` so that we can `.await` its completion within `Context::run_async_with_budget`.

This approach should integrate much better with async runtimes like Tokio, since it doesn't involve manually polling with a `task::Context` + `Waker` that aren't managed by the current async runtime. This also means the thread can block waiting for async IO without polling for native function completion in a busy loop.

This still needs further iteration, but hopefully serves as a usable draft / proof of concept.
Force-pushed 9f8760e to b97ab83
Based on some discussion in #3442 (comment); this experiments with enabling support for async NativeFunctions that are only async from the pov of the host, and appear as synchronous from within JavaScript.
Instead of running the async functions as a Promise via `enqueue_job`, this works by allowing Operations to be executed over multiple VM cycles, so an Operation may start some async work in one step and then further steps can poll for completion of that work and finish the Operation.

In particular this works by allowing Call Operations to return an `OpStatus::Pending` value that indicates that the same Call operation needs to be executed repeatedly, until it returns an `OpStatus::Finished` status.

In the case of a `Pending` status, the program counter is reset and anything that was taken off the stack is pushed back so the same Operation can be re-executed.

There is a new `NativeFunction::from_async_as_sync_with_captures()` that lets the host provide a (sync) closure that itself returns / spawns a boxed Future. This is tracked internally as an `Inner::AsyncFn`.

Whenever the function is `__call__`ed then (assuming the operation isn't already in a pending / running state) a new Future is spawned via the application's closure and the Operation enters a "pending" state.

When a NativeFunction is pending then each `__call__` will `poll()` the spawned `Future` to see if the `async` function has a result.

This effectively stalls the VM at the same Opcode while still accounting for any cycle budget and periodically yielding to the application's async runtime while waiting for an async Call Operation to finish.
Limitations / Issues
Requires PC rewind and reverting stack state
Ideally operations that may complete over multiple steps would maintain a state machine via private registers, whereby it would not be necessary to repeatedly rewind the program counter and re-push values to the stack so that the operation can be decoded and executed repeatedly from the beginning.
Only adapts Call Operation
Currently only the Call Operation handles async NativeFunctions but there are other `Call[XYZ]` Operations that could be adapted too.

Not compatible with composite Operations that make function calls internally
The ability to track pending async functions is implemented in terms of repeatedly executing an Opcode in the VM until it signals that it's not Pending.
This currently relies on being able to reset and re-execute the Operation (such as reverting program counter and stack changes).
There are various Operations that make use of `JsObject::call()` internally, and they would currently trigger a panic if they called an async `NativeFunction` because they would not be able to resolve the `AsyncPending` status that would be returned by the `call()`.

Ideally all Operations that use `__call__` or `__construct__` should support `CallValue::AsyncPending` and be fully resumable in the same way that the `Call` Operation is now.

This would presumably be easier to achieve with Rust Coroutines if they were stable because it would otherwise be necessary to adapt composite Operations into a state machine, similar to what the compiler does for an async Future, so they can yield for async function calls and be resumed by the VM.
Addresses: #3442