Skip to content

Add .rollback() api for workflow step promises#6330

Open
vaishnav-mk wants to merge 11 commits intocloudflare:mainfrom
vaishnav-mk:vaish/feat/saga-rollback-api
Open

Add .rollback() api for workflow step promises#6330
vaishnav-mk wants to merge 11 commits intocloudflare:mainfrom
vaishnav-mk:vaish/feat/saga-rollback-api

Conversation

@vaishnav-mk
Copy link
Contributor

adds steppromise wrapper with .rollback() api for workflow saga rollbacks. step.do() and step.waitforevent() now return steppromise which lets users register a compensation function before the step rpc fires.

  • sleep/sleepuntil are unchanged.
  • includes type declarations and tests.

@vaishnav-mk vaishnav-mk requested review from a team as code owners March 16, 2026 10:01
@vaishnav-mk
Copy link
Contributor Author

/bigbonk review this pr?

@vaishnav-mk
Copy link
Contributor Author

/bonk review this pr

@vaishnav-mk
Copy link
Contributor Author

the overlord hates me

@jasnell
Copy link
Collaborator

jasnell commented Mar 18, 2026

/bigbonk review this pr

Copy link
Contributor

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of StepPromise rollback API.

This review was generated by an AI assistant and may contain inaccuracies.

Summary of findings:

  • StepPromise is not a real Promise — it's a thenable. The d.ts declares StepPromise<T> extends Promise<T> which is misleading and will cause instanceof Promise to return false.
  • The rollback function return type in rpc.d.ts says Promise<void> but the test returns a value from the rollback fn (and verifies it). One of these needs to match the other.
  • Constructor-time prototype patching runs on every instantiation, even though it only patches once. This works but adds overhead on every new call for the hasOwnProperty + typeof + symbol check.
  • The generated-snapshot/ types were not regenerated — CI will likely fail the diff check.
  • Copyright year on new files should be 2026 (current year), not 2024.

Comment on lines +328 to +336
export interface StepPromise<T> extends Promise<T> {
rollback(
fn: (ctx: RollbackContext<T>) => Promise<void>,
): Promise<T>;
rollback(
config: WorkflowStepConfig,
fn: (ctx: RollbackContext<T>) => Promise<void>,
): Promise<T>;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StepPromise<T> extends Promise<T> is inaccurate. The runtime StepPromise class is a thenable (it has .then(), .catch(), .finally()) but does NOT extend Promise — it's a plain class. This means stepPromise instanceof Promise will be false at runtime, which contradicts this declaration.

Two options:

  1. Make the runtime StepPromise a proper subclass of Promise (complex, but accurate).
  2. Change the d.ts to declare StepPromise<T> as a PromiseLike<T> with the extra .rollback() method — this more accurately describes the runtime behavior.

Option 2 is simpler and more honest:

Suggested change
export interface StepPromise<T> extends Promise<T> {
rollback(
fn: (ctx: RollbackContext<T>) => Promise<void>,
): Promise<T>;
rollback(
config: WorkflowStepConfig,
fn: (ctx: RollbackContext<T>) => Promise<void>,
): Promise<T>;
}
export interface StepPromise<T> extends PromiseLike<T> {
rollback(
fn: (ctx: RollbackContext<T>) => Promise<void>,
): StepPromise<T>;
rollback(
config: WorkflowStepConfig,
fn: (ctx: RollbackContext<T>) => Promise<void>,
): StepPromise<T>;
}

Note: I also changed the return type of .rollback() from Promise<T> to StepPromise<T> since the runtime returns this. Returning Promise<T> prevents TypeScript from seeing .rollback() on the return value, which is fine for the "only once" semantics but makes the chained .rollback().rollback() error test impossible to type-check.


export interface StepPromise<T> extends Promise<T> {
rollback(
fn: (ctx: RollbackContext<T>) => Promise<void>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rollback callback return type is Promise<void>, but the test RollbackCallableWorkflow returns a non-void value from its rollback fn:

.rollback(async ({ error, output, stepName }) => {
  return { receivedError: error, receivedOutput: output, receivedStepName: stepName };
});

And the mock verifies this returned value (mock.rollbackResult). Either the d.ts should allow a return value (Promise<unknown>) or the test is testing unsupported behavior.

Which is the intended contract? If the engine will use the return value, Promise<void> is wrong.

};
}

class WorkflowEntrypointWrapper extends BaseWorkflowEntrypoint {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prototype patching in the constructor runs the hasOwnProperty + typeof + symbol check on every new UserWorkflow() instantiation, even though it only patches once (the first time). This is fine for correctness (the kStepWrapped guard prevents double-patching), but it's worth noting that in a hot path with many instantiations, this is unnecessary repeated work.

An alternative would be to patch once per class (e.g., via a WeakSet of already-patched prototypes, or patching at class definition time instead of construction time). Not blocking, but worth considering.

export const ServiceStub = entrypoints.ServiceStub;
export const WorkflowEntrypoint = entrypoints.WorkflowEntrypoint;

type RollbackFn = ((...args: unknown[]) => Promise<void>) | null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RollbackFn type says it returns Promise<void>, matching the d.ts. But as noted in the rpc.d.ts comment, the test actually returns a value from the rollback fn. This type should be consistent with the actual contract.

If the engine can use the return value:

Suggested change
type RollbackFn = ((...args: unknown[]) => Promise<void>) | null;
type RollbackFn = ((...args: unknown[]) => Promise<unknown>) | null;

// to classify entrypoints (workflowClasses vs actorClasses vs statelessClasses). A Proxy breaks
// identity comparison: `Proxy !== target`. A JS subclass preserves the chain:
// UserWorkflow -> our JS class -> C++ WorkflowEntrypoint (matched by runtime)
const BaseWorkflowEntrypoint =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cast as unknown as new (...args: unknown[]) => Record<string, unknown> discards all type information from the C++ entrypoint. This is understandable given the boundary between the internal module and TypeScript, but worth a comment explaining that this is intentional and that the type is deliberately widened because we only need the constructor chain for runtime classification.

Also: if the C++ WorkflowEntrypoint constructor signature ever changes (e.g., adds required parameters), this cast will silently hide the mismatch. Consider adding a brief note about this risk.

@@ -0,0 +1,100 @@
// Copyright (c) 2024 Cloudflare, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright year should be 2026, not 2024.

modules = [
(name = "worker", esModule = embed "workflows-step-promise-test.js"),
],
compatibilityFlags = ["nodejs_compat", "experimental"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the test worker and the workflows worker use "experimental" compat flag. Is there an intent to gate the rollback feature behind a specific compat flag? If this is a new API surface, it probably should be gated rather than always available. If it's intentionally not gated (because the wrapper is purely additive and doesn't change existing behavior), a comment explaining this decision would be helpful.

#promise: Promise<unknown> | null = null;
#launched = false;

get [Symbol.toStringTag](): string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbol.toStringTag returns 'StepPromise' which means Object.prototype.toString.call(sp) gives [object StepPromise]. This is a nice touch, but it further highlights that this is NOT a real Promise — Object.prototype.toString.call(new Promise(() => {})) gives [object Promise]. Users who check the string tag to distinguish promises from other objects will get unexpected results. This is probably fine for the workflow context, but worth documenting in the class comment.


// Wraps the step RPC stub so that step.do() and step.waitForEvent() return StepPromise instances.
// sleep() and sleepUntil() are passed through unchanged (they return Promise<void>, no rollback).
function wrapStep(jsStep: StepRpcStub): unknown {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type is unknown. Since this object is passed to user code as step in run(event, step), the user sees untyped methods. The type safety is provided by the d.ts declarations in rpc.d.ts, so this is fine at runtime, but it means the wrapper layer has zero internal type safety — any method name typo in this object literal (e.g., sleepuntil instead of sleepUntil) would be silently accepted by TypeScript. Consider typing the return value as matching the StepRpcStub interface (or a similar shape) to catch mistakes.

proto &&
Object.prototype.hasOwnProperty.call(proto, 'run') &&
typeof proto.run === 'function' &&
!proto[kStepWrapped]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: proto[kStepWrapped] is set to true (a boolean), but the check is !proto[kStepWrapped] which relies on truthiness. This is fine, but since the symbol is on a prototype shared by all instances, it's worth noting that this is visible to user code via Object.getOwnPropertySymbols(). Users could theoretically observe or tamper with it. Using a WeakSet<object> instead of a symbol property would hide this implementation detail from user introspection.

@ask-bonk
Copy link
Contributor

ask-bonk bot commented Mar 18, 2026

Review posted successfully. Here's a summary of the key findings:

High priority:

  1. StepPromise is not a real Promise — the d.ts declares StepPromise<T> extends Promise<T> but the runtime class is a thenable (plain class with .then()/.catch()/.finally()). instanceof Promise will be false. Suggested changing to extends PromiseLike<T>.

  2. Rollback function return type mismatchrpc.d.ts declares rollback callbacks as returning Promise<void>, but the test RollbackCallableWorkflow returns a value and the mock verifies it. The types and tests need to agree on the contract.

  3. generated-snapshot/ not regenerated — CI validates that the checked-in snapshots match the generated output. Since rpc.d.ts was modified, the snapshots need to be regenerated (likely via just generate-types).

Medium priority:
4. Copyright year — new files use 2024 instead of 2026.
5. Compat flag gating — no discussion of whether this new API surface should be gated behind a compat flag.
6. wrapStep return type is unknown — the wrapper object has no internal type safety, so method name typos would be silently accepted.

Low priority:
7. Constructor-time prototype patching repeats checks on every instantiation.
8. kStepWrapped symbol is observable via Object.getOwnPropertySymbols().
9. Truthiness check if (rollbackFn) could be more precise as if (rollbackFn !== null).

github run

@vaishnav-mk
Copy link
Contributor Author

hey @anonrig, i made a couple of changes based on bonk's review that i missed. specifically, StepPromise now extends Promise with a Symbol.species override so instanceof Promise is true, finally() works correctly and catch()/finally() overrides are no longer needed

apart from that, i've gated behind workflows_step_rollback experimental compat flag

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants