Commit 9eab922

feat: Pi CLI/SDK conformance fixes and structured debug logging (#57)
- Fix Pi PTY Ctrl+C, width/rendering, node tool capabilities, and shutdown behavior
- Add comprehensive Pi SDK coverage: session lifecycle, subprocess semantics, tool events, filesystem, network policy, permissions, path safety, resource cleanup
- Add Pi cross-surface parity, error reporting, config discovery, repo workflow, session resume, and worktree mutation tests
- Add OpenCode PTY real-provider test
- Add SSRF protection and localhost RPC exemption support
- Add host-side PTY allocation for HostBinaryDriver
- Add structured debug-log channel for Pi and dev-shell investigations
- Fix WasmVM shell cwd propagation, path-based command dispatch, and PTY resize handling
- Audit and fix hardcoded monorepo dependency paths in publishable packages
1 parent 33e549e commit 9eab922

76 files changed

Lines changed: 15610 additions & 191 deletions


CLAUDE.md

Lines changed: 8 additions & 0 deletions
@@ -12,6 +12,9 @@
1212

1313
- every publishable package must include a `README.md` with the standard format: title, tagline, and links to website, docs, and GitHub
1414
- if `package.json` has a `"files"` array, `"README.md"` must be listed in it
15+
- **no hardcoded monorepo/pnpm paths** — NEVER resolve dependencies at runtime using hardcoded relative paths into `node_modules/.pnpm/` or monorepo-relative `../../../node_modules/` walks; use `createRequire(import.meta.url).resolve("pkg/path")` or standard Node module resolution instead
16+
- **no phantom transitive dependencies** — if published runtime code calls `require.resolve("foo")` or `import("foo")`, `foo` MUST be declared in that package's `dependencies` (not just available transitively in the monorepo)
17+
- **`files` array must cover all runtime references** — if compiled `dist/` code resolves paths outside `dist/` at runtime (e.g., `../src/polyfills/`), those directories MUST be listed in the `"files"` array; verify with `pnpm pack --json` or `npm pack --dry-run` before publishing
1518

1619
## Testing Policy
1720

@@ -24,6 +27,11 @@
2427
- real-provider NodeRuntime CLI/tool tests that need a mutable temp worktree must pair `moduleAccess` with a real host-backed base filesystem such as `new NodeFileSystem()`; `moduleAccess` alone makes projected packages readable but leaves sandbox tools unable to touch `/tmp` working files
2528
- e2e-docker fixtures connect to real Docker containers (Postgres, MySQL, Redis, SSH/SFTP) — skip gracefully via `skipUnlessDocker()` when Docker is unavailable
2629
- interactive/PTY tests must use `kernel.openShell()` with `@xterm/headless`, not host PTY via `script -qefc`
30+
- before fixing a reported runtime, CLI, SDK, or PTY bug, first reproduce the broken state and capture the exact visible output (stdout, stderr, event payloads, or terminal screen) in a regression or work note; do not start by guessing at the fix
31+
- terminal-output and PTY-rendering bugs must use snapshot-style assertions against exact strings or exact screen contents under fixed rows/cols, not loose substring checks
32+
- if expected terminal behavior is unclear, run the same flow on the host as a control and compare the sandbox transcript/screen against that host output before deciding what to fix
33+
- be liberal with structured debug logging for complex interactive or long-running sessions so later manual repros can be diagnosed from artifacts instead of memory
34+
- debug logging for complex sessions should go to a separate sink that does not contaminate stdout/stderr protocol output; prefer structured `pino` logs with enough context to reconstruct process lifecycle, PTY events, command routing, and failures, while redacting secrets
2735
- kernel blocking-I/O regressions should be proven through `packages/core/test/kernel/kernel-integration.test.ts` using real process-owned FDs via `KernelInterface` (`fdWrite`, `flock`, `fdPollWait`) rather than only manager-level unit tests
2836
- inode-lifetime/deferred-unlink kernel integration tests must use `InMemoryFileSystem` (or another inode-aware VFS) and await the kernel's POSIX-dir bootstrap; the default `createTestKernel()` `TestFileSystem` does not exercise inode-backed FD lifetime semantics
2937
- kernel signal-handler regressions should use a real spawned PID plus `KernelInterface.processTable` / `KernelInterface.socketTable`; unit `ProcessTable` coverage alone does not prove pending delivery or `SA_RESTART` behavior through the live kernel
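
The packaging rules added to CLAUDE.md above (no hardcoded `node_modules/.pnpm/` paths, no phantom transitive dependencies) can be sketched as follows. This is a minimal illustration, not code from this commit: `resolveFrom`, `findPhantomDeps`, and `packageNameOf` are hypothetical helper names, and only `createRequire` from `node:module` is a real Node API.

```ts
import { createRequire } from "node:module";

/**
 * Resolve `specifier` the way Node would from the module located at `from`
 * (a file URL or absolute path), instead of walking ../../node_modules by hand.
 * Hypothetical helper; in an ES module you would pass `import.meta.url`.
 */
function resolveFrom(from: string | URL, specifier: string): string {
  return createRequire(from).resolve(specifier);
}

/** Reduce "foo/bar" to "foo" and "@scope/pkg/sub" to "@scope/pkg". */
function packageNameOf(specifier: string): string {
  const parts = specifier.split("/");
  return specifier.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0] ?? specifier;
}

/**
 * Hypothetical audit: return packages that runtime code resolves but that
 * are missing from the package's own `dependencies` map.
 */
function findPhantomDeps(
  usedSpecifiers: readonly string[],
  declared: Readonly<Record<string, string>>,
): string[] {
  return usedSpecifiers
    .map(packageNameOf)
    .filter((name, index, all) => all.indexOf(name) === index) // de-duplicate
    .filter((name) => !(name in declared));
}
```

In published code the call would look like `resolveFrom(import.meta.url, "pkg/path")`, which keeps working whether the package is installed from the monorepo or from a registry tarball.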

docs/api-reference.mdx

Lines changed: 21 additions & 0 deletions
@@ -171,6 +171,7 @@ createNodeDriver(options?: NodeDriverOptions): SystemDriver
171171
| `commandExecutor` | `CommandExecutor` | Child process executor. |
172172
| `permissions` | `Permissions` | Access control rules. Deny-by-default. |
173173
| `useDefaultNetwork` | `boolean` | Enable default Node.js network adapter. |
174+
| `loopbackExemptPorts` | `number[]` | Loopback ports that bypass SSRF checks (with `useDefaultNetwork`). |
174175
| `processConfig` | `ProcessConfig` | Process metadata (cwd, env, argv, etc.). |
175176
| `osConfig` | `OSConfig` | OS metadata (platform, arch, homedir, etc.). |
176177

@@ -268,6 +269,26 @@ Each field accepts a `PermissionCheck`, which is either a boolean or a function
268269

269270
---
270271

272+
## Execution Methods
273+
274+
### `runtime.exec()`
275+
276+
Process-style execution. Accepts per-call environment, working directory, stdin, and stdio hooks. Use for automation loops, output observation, and CLI-style integrations.
277+
278+
```ts
279+
exec(code: string, options?: ExecOptions): Promise<ExecResult>
280+
```
281+
282+
### `runtime.run()`
283+
284+
Export-based evaluation. Returns the sandbox module's exports. Use when the sandbox should compute and return a value.
285+
286+
```ts
287+
run<T = unknown>(code: string, filePath?: string): Promise<RunResult<T>>
288+
```
289+
290+
---
291+
271292
## Execution Types
272293
273294
### `ExecOptions` (NodeRuntime)

docs/features/networking.mdx

Lines changed: 29 additions & 0 deletions
@@ -163,6 +163,35 @@ const driver = createNodeDriver({
163163
| `dnsLookup(hostname)` | `Promise<DnsResult>` | DNS resolution |
164164
| `httpRequest(url, options?)` | `Promise<HttpResponse>` | Low-level HTTP request |
165165

166+
## Loopback RPC exemptions
167+
168+
The default network adapter blocks all loopback/private-IP requests as SSRF protection. To allow sandbox code to call a host-side RPC server on specific loopback ports, use `loopbackExemptPorts`:
169+
170+
```ts
171+
import { createNodeDriver, allowAllNetwork } from "secure-exec";
172+
173+
const driver = createNodeDriver({
174+
useDefaultNetwork: true,
175+
loopbackExemptPorts: [rpcPort],
176+
permissions: { ...allowAllNetwork },
177+
});
178+
```
179+
180+
Only the listed ports are exempt — all other loopback and private-IP requests remain blocked.
181+
182+
If you need more control (e.g. dynamic port discovery), construct the adapter directly:
183+
184+
```ts
185+
import { createNodeDriver, createDefaultNetworkAdapter, allowAllNetwork } from "secure-exec";
186+
187+
const driver = createNodeDriver({
188+
networkAdapter: createDefaultNetworkAdapter({
189+
initialExemptPorts: [rpcPort],
190+
}),
191+
permissions: { ...allowAllNetwork },
192+
});
193+
```
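
The rule described in this section (loopback is blocked unless the port is listed, private IPs stay blocked regardless) can be modeled with a small sketch. This is an assumption-laden illustration and not the adapter's actual implementation: `isRequestAllowed` and its helpers are hypothetical names, and the check deliberately ignores DNS resolution and IPv6 private ranges.

```ts
/** Parse a dotted IPv4 literal into its four octets, or return null. */
function parseIPv4(host: string): number[] | null {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return null;
  const octets = match.slice(1).map(Number);
  return octets.every((o) => o <= 255) ? octets : null;
}

function isLoopbackHost(host: string): boolean {
  if (host === "localhost" || host === "::1") return true;
  const ip = parseIPv4(host);
  return ip !== null && ip[0] === 127;
}

function isPrivateIPv4(host: string): boolean {
  const ip = parseIPv4(host);
  if (!ip) return false;
  const [a, b] = ip as [number, number, number, number];
  return (
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

/** Hypothetical policy check: only exempt ports open a hole, and only on loopback. */
function isRequestAllowed(url: string, exemptPorts: ReadonlySet<number>): boolean {
  const parsed = new URL(url);
  const host = parsed.hostname.replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  const port = parsed.port ? Number(parsed.port) : parsed.protocol === "https:" ? 443 : 80;
  if (isLoopbackHost(host)) return exemptPorts.has(port);
  if (isPrivateIPv4(host)) return false;
  return true;
}
```

Keeping the exemption scoped to loopback means a listed port never unblocks a private-network address that happens to use the same port number.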
194+
166195
## Permission gating
167196

168197
Use a function to filter requests:

docs/features/output-capture.mdx

Lines changed: 2 additions & 0 deletions
@@ -10,6 +10,8 @@ icon: "message-lines"
1010

1111
Console output from sandboxed code is **not buffered** into result fields. `exec()` and `run()` do not return `stdout` or `stderr`. Use the `onStdio` hook to capture output.
1212

13+
The per-execution `onStdio` option is available on `exec()` only. To capture output from `run()` calls, set a runtime-level hook when creating the `NodeRuntime` (see [Default hook](#default-hook) below).
14+
1315
## Runnable example
1416

1517
```ts

docs/quickstart.mdx

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ icon: "rocket"
4343
</Step>
4444

4545
<Step title="Run code">
46-
Use `runtime.run()` to execute JavaScript and get back exported values. Use `runtime.exec()` for scripts that produce console output.
46+
Use `runtime.run()` to execute JavaScript and get back exported values. Use `runtime.exec()` for process-style execution with stdout/stderr observation, per-call environment overrides, and automation loops.
4747

4848
<CodeGroup>
4949
```ts Simple

docs/runtimes/node.mdx

Lines changed: 73 additions & 4 deletions
@@ -67,32 +67,101 @@ By default, all runtimes share a single V8 child process. You can pass a dedicat
6767

6868
## exec vs run
6969

70-
Use `exec()` when you care about side effects (logging, file writes) but don't need a return value.
70+
`NodeRuntime` exposes two execution methods with different signatures and intended use cases:
7171

7272
```ts
73-
const result = await runtime.exec("const label = 'done'; console.log(label)");
74-
console.log(result.code); // 0
73+
// Process-style execution — observe stdout/stderr, set env/cwd/stdin
74+
exec(code: string, options?: ExecOptions): Promise<ExecResult>
75+
76+
// Export-based evaluation — get computed values back
77+
run<T>(code: string, filePath?: string): Promise<RunResult<T>>
78+
```
79+
80+
| | `exec()` | `run()` |
81+
|---|---|---|
82+
| **Returns** | `{ code, errorMessage? }` | `{ code, errorMessage?, exports? }` |
83+
| **Per-call `onStdio`** | Yes | No (use runtime-level hook) |
84+
| **Per-call `env` / `cwd` / `stdin`** | Yes | No |
85+
| **Best for** | Side effects, CLI-style output, automation loops | Getting computed values back into the host |
86+
87+
### When to use `exec()`
88+
89+
Use `exec()` when sandboxed code produces **output you need to observe** or when you need per-call control over the execution environment. This is the right choice for AI SDK tool loops, code interpreters, and any integration where the result is communicated through `console.log` rather than `export`.
90+
91+
```ts
92+
// AI SDK tool loop — capture stdout from each step
93+
for (const step of toolSteps) {
94+
const result = await runtime.exec(step.code, {
95+
onStdio: (e) => appendToToolResult(e.message),
96+
env: { API_KEY: step.apiKey },
97+
cwd: "/workspace",
98+
});
99+
if (result.code !== 0) handleError(result);
100+
}
75101
```
76102

77-
Use `run()` when you need a value back. The sandboxed code should use `export default`.
103+
### When to use `run()`
104+
105+
Use `run()` when sandboxed code **exports a value** you need in the host. The sandbox code uses `export default` or named exports, and the host reads them from `result.exports`.
78106

79107
```ts
108+
// Evaluate a user-provided expression and get the result
80109
const result = await runtime.run<{ default: number }>("export default 40 + 2");
81110
console.log(result.exports?.default); // 42
82111
```
83112

113+
<Tip>
114+
If you find yourself parsing `console.log` output to extract a value, switch to `run()` with an `export`. If you need to watch a stream of output lines, switch to `exec()` with `onStdio`.
115+
</Tip>
116+
84117
## Capturing output
85118

86119
Console output is not buffered into the result. Use the `onStdio` hook to capture it.
87120

121+
The per-execution `onStdio` option is available on `exec()` only. To capture output from `run()`, set a runtime-level hook:
122+
88123
```ts
124+
// Per-execution hook (exec only)
89125
const logs: string[] = [];
90126
await runtime.exec("console.log('hello'); console.error('oops')", {
91127
onStdio: (event) => logs.push(`[${event.channel}] ${event.message}`),
92128
});
93129
// logs: ["[stdout] hello", "[stderr] oops"]
130+
131+
// Runtime-level hook (applies to both exec and run)
132+
const runtime = new NodeRuntime({
133+
systemDriver: createNodeDriver(),
134+
runtimeDriverFactory: createNodeRuntimeDriverFactory(),
135+
onStdio: (event) => console.log(event.message),
136+
});
137+
```
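
As a reusable variant of the per-execution hook above, the collector below packages the `logs` array and the callback together. It is a sketch under one assumption: the `StdioEvent` shape is inferred from the snippet's use of `event.channel` and `event.message`, and the real event type may carry more fields. `createStdioCollector` is a hypothetical helper name.

```ts
// Assumed event shape, inferred from the docs snippet above.
type StdioEvent = { channel: "stdout" | "stderr"; message: string };

/** Build an `onStdio`-compatible callback plus the array it appends to. */
function createStdioCollector(): {
  lines: string[];
  onStdio: (event: StdioEvent) => void;
} {
  const lines: string[] = [];
  const onStdio = (event: StdioEvent): void => {
    // Tag each line with its channel so stdout and stderr stay distinguishable.
    lines.push(`[${event.channel}] ${event.message}`);
  };
  return { lines, onStdio };
}
```

A caller would pass `collector.onStdio` as the `onStdio` option to `exec()` and read `collector.lines` afterwards.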
138+
139+
## Lifecycle
140+
141+
A single `NodeRuntime` instance is designed to be reused across many `.exec()` and `.run()` calls. Each call creates a fresh V8 isolate session internally, so per-execution state (module cache, budgets) is automatically reset while the underlying V8 process is reused efficiently.
142+
143+
```ts
144+
// Recommended: create once, call many times, dispose at the end
145+
const runtime = new NodeRuntime({
146+
systemDriver: createNodeDriver(),
147+
runtimeDriverFactory: createNodeRuntimeDriverFactory(),
148+
});
149+
150+
// AI SDK tool loop — each step reuses the same runtime
151+
for (const step of toolSteps) {
152+
const result = await runtime.exec(step.code, {
153+
onStdio: (e) => log(e.message),
154+
});
155+
}
156+
157+
// Clean up when the session is over
158+
runtime.dispose();
94159
```
95160

161+
Do **not** dispose and recreate the runtime between sequential calls. Calling `.exec()` or `.run()` on a disposed runtime throws `"NodeExecutionDriver has been disposed"`.
162+
163+
`dispose()` is synchronous and immediate — it kills active child processes and clears timers. Use `terminate()` (async) when you need to wait for graceful HTTP server shutdown before cleanup.
164+
96165
## TypeScript workflows
97166

98167
`NodeRuntime` executes JavaScript only. For sandboxed TypeScript type checking or compilation, use the separate `@secure-exec/typescript` package. See [TypeScript support](#typescript-support).

docs/sdk-overview.mdx

Lines changed: 4 additions & 2 deletions
@@ -71,17 +71,19 @@ All host capabilities are deny-by-default. You opt in to what sandboxed code can
7171
Two methods for running sandboxed code:
7272

7373
```ts
74-
// exec() runs code for side effects, returns an exit code
74+
// exec() — process-style execution with stdout/stderr observation
7575
const execResult = await runtime.exec("console.log('hello')");
7676
console.log(execResult.code); // 0
7777

78-
// run() runs code and returns the default export
78+
// run() — export-based evaluation, returns computed values
7979
const runResult = await runtime.run<{ default: number }>(
8080
"export default 2 + 2"
8181
);
8282
console.log(runResult.exports?.default); // 4
8383
```
8484

85+
Use `exec()` for automation loops, CLI-style output capture, and per-call environment overrides. Use `run()` when the sandbox should return a value via `export`. See [exec vs run](/runtimes/node#exec-vs-run) for the full comparison.
86+
8587
## Capture output
8688

8789
Console output is not buffered by default. Use the `onStdio` hook to capture it:

docs/system-drivers/node.mdx

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ const driver = createNodeDriver({
5252
| `commandExecutor` | `CommandExecutor` | Custom command executor for child processes (see [Child processes](#child-processes)). |
5353
| `permissions` | `Permissions` | Permission callbacks for fs, network, child process, and env access. |
5454
| `useDefaultNetwork` | `boolean` | Use the built-in network adapter (fetch, DNS, HTTP client). |
55+
| `loopbackExemptPorts` | `number[]` | Loopback ports that bypass SSRF checks when using the default network adapter. |
5556
| `processConfig` | `ProcessConfig` | Values for `process.cwd()`, `process.env`, etc. inside the sandbox. |
5657
| `osConfig` | `OSConfig` | Values for `os.platform()`, `os.arch()`, etc. inside the sandbox. |
5758

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
/**
2+
* Initialize process cwd from PWD environment variable.
3+
*
4+
* WASI processes start with __wasilibc_cwd = "/" (from preopened directory
5+
* scanning). The kernel sets PWD in each spawned process's environment to
6+
* match the intended cwd. This constructor reads PWD and calls chdir()
7+
* to synchronize wasi-libc's internal cwd state with the kernel's.
8+
*
9+
* Installed into the patched sysroot so ALL WASM programs get correct
10+
* initial cwd, not just test binaries.
11+
*/
12+
13+
#include <stdlib.h>
14+
#include <unistd.h>
15+
16+
__attribute__((constructor, used))
17+
static void __init_cwd_from_pwd(void) {
18+
const char *pwd = getenv("PWD");
19+
if (pwd && pwd[0] == '/') {
20+
chdir(pwd);
21+
}
22+
}
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
Fix posix_spawn to propagate cwd to child processes.
2+
3+
posix_spawn previously passed an empty cwd (len=0) to proc_spawn,
4+
causing children to fall back to the kernel-worker's init.cwd instead
5+
of the parent's current working directory. This fix:
6+
7+
1. Processes FDOP_CHDIR file_actions to capture explicit cwd overrides
8+
2. Falls back to getcwd() when no explicit cwd is set
9+
3. Passes the resolved cwd to proc_spawn
10+
11+
This complements the kernel-side fix (setting PWD in env) and the
12+
init_cwd.c constructor (reading PWD at WASM startup) to ensure
13+
full cwd propagation from parent shell to spawned commands.
14+
15+
--- a/libc-bottom-half/sources/host_spawn_wait.c
16+
+++ b/libc-bottom-half/sources/host_spawn_wait.c
17+
@@ -252,6 +252,7 @@
18+
}
19+
20+
// Process file_actions in order: extract stdio overrides and handle close/open
21+
+ const char *spawn_cwd = NULL;
22+
uint32_t stdin_fd = 0, stdout_fd = 1, stderr_fd = 2;
23+
if (fa && fa->__actions) {
24+
for (struct __fdop *op = fa->__actions; op; op = op->next) {
25+
@@ -279,15 +280,24 @@
26+
else close(opened);
27+
break;
28+
}
29+
+ case FDOP_CHDIR:
30+
+ spawn_cwd = op->path;
31+
+ break;
32+
}
33+
}
34+
}
35+
36+
+ // Resolve cwd: explicit chdir action > current getcwd > empty (kernel fallback)
37+
+ char cwd_buf[1024];
38+
+ const char *cwd_str = spawn_cwd;
39+
+ if (!cwd_str && getcwd(cwd_buf, sizeof(cwd_buf))) {
40+
+ cwd_str = cwd_buf;
41+
+ }
42+
+
43+
uint32_t child_pid;
44+
uint32_t err = __host_proc_spawn(
45+
argv_buf, (uint32_t)argv_buf_len,
46+
envp_buf ? envp_buf : (const uint8_t *)"", (uint32_t)envp_buf_len,
47+
stdin_fd, stdout_fd, stderr_fd,
48+
- (const uint8_t *)"", 0,
49+
+ cwd_str ? (const uint8_t *)cwd_str : (const uint8_t *)"", cwd_str ? (uint32_t)strlen(cwd_str) : 0,
50+
&child_pid);
51+
52+
free(argv_buf);
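
The cwd precedence the patch above implements (explicit chdir file action, then the parent's current directory, then an empty string that lets the kernel fall back) can be modeled in a few lines. This is an illustrative TypeScript model only, not part of the commit: the `FileAction` type and `resolveSpawnCwd` are hypothetical names, and the `getcwd` parameter stands in for the C `getcwd()` call, returning `null` on failure.

```ts
// Hypothetical model of the posix_spawn file_actions relevant to this sketch.
type FileAction =
  | { kind: "chdir"; path: string }
  | { kind: "close"; fd: number }
  | { kind: "dup2"; srcFd: number; dstFd: number };

/**
 * Mirror the patch's resolution order:
 *   1. the last chdir action seen in the list wins,
 *   2. otherwise the parent's current directory,
 *   3. otherwise "" so the kernel applies its own fallback.
 */
function resolveSpawnCwd(
  actions: readonly FileAction[],
  getcwd: () => string | null,
): string {
  let explicit: string | null = null;
  for (const action of actions) {
    if (action.kind === "chdir") explicit = action.path;
  }
  // `??` short-circuits, so getcwd is only consulted when no chdir was given.
  return explicit ?? getcwd() ?? "";
}
```

For example, `resolveSpawnCwd([{ kind: "close", fd: 3 }], () => "/workspace")` yields the parent's directory because no chdir action is present.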
