feat: base machinery for AST-precise extraction via WASM plugins #44
Open
the-wondersmith wants to merge 9 commits into
Untrack generated .windsurfrules fixtures (now gitignored) and clean each fixture dir before writing, so the idempotent AI-config generators don't skip and fail on a second run.
Opt-in host for user-supplied WASM AST plugins. Per-kind exports (parseRoutes/parseSchemas/parseImports, capability by presence) + a contractVersion() export; no manifest, no kind codes. Inert unless enabled.
Route + schema dispatch tries the native plugin first, falls back to the existing extractor. Adds 'native' confidence and strict-mode diagnostics. Imports are intentionally not dispatched (see graph.ts TODO).
--native-ast[=langs], --native-ast-strict, --plugin-dir; CODESIGHT_NATIVE_AST and CODESIGHT_PLUGIN_DIR; nativeAst config field (incl. no-TS-loader parsing). Strict mode reports unrun plugins and exits non-zero.
Minimal AssemblyScript marker plugin (committed prebuilt + checksums) exercising the ABI end-to-end against the real host. Excluded from the npm package by the files allowlist; assemblyscript added as a devDependency.
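The host-to-plugin call described in this PR (per-kind exports detected by presence, UTF-8 in and JSON out over linear memory, a packed i64 return) could look roughly like the sketch below. Only the export names come from the PR text. The function signatures, the `PluginExports` type, the bit layout of the packed return value, and the rule that the host frees the output buffer are assumptions made for illustration; `docs/wasm-plugins.md` is the authoritative contract.

```typescript
// Hypothetical sketch of one host -> plugin call. Export names are taken from
// the PR description; every signature below is an assumption.
type PluginExports = {
  memory: WebAssembly.Memory;
  alloc: (len: number) => number;
  dealloc: (ptr: number, len: number) => void;
  contractVersion: () => number;
  parseRoutes?: (ptr: number, len: number) => bigint;
  parseSchemas?: (ptr: number, len: number) => bigint;
  parseImports?: (ptr: number, len: number) => bigint;
};

type Kind = "parseRoutes" | "parseSchemas" | "parseImports";
const KINDS: Kind[] = ["parseRoutes", "parseSchemas", "parseImports"];

// Capability by presence: a kind is supported iff the export exists.
function capabilities(plugin: PluginExports): Kind[] {
  return KINDS.filter((k) => typeof plugin[k] === "function");
}

// ASSUMPTION: high 32 bits carry the pointer, low 32 bits the byte length.
function unpack(packed: bigint): { ptr: number; len: number } {
  const u = BigInt.asUintN(64, packed); // i64 results arrive as signed BigInt
  return { ptr: Number(u >> BigInt(32)), len: Number(u & BigInt(0xffffffff)) };
}

function callPlugin(plugin: PluginExports, kind: Kind, source: string): unknown {
  const fn = plugin[kind];
  if (!fn) return undefined; // capability absent: the caller falls back

  // Copy the UTF-8 source into plugin-owned linear memory.
  const input = new TextEncoder().encode(source);
  const inPtr = plugin.alloc(input.length);
  new Uint8Array(plugin.memory.buffer, inPtr, input.length).set(input);

  const { ptr, len } = unpack(fn(inPtr, input.length));

  // Re-read memory.buffer after the call: memory growth replaces the buffer.
  const out = new Uint8Array(plugin.memory.buffer, ptr, len).slice();

  // ASSUMPTION: the host releases both the input and the output buffers.
  plugin.dealloc(inPtr, input.length);
  plugin.dealloc(ptr, len);
  return JSON.parse(new TextDecoder().decode(out));
}
```

Returning `undefined` for a missing export keeps the "capability by presence" rule in one place, so a caller can treat "no plugin" and "plugin lacks this kind" the same way.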
Summary
This PR adds machinery (gated behind explicit opt-in via CLI flag) for expanding codesight's repertoire of languages with AST-precise support via out-of-band (i.e. user-supplied) WASM plugins. It is intentionally scope-limited to the host side of the plugin system: the plugin ABI, discovery, the opt-in CLI/env/config surface, the dispatch wiring, and a reference plugin used for conformance testing. The new machinery is off by default: without explicit user opt-in, behavior is byte-identical to today, and the package keeps its zero-runtime-dependencies status.

Motivation
codesight's edge is cheap, AST-precise context, but that precision is TypeScript-only today because it is powered by the project's own TS compiler. The other 14 supported languages fall back to regex detection, which is far more likely to miss framework patterns or mislabel routes and models. As a concrete example, during personal testing, a Rust project implementing an axum service with a .route("/x", get(h)) call was detected as actix and yielded 0 routes using the built-in regex.

Captured Wins
- … assemblyscript is dev-only; npm pack excludes the fixture).
- … syn-based plugin recovered actix attribute routes, axum route chains, and struct fields that the built-in regex misidentified or missed entirely.
- … codesight users assert native parsing ran where expected, rather than hoping.
- … codesight's big promises: zero runtime dependencies and no required toolchain/setup
- … codesight itself
- … codesight codebase while codesight itself stays lean, safe, and zero-dep

What This PR Includes
- src/wasm/plugin-host.ts: per-kind exports parseRoutes/parseSchemas/parseImports (capability detected by export presence) + a contractVersion() export. No manifest, no kind codes. UTF-8 in / JSON out over linear memory; alloc/dealloc/memory; packed i64 return.
- src/ast/native-loader.ts: discovery waterfall (--plugin-dir → ~/.codesight/plugins → $XDG_DATA_HOME/... → install dir), version gating, domain-type adapters that stamp confidence: "native", and strict-mode diagnostics.
- src/detectors/*, src/core.ts: route/schema sites try the native plugin first, then fall back; native results are counted separately in the scan summary.
- --native-ast[=langs], --native-ast-strict, --plugin-dir; CODESIGHT_NATIVE_AST / CODESIGHT_PLUGIN_DIR; a nativeAst config field (precedence CLI > env > config file, including the no-TS-loader config path).
- reference/ast-plugin/: a minimal AssemblyScript, marker-based fixture (committed prebuilt .wasm + checksums) that exercises the ABI end-to-end. Excluded from the npm package by the files allowlist.
- docs/wasm-plugins.md.
- wasm-plugin-abi workflow that rebuilds the reference plugin from source and runs conformance against the fresh build, with a checksum guard against a stale committed binary.

The 9 commits are ordered for ease of review: ignore noise → fix a flaky test → host/loader → dispatch → CLI → tests → docs → reference fixture → CI. Each feat commit builds on its own, and all commits can be atomically rolled back without breaking builds.

Scope & limitations (intentional, not bugs)
- … rust/go/python. Plugins are only consulted at codesight's existing detector dispatch points, so "any user-specified language" is not yet literally true. Generalizing this (declared languageId/extensions + a language-driven pass) is a planned follow-up.
- parseImports is defined in the contract but not dispatched during a scan. Dependency-graph edges must resolve to project-relative file paths, which a per-file plugin can't do without whole-project context; the export is reserved so enabling it later is purely additive. Built-in extraction handles imports today.
- detectComponents takes the config param for symmetry but has no native component extraction (reserved).
- … .wasm is a best-effort convenience copy; CI rebuilds it from source and the checksum guard catches drift/staleness.

Testing
- tests/native-ast.test.ts (10): config/env/file resolution + precedence, and dispatch/strict-mode behavior via a mocked plugin provider.
- tests/reference-plugin.test.ts (10): real-wasm conformance through the actual host (raw ABI per kind + the domain adapter) and contractVersion gating.
- tests/monorepo.test.ts is made isolation-safe (unrelated pre-existing flakiness fixed along the way).
- … tests runs the full suite green; wasm-plugin-abi builds the reference plugin, verifies checksums, and runs the 20-assertion conformance/gating set 10/10. The checksum step confirms asc rebuilds the wasm byte-identically across macOS→linux.
- … scan() integration test driving a real plugin end-to-end (host-level conformance + mocked dispatch cover the seams; the CLI path was verified manually).

Planned Follow-ups (post-merge)
- … languageId/extensions + a generic language-driven pass): closes the schema gap and re-enables parseImports dispatch with project context.
- … go/parser
- … syn
- … ruff_python_ast

Note
Where plugin implementations should live will need to be discussed and decided with the codesight maintainers.

Reviewer notes
- … docs/wasm-plugins.md for the contract, then src/wasm/plugin-host.ts and src/ast/native-loader.ts.
- reference/ast-plugin/ is a test fixture, not a shipped plugin or a real parser; see its README.md (incl. a "do not copy this as a template" note).
- … npm pack --dry-run excludes all of reference/, and a scan without --native-ast behaves exactly as before.
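
For reviewers who want a mental model before reading the dispatch commit, the native-first, fall-back-otherwise behavior can be sketched as below. This is a minimal illustration, not the code in src/detectors/*: the `Route`, `NativeAstOptions`, and `NativeParser` types, the `extractRoutes` name, the `"regex"` confidence label, and the diagnostic wording are all hypothetical. Only the behavior (inert unless enabled, plugin first, built-in extractor as fallback, `"native"` confidence, a strict-mode diagnostic when no plugin ran) comes from the PR description.

```typescript
// Hypothetical types; only the behavior mirrors the PR description.
type Confidence = "native" | "regex"; // "regex" is an assumed label
interface Route { method: string; path: string; confidence: Confidence }
interface NativeAstOptions { enabled: boolean; languages?: string[]; strict: boolean }
type NativeParser = (source: string) => Omit<Route, "confidence">[] | undefined;
interface DispatchResult { routes: Route[]; diagnostics: string[] }

function extractRoutes(
  lang: string,
  source: string,
  opts: NativeAstOptions,
  native: NativeParser | undefined,
  builtin: (source: string) => Route[],
): DispatchResult {
  const diagnostics: string[] = [];

  // Inert unless enabled, optionally restricted to a language list.
  const wanted =
    opts.enabled && (!opts.languages || opts.languages.includes(lang));

  if (wanted) {
    const parsed = native?.(source);
    if (parsed) {
      // Adapter step: stamp plugin results so they can be counted separately.
      const routes = parsed.map((r) => ({ ...r, confidence: "native" as const }));
      return { routes, diagnostics };
    }
    // Strict mode records that native parsing was requested but did not run.
    if (opts.strict) diagnostics.push(`native-ast: no plugin ran for "${lang}"`);
  }

  // Default path: the existing extractor, untouched.
  return { routes: builtin(source), diagnostics };
}
```

A caller in strict mode would exit non-zero when `diagnostics` is non-empty; with `enabled: false` the function reduces to a plain call to the built-in extractor.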