codemod improvements#2156
Conversation
|
@modelcontextprotocol/client
@modelcontextprotocol/codemod
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
|
@claude review |
|
|
||
| // `sourceText` and `lines` are computed once from the pre-insertion text. | ||
| // Insertions below mutate sf, but we process in descending line order, so | ||
| // each insertText only shifts positions above the next insertion point — | ||
| // prior byte offsets stay valid. | ||
| const sourceText = sf.getFullText().replaceAll('\r\n', '\n'); | ||
| const lines = sourceText.split('\n'); | ||
|
|
||
| for (const diag of merged) { | ||
| const lineIndex = diag.line - 1; | ||
| if (lineIndex < 0 || lineIndex >= lines.length) continue; | ||
|
|
||
| if (lineIndex > 0 && lines[lineIndex - 1]!.includes(CODEMOD_ERROR_PREFIX)) continue; | ||
|
|
||
| const indent = lines[lineIndex]!.match(/^(\s*)/)?.[1] ?? ''; | ||
| const safeMessage = diag.message.replaceAll('*/', '* /'); | ||
| const comment = `${indent}/* ${CODEMOD_ERROR_PREFIX} ${safeMessage} */`; | ||
|
|
||
| const lineStart = lines.slice(0, lineIndex).reduce((sum, l) => sum + l.length + 1, 0); | ||
| sf.insertText(lineStart, comment + '\n'); |
There was a problem hiding this comment.
🔴 insertDiagnosticComments() computes insertion offsets from a CRLF-normalized copy of the source (replaceAll('\r\n', '\n') then summing line.length + 1), but sf.insertText() operates on the un-normalized text — so on CRLF files the offset is short by one character per preceding line and the /* @mcp-codemod-error */ comment lands in the middle of a prior line, syntactically corrupting the user's file. Relatedly, there is no check that the target line start is a safe boundary, so when an actionRequired node starts on an interior line of a multi-line template literal (e.g. a Schema.parse() reference inside an interpolation), the comment text is injected into the runtime string value. Fix by computing offsets from the un-normalized text (e.g. sf.compilerNode.getPositionOfLineAndCharacter(lineIndex, 0)) and skipping insertion when the position falls inside a template literal/JSX text.
Extended reasoning...
Issue 1 — CRLF offset drift corrupts files
insertDiagnosticComments() builds its line table from a normalized copy:
const sourceText = sf.getFullText().replaceAll('\r\n', '\n');
const lines = sourceText.split('\n');
...
const lineStart = lines.slice(0, lineIndex).reduce((sum, l) => sum + l.length + 1, 0);
sf.insertText(lineStart, comment + '\n');The offset is computed as if every line ends with a single \n (length + 1), but sf.insertText() is a purely positional insertion into the un-normalized buffer, where each CRLF line contributes length + 2. ts-morph's getFullText() preserves \r\n (verified empirically with files loaded via addSourceFilesAtPaths, the same path run() uses), so for a diagnostic on line N of a CRLF file the computed offset is short by exactly N−1 characters.
Step-by-step proof (CRLF): take a 4-line CRLF file where line 4 is const c = 3;. The normalized line table yields lineStart = 39, but the real start of line 4 in the CRLF buffer is offset 42. sf.insertText(39, ...) therefore lands 3 characters before the line start — inside line 3 — producing output like const c = 3/* @mcp-codemod-error test */ followed by an orphaned ; on the next line. In a realistic file with hundreds of lines the drift grows to hundreds of characters, so the comment can split identifiers, statements, or string literals. Since insertDiagnosticComments runs immediately before project.saveSync(), the corrupted text is written back to the user's source files. actionRequired diagnostics fire on files that receive no other rewrites (unknown import paths, unmigratable .tool() calls, XSchema.parse() usage), so the file is still fully CRLF when the insertion happens — the trigger is exactly the common Windows / core.autocrlf repo this codemod targets. The new commentInsertion.test.ts only joins inputs with '\n', so this is never exercised.
Issue 2 — no safe-boundary check at the insertion point
Even on pure-LF files, the insertion is a raw text splice at the start of the diagnostic node's start line with no awareness of what surrounds that position. Several actionRequired diagnostics attach to arbitrary expression nodes — specSchemaAccess emits them on XSchema.parse(v) / un-captured safeParse(v) references and on result.error.<subProp> accesses, found by walking every identifier in the file. If such an expression sits on an interior line of a multi-line template literal (e.g. a multi-line `Invalid request: ${CallToolRequestSchema.parse(data)}` error template), the node's start line begins inside the literal text, and the inserted /* @mcp-codemod-error ... */ line becomes part of the runtime string value. Verified empirically: the template head ends up containing the comment text, and because several diagnostic messages themselves contain backticks, the insertion can even terminate the template and produce parse errors. Either way a feature meant to be purely annotative silently changes program behavior (or breaks the build).
Why nothing catches this: the */ sanitization and the duplicate-comment check don't address position correctness, the descending-line processing only protects against self-invalidation between insertions, and all tests use LF-only inputs with statement-level diagnostics.
Suggested fix: compute the insertion offset from the real source text rather than a normalized copy — ts-morph already provides this via sf.compilerNode.getPositionOfLineAndCharacter(lineIndex, 0), or split on /\r?\n/ while accounting for the actual delimiter length per line. Then add a guard before inserting: check the token/node kind at the insertion position (or use the diagnostic node's ancestors) and skip the insertion — or fall back to the start line of the enclosing statement — when the line start falls inside a template literal, string literal, or JSX text region.
| const methodString = ALL_SCHEMA_TO_METHOD[originalName]; | ||
| if (!methodString) { | ||
| diagnostics.push( | ||
| warning( | ||
| actionRequired( | ||
| sourceFile.getFilePath(), | ||
| call.getStartLineNumber(), | ||
| call, | ||
| `Custom method handler: ${methodName}(${schemaName}, ...). ` + | ||
| `In v2, use the 3-arg form: ${methodName}('method/name', { params, result? }, handler). ` + | ||
| `See migration.md for details.` |
There was a problem hiding this comment.
🔴 handlerRegistrationTransform has no file-level hasMcpImports() gate (unlike contextTypes and mcpServerApi), and the 'Custom method handler' diagnostic is emitted before the isImportedFromMcp check — so any obj.setRequestHandler(SomeIdentifier, handler) call in a file with zero MCP imports (e.g. an unrelated library or user class with the same method name) now gets a /* @mcp-codemod-error */ comment physically written into it by the runner. Pre-PR this false positive was only a console warning; suggest gating the transform on hasMcpImports(sourceFile) like the sibling transforms before emitting the actionRequired diagnostic.
Extended reasoning...
What the bug is. handlerRegistrationTransform scans every CallExpression in every file under the target directory looking for setRequestHandler / setNotificationHandler calls. Unlike its sibling transforms — contextTypes gates on hasMcpImports(sourceFile) and mcpServerApi gates on isOriginalNameImportedFromMcp(sourceFile, 'McpServer') — it has no file-level MCP gate. When the first argument is an identifier that isn't in ALL_SCHEMA_TO_METHOD, the "Custom method handler" diagnostic is emitted at lines 41–50, before the isImportedFromMcp(sourceFile, schemaName) check on line 54. resolveOriginalImportName() scans all import declarations (not just MCP ones), so it provides no implicit gate either.
What this PR changes. Before this PR, that diagnostic was a warning — a console-only false positive. This PR converts it to actionRequired, which sets insertComment: true. In runner.ts, a file lands in fileResults whenever fileDiagnostics.length > 0 (even with 0 changes), and insertDiagnosticComments() then calls sf.insertText(...) followed by project.saveSync(). So the false positive is escalated from console noise to a /* @mcp-codemod-error */ comment being physically written into a source file that has nothing to do with MCP.
Step-by-step proof. Suppose the target project contains queue.ts with no MCP imports at all:
import { RpcRouter, PingSchema } from 'some-other-rpc-lib';
const router = new RpcRouter();
router.setRequestHandler(PingSchema, async req => handlePing(req));- The runner adds
queue.tsto the project (the glob covers all TS files in the target dir) and applies all transforms. handlerRegistrationTransformfinds thesetRequestHandlercall;PingSchemais an identifier,resolveOriginalImportNamereturnsPingSchema, which is not inALL_SCHEMA_TO_METHOD.- The
actionRequired(...)diagnostic is pushed (handlerRegistration.ts:41–50) — theisImportedFromMcpguard on line 54 is never reached for this branch. - In
run(),fileDiagnostics.length > 0, soqueue.tsis added tofileResultsdespitefileChanges === 0. - Because the diagnostic has
insertComment: true,insertDiagnosticComments()inserts/* @mcp-codemod-error Custom method handler: setRequestHandler(PingSchema, ...) ... */above the call, andproject.saveSync()writes the modified file to disk.
The user's completely unrelated file is now defaced with an MCP migration error comment.
Why existing code doesn't prevent it. The only guard, isImportedFromMcp(sourceFile, schemaName), runs after the diagnostic and only on the auto-rewrite path. That ordering is partly intentional — legitimate custom schemas in real MCP files are usually defined locally rather than imported from the SDK, so simply moving the isImportedFromMcp check above the diagnostic would suppress the legitimate case. The missing piece is a file-level gate.
Impact. The codemod runs over the whole target directory, so any project mixing MCP code with another request-handler-style API (or a user class that happens to have a setRequestHandler/setNotificationHandler method taking an identifier first argument) gets unwanted comments written into non-MCP source files. The trigger is somewhat narrow, which keeps this at the low end of normal severity, but unwanted mutation of unrelated user files is exactly what a codemod must not do.
How to fix. Add the same file-level gate used by the sibling transforms at the top of apply():
if (!hasMcpImports(sourceFile)) {
return { changesCount: 0, diagnostics: [] };
}This keeps the actionRequired comment for genuine custom schemas in MCP files while leaving non-MCP files untouched.
| if (result.commentCount > 0) { | ||
| console.log( | ||
| `${result.commentCount} location(s) marked with ${CODEMOD_ERROR_PREFIX} comments — search your code to find them:\n` + | ||
| ` grep -r '${CODEMOD_ERROR_PREFIX}' ${resolvedDir}\n` | ||
| ); | ||
| } |
There was a problem hiding this comment.
🟡 The suggested copy-paste command interpolates resolvedDir unquoted: grep -r '@mcp-codemod-error' ${resolvedDir}. For project paths containing spaces (common on macOS/Windows), the pasted command breaks or greps the wrong locations. Wrap the path in double quotes: grep -r '${CODEMOD_ERROR_PREFIX}' "${resolvedDir}".
Extended reasoning...
The bug: The new commentCount block in packages/codemod/src/cli.ts (lines 143–148) prints a hint that is explicitly intended to be copy-pasted by the user:
console.log(
`${result.commentCount} location(s) marked with ${CODEMOD_ERROR_PREFIX} comments — search your code to find them:\n` +
` grep -r '${CODEMOD_ERROR_PREFIX}' ${resolvedDir}\n`
);resolvedDir is path.resolve(targetDir) of a user-supplied directory argument, and it is interpolated into the suggested shell command without any quoting.
Code path that triggers it: Run mcp-codemod v1-to-v2 <dir> (without --dry-run) on a project where at least one transform emits an actionRequired diagnostic — e.g. a custom setRequestHandler schema, an unknown SDK import path, or a destructured context parameter. insertDiagnosticComments in runner.ts then inserts @mcp-codemod-error comments and returns a non-zero commentCount, so the CLI prints this hint with the unquoted path.
Step-by-step proof: Suppose the user runs the codemod on /Users/x/My Projects/app/src. The CLI prints:
grep -r '@mcp-codemod-error' /Users/x/My Projects/app/src
When the user copies and runs that, the shell word-splits the path into two arguments: /Users/x/My and Projects/app/src. grep -r then either errors with "No such file or directory" for both fragments, or — worse — silently greps an unrelated directory if one of the fragments happens to exist relative to the current working directory. Either way the user does not find the marked locations, defeating the purpose of the hint.
Why nothing prevents it: The string is only printed, never executed by the codemod itself, so there is no injection or correctness risk to the tool's own behavior — but there is also no quoting/escaping anywhere on this path, so the printed command is simply wrong for any path containing spaces (or other shell-special characters).
Impact and fix: Impact is limited to a misleading console hint, so this is a nit and should not block the PR. The fix is a one-character-pair change — wrap the interpolation in double quotes:
` grep -r '${CODEMOD_ERROR_PREFIX}' "${resolvedDir}"\n`|
|
||
| it('merges same-line diagnostics into a single comment', () => { | ||
| // Test the merge logic directly with synthetic data via the Project API. | ||
| // Two diagnostics on the same line should produce one comment with joined messages. | ||
| const project = new Project({ useInMemoryFileSystem: true }); | ||
| const sf = project.createSourceFile('test.ts', 'const a = 1;\nconst b = 2;\n'); | ||
|
|
||
| const fileResults: FileResult[] = [ | ||
| { | ||
| filePath: sf.getFilePath(), | ||
| changes: 0, | ||
| diagnostics: [ | ||
| { level: 'warning' as never, file: sf.getFilePath(), line: 2, message: 'First issue', insertComment: true }, | ||
| { level: 'warning' as never, file: sf.getFilePath(), line: 2, message: 'Second issue', insertComment: true } | ||
| ] | ||
| } | ||
| ]; | ||
|
|
||
| // Import and call insertDiagnosticComments indirectly by checking the runner behavior. | ||
| // Since insertDiagnosticComments is not exported, we verify via integration: | ||
| // construct a scenario that produces two diagnostics for the same node. | ||
| // Instead, we verify the output format by checking the source file directly. | ||
| // For a true unit test, we'd need to export the function. For now, verify the | ||
| // merge behavior via the runner with a crafted input. | ||
|
|
||
| // Actually, we can use a file that triggers two actionRequired diagnostics on the same line. | ||
| // This is hard to construct naturally, so we test the runner output instead. | ||
| // The key invariant is: if we ever get same-line diagnostics, only one comment appears. | ||
| // The earlier tests already cover the single-comment case. This test documents the intent. | ||
| expect(fileResults[0]!.diagnostics.filter(d => d.line === 2).length).toBe(2); |
There was a problem hiding this comment.
🟡 The test 'merges same-line diagnostics into a single comment' never exercises the merge logic — it only asserts a property of its own synthetic fixture (fileResults[0]!.diagnostics.filter(d => d.line === 2).length), without calling run() or insertDiagnosticComments, so the same-line merge branch in runner.ts has no real coverage despite the test name claiming it. Either export insertDiagnosticComments and unit-test it, craft an input through run() that produces two same-line actionRequired diagnostics, or remove/rename the test.
Extended reasoning...
What the test claims vs. what it does
The new test 'merges same-line diagnostics into a single comment' (packages/codemod/test/commentInsertion.test.ts:190-219) is named after the same-line merge behavior added to insertDiagnosticComments in packages/codemod/src/runner.ts (the prev.message += ' | ' + diag.message branch). However, the test body never invokes that code. It constructs an in-memory ts-morph Project, a throwaway source file, and a synthetic fileResults array with two diagnostics on line 2 — and then its only assertion is:
expect(fileResults[0]!.diagnostics.filter(d => d.line === 2).length).toBe(2);That is an assertion about the fixture literal the test just built, not about runner behavior. Neither run() nor insertDiagnosticComments (which is unexported) is called anywhere in the test.
Step-by-step proof
- The fixture is declared with exactly two entries, both with
line: 2andinsertComment: true. - The assertion filters that same array for
line === 2and checks the count is 2 — true by construction. - Delete or break the merge branch in
insertDiagnosticComments(e.g. remove theprev && prev.line === diag.linecase entirely): this test still passes, because nothing in it touchesrunner.ts. - The inline comments openly acknowledge this: "Since insertDiagnosticComments is not exported... This test documents the intent." along with leftover scaffolding ("Import and call insertDiagnosticComments indirectly...", "Actually, we can use...").
Why no other test covers the merge path
The other tests in commentInsertion.test.ts exercise single-comment insertion, dry-run, idempotency, indentation, and line-shift placement — but none produces two actionRequired diagnostics on the same line. 'inserts multiple comments in one file' uses two different lines, so the descending-sort + merge interaction (merged.at(-1) with equal lines) is never executed by the suite. Same-line merged diagnostics are reachable in practice (e.g. mcpServerApi can emit both the .tool() could not be migrated actionRequired and an emitWrapDiagnostic actionRequired for the same call line), so this is a real, untested path.
Impact
This is a test-quality issue, not a runtime defect: the suite reports a passing test named after behavior it does not check, giving false confidence. If the merge logic regresses (duplicate comments on one line, malformed joined messages), CI will stay green.
How to fix
Pick one of:
- Export
insertDiagnosticCommentsfromrunner.ts(or move it to a testable module) and unit-test it directly with the syntheticfileResultsthe test already builds — that fixture is 90% of the way there. - Drive it through
run()with an input that genuinely produces two same-line actionRequired diagnostics (e.g. aserver.tool(name, nonObjectSchema, cb)call that fails migration and triggers the non-zod schema diagnostic on the same line), then assert the output contains a single comment with both messages joined by|. - Otherwise remove or rename the test so the suite doesn't claim coverage it doesn't have.
Motivation and Context
How Has This Been Tested?
Breaking Changes
Types of changes
Checklist
Additional context