Conversation
Implements the Node.js fs.glob APIs that were previously stubbed with ERR_UNSUPPORTED_OPERATION. Uses a custom glob-to-regex pattern matcher instead of the minimatch library which is unavailable in the Workers runtime. Closes cloudflare#5416
|
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
|
Hey @fraidev! thank you for this! I've done a first pass look over and have asked the AI agent to review also.. it should post it's findings in a moment. How closely does this follow the node.js implementation? (I haven't looked at it in a while) Is it a straight up port of the original or did you reimplement? The main reason I ask is that if it is a straight port over, we might need to carry over the Node.js copyright/license detail in the header of the new file(s). |
jasnell
left a comment
There was a problem hiding this comment.
Review Summary
Note: This review was generated by an AI assistant (Claude) and may contain inaccuracies. Please evaluate each suggestion on its merits.
Findings (6):
- [HIGH]
excludetype handling — unsafe cast incompileExcludewhenexcludeis a user function - [MEDIUM]
**in middle of exclude pattern doesn't match zero segments (a/**/bwon't matcha/b) - [MEDIUM]
walkGlobhas unbounded recursion depth — no depth guard - [MEDIUM] Non-null assertions (
!) used extensively — violates ts-style convention - [MEDIUM] Missing callback guard in
fs.glob— throws when callback omitted - [LOW] Copyright year says 2017-2022 in new file
| ): ((path: string) => boolean) | undefined { | ||
| if (exclude === undefined) return undefined; | ||
| if (typeof exclude === 'function') | ||
| return exclude as (path: string) => boolean; |
There was a problem hiding this comment.
[HIGH] The compileExclude function casts a user-provided (path: string | Dirent) => boolean function down to (path: string) => boolean, losing the Dirent type information. This works because the caller in globSync re-casts it back when isExcludeFunction is true, but the round-trip through compileExclude is misleading and fragile.
Consider handling user functions separately — don't route them through compileExclude at all. Have compileExclude only accept readonly string[] | undefined, and keep the user function as a separate variable in globSync. This eliminates both unsafe casts and the isExcludeFunction tracking.
This review comment was generated by an AI assistant.
| regex += '(?:\\/.*)?'; | ||
| } else { | ||
| // ** in middle: match zero or more middle segments | ||
| regex += '(?:\\/[^/]+)*\\/'; |
There was a problem hiding this comment.
[MEDIUM] The ** in middle case generates (?:\/[^/]+)*\/ which requires at least one trailing / separator. This means a pattern like a/**/b won't match a/b (zero intermediate segments) when used as an exclude pattern.
The ** glob should match zero or more path segments. Consider changing the regex to allow zero segments, e.g.:
| regex += '(?:\\/[^/]+)*\\/'; | |
| regex += '(?:\/[^/]+)*(?:\/)?'; |
Or handle it by generating an alternation: a\/b|a\/.*\/b.
Note: this only affects the globToRegex function used for exclude patterns — the walkGlob directory walker handles ** correctly via recursive enumeration.
This review comment was generated by an AI assistant.
| handle: DirEntryHandle | null; | ||
| } | ||
|
|
||
| export function walkGlob( |
There was a problem hiding this comment.
[MEDIUM] walkGlob recurses for each directory level and for each ** expansion. While the visitedGlobstar set prevents infinite loops for **, there's no depth limit for deeply nested directory structures. A deeply nested VFS could cause a stack overflow.
Consider adding a maxDepth parameter (e.g., 256) that decrements on each recursive call, returning early when exhausted.
This review comment was generated by an AI assistant.
|
|
||
| for (let i = 0; i < str.length; i++) { | ||
| const c = str[i]!; | ||
| if (c === '\\' && i + 1 < str.length) { |
There was a problem hiding this comment.
[MEDIUM] Non-null assertions (!) are used extensively throughout this file (~15+ times, e.g., str[i]!, segment[i]!, segments[segIdx]!, parts.pop()!). The project's ts-style convention says: "Never use the non-null assertion (!). Prefer proper narrowing or ?? defaultValue."
For loop-indexed array access where noUncheckedIndexedAccess forces T | undefined, the common pattern in this codebase is to restructure with for...of where possible. At minimum, parts.pop()! (in globToRegex) should use ?? '' since pop() on an empty array returns undefined.
This review comment was generated by an AI assistant.
| } else { | ||
| options = optionsOrCallback; | ||
| } | ||
| callWithSingleArgCallback(() => fssync.globSync(pattern, options), callback); |
There was a problem hiding this comment.
[MEDIUM] If glob is called as glob(pattern, options) without a callback (which is valid usage in Node.js), callback will be undefined after argument parsing, and callWithSingleArgCallback will throw ERR_INVALID_ARG_TYPE.
Node.js silently ignores missing callbacks for fs.glob. Consider adding a guard:
if (callback === undefined) return;before the callWithSingleArgCallback call.
This review comment was generated by an AI assistant.
| @@ -0,0 +1,519 @@ | |||
| // Copyright (c) 2017-2022 Cloudflare, Inc. | |||
There was a problem hiding this comment.
[LOW] This is a new file but the copyright header says 2017-2022. Should be updated to the current year.
This review comment was generated by an AI assistant.
Implements the Node.js fs.glob APIs that were previously stubbed with ERR_UNSUPPORTED_OPERATION. Uses a custom glob-to-regex pattern matcher.
Closes #5416