Fix git status parser to handle paths with spaces and renames (#271)#272
Fix git status parser to handle paths with spaces and renames (#271)#272nickeddy wants to merge 4 commits into
Conversation
lucieleblanc
left a comment
There was a problem hiding this comment.
Hi @nickeddy , thank you for the fix. Since this change touches generators, could you provide screenshots of the generators working in a local build of Warp? The patch process is explained in the test-local-warp skill. If you're testing on MacOS, you can also use cmd-shift-5 to start a screen recording of the app.
I'll be happy to review this again once a screen capture or a set of screenshots showing the changes are uploaded and displayed in the PR description or a comment. Thanks!
…tdev#271) Switch `files_for_staging` to `git status --short -z` so paths arrive NUL-separated and unquoted, and parse each record by taking everything after the 3-byte `XY ` status prefix instead of splitting on whitespace. This preserves filenames containing spaces, and the parser now skips the source record of rename (`R`) and copy (`C`) entries since only the destination exists on disk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
90ade7e to
506d3e0
Compare
lucieleblanc
left a comment
There was a problem hiding this comment.
I agree with the general direction of these changes, but there are a few details we should clarify and another generator we should avoid regressing.
| } | ||
| } | ||
|
|
||
| fn post_process_tracked_files(output: &str) -> GeneratorResults { |
There was a problem hiding this comment.
This function is also used by get_changed_or_tracked_files, which still uses a newline-separated generator. We might need update that generator, or split this post-processing function into its own path to avoid regressing that generator.
In any case, please add a test covering get_changed_or_tracked_files so we can avoid regressing it too.
There was a problem hiding this comment.
I've confirmed there was a regression here due to my changes. I've added a test for this and updated get_changed_or_tracked_files to drop the sed requirement for mac compatibility. All tests are passing and I've verified the fix works still for files with spaces and didn't break changed/tracked files.
| // `git status --short -z` emits NUL-separated records of the form `XY <path>` | ||
| // (two-byte status code + space + raw pathname, no C-style quoting). Renames | ||
| // and copies are emitted as two records: `R <to>\0<from>\0` — the source | ||
| // path follows the destination and we skip it because that file no longer | ||
| // exists on disk. |
There was a problem hiding this comment.
This comment seems specific to renamed files. "The source path [...] no longer exists on disk" is a little misleading. Is that always true? What does the second path represent for copied files?
Please update the PR description as well to clarify this.
…andling
Switching files_for_staging to git status --short -z made post_process_tracked_files
a NUL-record parser, but get_changed_or_tracked_files still fed it newline-separated
output (git status --short | sed). split('\0') then yields a single record, collapsing
the whole blob into one garbage suggestion with embedded newlines.
Drop the sed pipe (its -z flag is GNU-only and breaks on macOS) and let git filter
directly: git diff [--cached] --diff-filter=AM --name-only -z. --cached inspects the
index, without it the working tree; --diff-filter=AM keeps the original M/A-only
behavior; --name-only -z emits bare NUL-separated paths handled by a dedicated
post_process_diff_name_only.
Also correct the post_process_tracked_files comment: for copies the rename/copy origin
still exists on disk, so we skip it because it is unchanged, not because it is gone.
Add tests covering the new get_changed_or_tracked_files path (incl. spaces, empty,
fatal-error) so it cannot silently regress again.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
A few more changes. In general, please prefer concise comments and clearer code, and comment inline in GitHub instead of adding comments about the diff in the diff itself.
Also, the PR description should not be this long. Please shorten it to explain the core of the change. The "verification" section should not include lints/fmt/clippy. Those are prerequisites; describe newly added/modified test cases instead.
Once again, please include screenshots of all affected generators in a comment to show they work as intended with all these edge cases. Thanks!
| /// It can't share the `git status --short -z` path used by | ||
| /// `files_for_staging`, because it needs to distinguish staged from | ||
| /// working-tree changes and that requires filtering NUL records — for which | ||
| /// the only portable tool is git itself (`sed -z`/`grep -z` are GNU-only and | ||
| /// break on macOS). It therefore runs `git diff [--cached] --diff-filter=AM | ||
| /// --name-only -z`, which emits bare NUL-separated pathnames with no `XY ` | ||
| /// status prefix. The post-processor must yield one suggestion per record | ||
| /// and preserve spaces in pathnames. |
| /// for the `get_changed_or_tracked_files` generator. | ||
| /// | ||
| /// That generator can't share the `git status --short -z` path used by | ||
| /// `files_for_staging`: it needs to distinguish staged (index) changes from | ||
| /// working-tree changes, and the only place that distinction is visible is the | ||
| /// command itself (the post-processor receives stdout alone). Doing the | ||
| /// distinction with `git status` would mean filtering NUL records in the shell, | ||
| /// which requires GNU-only `sed -z`/`grep -z` and breaks on macOS. Letting git | ||
| /// filter via `--diff-filter` keeps everything portable. | ||
| /// | ||
| /// `--name-only -z` emits bare NUL-separated pathnames (no `XY ` status prefix, | ||
| /// no C-style quoting), so each non-empty record is already a complete path. |
There was a problem hiding this comment.
Delete "for the get_changed_or_tracked_files generator". It's brittle and will go out of date too easily if we add another generator that uses this post-processing.
Most of this comment explains why get_changed_or_tracked_files uses git diff, which doesn't belong here. Keep comments minimal, concise and put them on the relevant functions only.
Cut the verbose doc/inline comments down to concise descriptions on the relevant functions; the git-diff rationale moves to a PR review comment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Testing on nickeddy/warp@becc1fd + nickeddy/command-signatures@208a88b after setting up the following git folder:
|



Switch
files_for_stagingtogit status --short -zso paths arrive NUL-separatedand unquoted; parse each record by taking everything after the 3-byte
XYprefixinstead of splitting on whitespace, preserving filenames with spaces. Rename (
R)and copy (
C) entries span two records — we keep the<to>path and skip the origin.get_changed_or_tracked_filesshared that post-processor but still emittednewline-separated
git status … | sedoutput, which the NUL parser collapsed into onegarbage suggestion. It now uses
git diff [--cached] --diff-filter=AM --name-only -z(git does the filtering — portable, unlike GNU-only
sed -z), with its own NULpath-splitting post-processor.
Tests added:
test_post_process_diff_name_only(NUL paths incl. one with spaces),_empty,_fatal_error. Existingfiles_for_stagingtests already cover renames,copies, spaces, and untracked dirs.