Fix working directory reuse bugs and use persistent cache dir by default#82
Fix working directory reuse bugs and use persistent cache dir by default#82RadekManak wants to merge 3 commits intoopenshift-eng:mainfrom
Conversation
f27175c to
0eb5dc9
Compare
A source URL switch can leave stale refs in the existing .git directory and corrupt ancestry checks during rebase, so recreate git metadata to ensure clean refs.
Set credential.username under each repository-specific credential section so credential config no longer leaks globally across remotes.
Switch the CLI default from `.rebase` to an XDG-aware cache location, treat empty XDG_CACHE_HOME as unset, and add tests/documentation for the new behavior.
0eb5dc9 to
2ec3e43
Compare
|
@kaovilai: changing LGTM is restricted to collaborators DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kaovilai, RadekManak The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
WalkthroughUpdated rebasebot to use a persistent cache directory (e.g., Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Assessment against linked issues
Out-of-scope changes
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
rebasebot/bot.py (1)
555-559: Please add direct_init_working_dir()regression tests.The two highest-risk changes in this PR live here: repo-scoped credential sections and
.gitreinitialization on source changes. The added tests only cover CLI working-dir resolution, so these behaviors are still unpinned.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rebasebot/bot.py` around lines 555 - 559, Add direct regression tests that call the _init_working_dir() function to cover repo-scoped credential sections and .git reinitialization on source changes: write unit tests that (1) create a temporary repo, set a credential section like f'credential "{repo}"' and verify config.set_value results in correct credential entry when _init_working_dir() runs, and (2) simulate changing the source (e.g., different remote URL or empty .git) and assert that _init_working_dir() reinitializes the .git directory/state as expected; place tests to invoke rebasebot.bot._init_working_dir() directly (not via CLI) so these behaviors are pinned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rebasebot/cli.py`:
- Around line 44-47: The function _default_working_dir currently accepts a
relative XDG_CACHE_HOME; update it to reject non-absolute XDG_CACHE_HOME per the
XDG spec by reading os.environ.get("XDG_CACHE_HOME") into cache_home_raw,
checking if cache_home_raw and os.path.isabs(cache_home_raw) are True, and only
then use cache_home = cache_home_raw; otherwise fall back to
os.path.join(os.path.expanduser("~"), ".cache"); finally return
os.path.join(cache_home, "rebasebot"). Ensure you reference the
_default_working_dir function and the XDG_CACHE_HOME env var in the change.
---
Nitpick comments:
In `@rebasebot/bot.py`:
- Around line 555-559: Add direct regression tests that call the
_init_working_dir() function to cover repo-scoped credential sections and .git
reinitialization on source changes: write unit tests that (1) create a temporary
repo, set a credential section like f'credential "{repo}"' and verify
config.set_value results in correct credential entry when _init_working_dir()
runs, and (2) simulate changing the source (e.g., different remote URL or empty
.git) and assert that _init_working_dir() reinitializes the .git directory/state
as expected; place tests to invoke rebasebot.bot._init_working_dir() directly
(not via CLI) so these behaviors are pinned.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 02f62e80-d513-43d5-97b5-c2f1bee8786d
📒 Files selected for processing (5)
README.mdrebasebot/bot.pyrebasebot/cli.pytests/test_bot.pytests/test_cli.py
| # If the source URL changed, stale refs from the previous source repo remain | ||
| # in the local git store and can corrupt rebase operations (wrong ancestry | ||
| # checks, wrong commit filtering, wrong cherry-pick detection). Reinitializing | ||
| # .git is the only safe way to clear them. | ||
| if "source" in gitwd.remotes and gitwd.remotes["source"].url != source.url: | ||
| logging.warning( | ||
| "Source URL changed from %s to %s; reinitializing working directory " | ||
| "to remove stale refs", | ||
| gitwd.remotes["source"].url, | ||
| source.url, | ||
| ) | ||
| git_dir = gitwd.git_dir | ||
| gitwd.close() | ||
| shutil.rmtree(git_dir) | ||
| gitwd = git.Repo.init(path=workdir) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -name "*.py" -path "*/rebasebot/*" | head -20Repository: openshift-eng/rebasebot
Length of output: 181
🏁 Script executed:
cd rebasebot && wc -l bot.pyRepository: openshift-eng/rebasebot
Length of output: 77
🏁 Script executed:
sed -n '520,560p' rebasebot/bot.pyRepository: openshift-eng/rebasebot
Length of output: 1546
🏁 Script executed:
# Search for fetch-related code in bot.py
rg -n "fetch" rebasebot/bot.py -A 3 -B 1Repository: openshift-eng/rebasebot
Length of output: 1899
🏁 Script executed:
# Look for filter="blob:none" references
rg -n "filter" rebasebot/bot.py -B 2 -A 2Repository: openshift-eng/rebasebot
Length of output: 1191
🏁 Script executed:
# Check for git version checks or compatibility notes
rg -n "git.*2\.52|version|compatibility" rebasebot/bot.py -iRepository: openshift-eng/rebasebot
Length of output: 145
🏁 Script executed:
# Search for test files related to working directory initialization
find . -name "*test*.py" -type f | xargs rg -l "init_working_dir" 2>/dev/nullRepository: openshift-eng/rebasebot
Length of output: 90
🏁 Script executed:
# Check the issue `#79` context if mentioned in comments or docs
find . -name "*.md" -o -name "*.txt" -o -name "*.rst" | xargs rg -l "#79|issue.*79" 2>/dev/null | head -5Repository: openshift-eng/rebasebot
Length of output: 49
🏁 Script executed:
# Examine the fetch calls and their parameters in detail
sed -n '409p;572p;573p;576p;579p;582p;610p' rebasebot/bot.pyRepository: openshift-eng/rebasebot
Length of output: 455
🏁 Script executed:
# Look at the test file to see coverage of reused workdir scenarios
head -100 tests/test_rebases.pyRepository: openshift-eng/rebasebot
Length of output: 3167
🏁 Script executed:
# Search for any git version compatibility or 2.52 issues mentioned
rg -n "2\.52|forced|force.*fetch|--force" rebasebot/ tests/ -B 2 -A 2Repository: openshift-eng/rebasebot
Length of output: 49
🏁 Script executed:
# Check for any documentation or comments about the git 2.52 issue
find . -name "*.md" -o -name "*.txt" -o -name "*.rst" | xargs cat 2>/dev/null | rg -i "git.*2\.52|blob:none" -B 3 -A 3 | head -50Repository: openshift-eng/rebasebot
Length of output: 49
🏁 Script executed:
# Look for git version checks in the codebase
rg -n "version|compatibility" rebasebot/ -i | grep -i git | head -20Repository: openshift-eng/rebasebot
Length of output: 283
🏁 Script executed:
# Check git module to understand fetch parameters
python3 << 'EOF'
# Check GitPython documentation for fetch() behavior
import inspect
try:
from git import Repo
# Print the fetch method signature if available
help(Repo.remotes.__class__)
except:
print("GitPython not available in this environment")
EOFRepository: openshift-eng/rebasebot
Length of output: 110
🏁 Script executed:
# Search for PR or issue references in commits or documentation
git log --all --oneline | grep -i "#79\|issue.*79" | head -10Repository: openshift-eng/rebasebot
Length of output: 49
🏁 Script executed:
# Look for any comments or context about the reused workdir issue
rg -n "reused|workdir.*reuse|stale.*ref|ref.*stale" rebasebot/ -i -B 2 -A 2Repository: openshift-eng/rebasebot
Length of output: 771
🏁 Script executed:
# Check if there are any environment variables or configuration for git behavior
rg -n "GIT_|git.*config|git.*option" rebasebot/ -B 1 -A 1Repository: openshift-eng/rebasebot
Length of output: 1389
🏁 Script executed:
# Search for the actual issue or PR comments in markdown files or docs
find . -type f \( -name "*.md" -o -name "*.txt" -o -name "CHANGELOG*" -o -name "*.rst" \) -exec cat {} \; 2>/dev/null | head -500Repository: openshift-eng/rebasebot
Length of output: 20153
🏁 Script executed:
# Look more carefully at how remotes are used in rebase operations
sed -n '570,620p' rebasebot/bot.pyRepository: openshift-eng/rebasebot
Length of output: 2268
🏁 Script executed:
# Check if there are any comments about forced updates or ref handling
rg -n "forced\|force\|ref.*contaminat\|stale" rebasebot/bot.py -iRepository: openshift-eng/rebasebot
Length of output: 49
🏁 Script executed:
# Look at the entire _init_working_dir function to understand the complete flow
sed -n '500,650p' rebasebot/bot.pyRepository: openshift-eng/rebasebot
Length of output: 5675
The reused-workdir fix is incomplete and does not address known compatibility issues.
The reinit logic (lines 522–536) only handles the "source URL changed" case. When remote refs are rewritten in the same repository without a URL change, stale refs can still corrupt the working directory. Additionally, the fetch calls at lines 579 and 582 unconditionally use filter="blob:none", but the project's own AGENTS.md documents that issue #79 explicitly identified this filter as problematic on git 2.52—yet no version check or conditional handling exists in the code to avoid this on incompatible git versions.
Test coverage for reused workdir scenarios with ref rewrites and git 2.52 compatibility is absent.
| def _default_working_dir() -> str: | ||
| cache_home = os.environ.get("XDG_CACHE_HOME") or os.path.join( | ||
| os.path.expanduser("~"), ".cache") | ||
| return os.path.join(cache_home, "rebasebot") |
There was a problem hiding this comment.
❓ Verification inconclusive
Script executed:
cat -n rebasebot/cli.py | sed -n '40,50p'Repository: openshift-eng/rebasebot
Repository: openshift-eng/rebasebot
Exit code: 0
stdout:
40 except ValueError as e:
41 parser.error(str(e))
42
43
44 def _default_working_dir() -> str:
45 cache_home = os.environ.get("XDG_CACHE_HOME") or os.path.join(
46 os.path.expanduser("~"), ".cache")
47 return os.path.join(cache_home, "rebasebot")
48
49
50 # parse_cli_arguments parses command line arguments using argparse and returns
XDG Base Directory specification XDG_CACHE_HOME absolute path requirement
Yes. The XDG Base Directory Specification states that all paths set in the XDG environment variables—including XDG_CACHE_HOME—must be absolute; if an implementation encounters a relative path, it should treat it as invalid and ignore it. [1]
Also, if XDG_CACHE_HOME is unset or empty, the specified default is $HOME/.cache. [1]
Source: XDG Base Directory Specification v0.8 (May 8, 2021). [1]
Reject relative XDG_CACHE_HOME values per XDG Base Directory Specification.
The XDG Base Directory Specification (v0.8) requires that all XDG environment variables, including XDG_CACHE_HOME, be absolute paths; relative paths must be treated as invalid and ignored, falling back to the default $HOME/.cache. The current code accepts relative paths unchecked, which would resolve them relative to the caller's current working directory—reintroducing the persistence problem this change aims to eliminate.
Proposed fix
def _default_working_dir() -> str:
- cache_home = os.environ.get("XDG_CACHE_HOME") or os.path.join(
- os.path.expanduser("~"), ".cache")
+ cache_home = os.environ.get("XDG_CACHE_HOME")
+ if not cache_home or not os.path.isabs(cache_home):
+ cache_home = os.path.join(os.path.expanduser("~"), ".cache")
return os.path.join(cache_home, "rebasebot")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rebasebot/cli.py` around lines 44 - 47, The function _default_working_dir
currently accepts a relative XDG_CACHE_HOME; update it to reject non-absolute
XDG_CACHE_HOME per the XDG spec by reading os.environ.get("XDG_CACHE_HOME") into
cache_home_raw, checking if cache_home_raw and os.path.isabs(cache_home_raw) are
True, and only then use cache_home = cache_home_raw; otherwise fall back to
os.path.join(os.path.expanduser("~"), ".cache"); finally return
os.path.join(cache_home, "rebasebot"). Ensure you reference the
_default_working_dir function and the XDG_CACHE_HOME env var in the change.
Three fixes for
_init_working_dir()related to working directory reuse:Reinitialize
.gitwhen source URL changes: When rebasebot reuses theworking directory (the default) with a different source repository, stale
tags and branches from the previous source accumulate in the local git
object store. These corrupt ancestry checks, commit filtering, and
cherry-pick detection. If the
sourceremote URL has changed since the lastrun,
.gitis now deleted and reinitialized before fetching, ensuring cleanrefs.
Scope
credential.usernameper repository URL: The globalcredential.username = x-access-tokenconfig was applied to all remotesincluding
source, causing git to attempt authentication with no passwordwhen fetching from a public source repository. The
usernameentry is nowwritten only into the per-URL credential sections for
destandrebase.Use XDG-aware persistent cache directory by default: The previous default
of
.rebasewas relative to the current working directory, meaning the cachelocation would vary depending on where rebasebot was invoked from. The new
default is
$XDG_CACHE_HOME/rebasebot(falling back to~/.cache/rebasebot),which is stable, user-specific, and outside the working directory.
Fixes: #79
Fixes: #36