Fix script to handle SARIF file recategorization #187
PiotrKorkus merged 2 commits into eclipse-score:main
The created documentation from the pull request is available at: docu-html
Pull request overview
This PR migrates the multi-repo CodeQL workflow from ad-hoc shell scripts to Python tooling, adds GitPython-based checkout support, and makes SARIF recategorization/filtering more robust in CI.
Changes:
- Replace parse_repos.sh, checkout_repos.sh, and recategorize_guidelines.sh with Python equivalents under scripts/tooling/cli/workflow/.
- Introduce scripts/tooling/lib/git_operations.py (GitPython-based shallow clone with optional token auth) and add GitPython as a dependency.
- Update the GitHub Actions workflow to use the new Python scripts, add timeouts/debug steps, and expand CodeQL paths-ignore.
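For readers unfamiliar with the approach, a GitPython-based shallow clone with optional token auth could look roughly like the sketch below. This is not the code from git_operations.py: the names `with_token` and `shallow_clone`, the `x-access-token` user name, and the parameter layout are assumptions made for illustration only.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def with_token(url: str, token: Optional[str]) -> str:
    """Return url with the token embedded as HTTPS credentials (hypothetical helper)."""
    if not token:
        return url
    parts = urlsplit(url)
    if parts.scheme != "https":
        # Only HTTPS URLs get credentials; SSH-style URLs are returned unchanged.
        return url
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    # "x-access-token" as the user name is an assumption of this sketch.
    netloc = f"x-access-token:{token}@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def shallow_clone(url: str, dest: Path, ref: Optional[str] = None,
                  token: Optional[str] = None) -> None:
    """Clone only the most recent commit of url into dest."""
    # Imported lazily so with_token() stays usable without GitPython installed.
    from git import Repo

    options = {"depth": 1}  # forwarded to `git clone --depth=1`
    if ref:
        options["branch"] = ref  # `git clone --branch` accepts a branch or tag name
    Repo.clone_from(with_token(url, token), str(dest), **options)
```

Embedding a token in the clone URL is simple, but the URL ends up in the clone's .git/config and can appear in error messages, so it should only be used on short-lived CI runners.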
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| scripts/workflow/recategorize_guidelines.sh | Removed legacy shell-based SARIF recategorization script. |
| scripts/workflow/parse_repos.sh | Removed legacy jq-based repo parsing script. |
| scripts/workflow/checkout_repos.sh | Removed legacy git clone loop script. |
| scripts/tooling/requirements.in | Adds GitPython to the tooling dependency inputs. |
| scripts/tooling/lib/git_operations.py | New GitPython-based clone helper used by workflow checkout. |
| scripts/tooling/cli/workflow/recategorize_guidelines.py | New Python SARIF recategorization + filtering step. |
| scripts/tooling/cli/workflow/parse_repos.py | New Python generator for repos.json + GHA outputs. |
| scripts/tooling/cli/workflow/checkout_repos.py | New Python repo checkout runner using git_operations. |
| scripts/tooling/cli/workflow/__init__.py | Adds a placeholder module for workflow CLI organization. |
| .github/workflows/codeql-multiple-repo-scan.yml | Switches workflow to Python scripts; adds caching/timeouts/cleanup and adjusts checkout order. |
| .github/codeql/codeql-config.yml | Expands paths-ignore to reduce irrelevant CodeQL scanning scope. |
PiotrKorkus left a comment:
- Make the CodeQL analysis executable with a bazel command and use it in the workflow.
- Raise errors directly instead of returning True/False and exiting from a different place in the code when there is no way to recover.
- Prints and logging are used inconsistently.
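One way to act on the last two points is to raise a dedicated exception where the failure is detected and to let a single `main()` decide the exit code, reporting only through `logging`. The sketch below is illustrative: `WorkflowError`, `load_config`, and the JSON config file are hypothetical and not taken from this PR.

```python
import json
import logging
from pathlib import Path
from typing import List

logger = logging.getLogger("workflow")


class WorkflowError(Exception):
    """Unrecoverable problem in a workflow step (hypothetical name)."""


def load_config(path: Path) -> dict:
    """Load a JSON config file, raising instead of returning True/False."""
    if not path.exists():
        raise WorkflowError(f"Required file not found: {path}")
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise WorkflowError(f"Invalid JSON in {path}: {exc}") from exc


def main(argv: List[str]) -> int:
    """Single place that turns exceptions into a log message and an exit code."""
    logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
    if len(argv) != 2:
        logger.error("usage: %s CONFIG_JSON", argv[0] if argv else "script")
        return 2
    try:
        config = load_config(Path(argv[1]))
    except WorkflowError as exc:
        logger.error("%s", exc)
        return 1
    logger.info("Loaded %d top-level entries", len(config))
    return 0
```

A script using this pattern would end with `sys.exit(main(sys.argv))`, so every failure path goes through the same handler and no helper calls `sys.exit()` or `print()` on its own.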
    - name: Cleanup repository checkouts
      if: always()
      run: |
        echo "Cleaning up checked out repositories to free disk space"
        rm -rf repos/
        df -h
- Frees several GB of disk space (8 large repositories).
- Subsequent steps (recategorization, HTML generation) only need the SARIF results, not the source code.
I added this step to free disk space.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 10 comments.
    def validate_paths():
        """
        Validate that required files exist.

        Note: Only validates files needed for recategorization if SARIF exists.
        Returns:
            True if validation passes or SARIF doesn't exist, False on critical errors
        """
        # First check if SARIF file exists - if not, nothing to recategorize
        if not Path(SARIF_FILE).exists():
            print(f"Info: SARIF file not found at {SARIF_FILE}", file=sys.stderr)
            return False  # Signal to skip recategorization

        # SARIF exists, check for recategorization dependencies
        optional_files = [
            RECATEGORIZE_SCRIPT,
            CODING_STANDARDS_SCHEMA,
            SARIF_SCHEMA,
        ]

        required_files = [
            CODING_STANDARDS_CONFIG,
        ]

        # Check required files (fail if missing)
        for file_path in required_files:
            if not Path(file_path).exists():
                print(f"Error: Required file not found: {file_path}", file=sys.stderr)
                return False
validate_paths() returns False both when the SARIF file is missing (a non-error “skip” condition) and when a required file is missing (a real error). In main() these are treated the same and exit 0, which can hide misconfiguration. Consider returning a tri-state (e.g. enum) or raising on missing required files so CI fails when SARIF exists but config is missing.
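A minimal sketch of the tri-state variant suggested above, assuming the paths are passed in as arguments rather than read from module constants. The `Validation` enum, the changed signature, and the `exit_code` helper are hypothetical and only illustrate the idea.

```python
import enum
from pathlib import Path
from typing import Iterable


class Validation(enum.Enum):
    OK = "ok"        # SARIF and all required inputs are present
    SKIP = "skip"    # no SARIF file: nothing to recategorize
    ERROR = "error"  # SARIF present but a required input is missing


def validate_paths(sarif_file: Path, required_files: Iterable[Path]) -> Validation:
    """Distinguish "nothing to do" from "misconfigured" instead of returning False for both."""
    if not sarif_file.exists():
        return Validation.SKIP
    missing = [path for path in required_files if not path.exists()]
    if missing:
        return Validation.ERROR
    return Validation.OK


def exit_code(result: Validation) -> int:
    """Map the validation result to a process exit code: only ERROR fails CI."""
    return 1 if result is Validation.ERROR else 0
```

With this split, a missing SARIF file still exits 0, while an existing SARIF file combined with a missing config makes the job fail visibly.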
- adding GitPython lib
- cleaning the files for redundant
- made CodeQL executable with bazel
- fixing bazel run issues
- adding a version
- removing timeout
- addresses several issues related to environment-specific paths and execution context
- removing the ref
Closes: #148
Closes: #149