Fix script to handle SARIF file recategorization #187

Merged
PiotrKorkus merged 2 commits into eclipse-score:main from qorix-group:saumya_update_workflow
Apr 16, 2026

Conversation

@Saumya-R
Contributor

@Saumya-R Saumya-R commented Mar 13, 2026

This PR migrates the multi-repo CodeQL workflow from ad-hoc shell scripts to Python tooling, adds GitPython-based checkout support, and makes SARIF recategorization/filtering more robust in CI.

Changes:

Replace parse_repos.sh, checkout_repos.sh, and recategorize_guidelines.sh with Python equivalents under scripts/tooling/cli/workflow/.
Introduce scripts/tooling/lib/git_operations.py (GitPython-based shallow clone with optional token auth) and add GitPython as a dependency.
Update the GitHub Actions workflow to use the new Python scripts, add timeouts/debug steps, and expand CodeQL paths-ignore.
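
The GitPython-based shallow clone with optional token auth mentioned above might look roughly like the sketch below. It is not the code from scripts/tooling/lib/git_operations.py: the function names `build_clone_url` and `shallow_clone`, and the `x-access-token` credential form in the URL, are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import quote, urlsplit, urlunsplit


def build_clone_url(url: str, token: Optional[str] = None) -> str:
    """Return `url` with a token embedded for HTTPS auth (hypothetical helper)."""
    if not token:
        return url
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"token auth needs an https URL, got: {url}")
    # Assumption: the host accepts 'x-access-token:<token>' as HTTPS credentials.
    netloc = f"x-access-token:{quote(token, safe='')}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def shallow_clone(url: str, dest: Path, ref: Optional[str] = None,
                  token: Optional[str] = None) -> None:
    """Clone only the newest commit of `ref` (or the default branch) into `dest`."""
    # Imported lazily so the URL helper can be used without GitPython installed.
    from git import Repo

    kwargs = {"depth": 1}  # passed through to `git clone --depth=1`
    if ref:
        kwargs["branch"] = ref  # branch or tag name
    Repo.clone_from(build_clone_url(url, token), str(dest), **kwargs)
```

Keeping the URL construction separate from the clone call makes the token handling testable without network access, and it keeps the token out of any log line that prints the original URL.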

Closes: #148
Closes: #149

@github-actions

The created documentation from the pull request is available at: docu-html

@Saumya-R Saumya-R force-pushed the saumya_update_workflow branch from 1ad3c92 to 0bfca2d Compare March 13, 2026 10:18
@Saumya-R Saumya-R force-pushed the saumya_update_workflow branch 2 times, most recently from 45c4e88 to 9532b07 Compare March 31, 2026 05:57
@Saumya-R Saumya-R marked this pull request as ready for review March 31, 2026 06:04
Comment thread scripts/workflow/checkout_repos.py
Comment thread scripts/tooling/cli/workflow/parse_repos.py Outdated
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
@Saumya-R Saumya-R marked this pull request as draft April 2, 2026 18:10
@Saumya-R Saumya-R force-pushed the saumya_update_workflow branch 5 times, most recently from bf66e81 to c36b2c1 Compare April 9, 2026 09:01
@Saumya-R Saumya-R marked this pull request as ready for review April 9, 2026 09:52
@Saumya-R Saumya-R requested a review from Copilot April 9, 2026 10:25
Contributor

Copilot AI left a comment

Pull request overview

This PR migrates the multi-repo CodeQL workflow from ad-hoc shell scripts to Python tooling, adds GitPython-based checkout support, and makes SARIF recategorization/filtering more robust in CI.

Changes:

  • Replace parse_repos.sh, checkout_repos.sh, and recategorize_guidelines.sh with Python equivalents under scripts/tooling/cli/workflow/.
  • Introduce scripts/tooling/lib/git_operations.py (GitPython-based shallow clone with optional token auth) and add GitPython as a dependency.
  • Update the GitHub Actions workflow to use the new Python scripts, add timeouts/debug steps, and expand CodeQL paths-ignore.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| scripts/workflow/recategorize_guidelines.sh | Removed legacy shell-based SARIF recategorization script. |
| scripts/workflow/parse_repos.sh | Removed legacy jq-based repo parsing script. |
| scripts/workflow/checkout_repos.sh | Removed legacy git clone loop script. |
| scripts/tooling/requirements.in | Adds GitPython to the tooling dependency inputs. |
| scripts/tooling/lib/git_operations.py | New GitPython-based clone helper used by workflow checkout. |
| scripts/tooling/cli/workflow/recategorize_guidelines.py | New Python SARIF recategorization + filtering step. |
| scripts/tooling/cli/workflow/parse_repos.py | New Python generator for repos.json + GHA outputs. |
| scripts/tooling/cli/workflow/checkout_repos.py | New Python repo checkout runner using git_operations. |
| scripts/tooling/cli/workflow/__init__.py | Adds a placeholder module for workflow CLI organization. |
| .github/workflows/codeql-multiple-repo-scan.yml | Switches workflow to Python scripts; adds caching/timeouts/cleanup and adjusts checkout order. |
| .github/codeql/codeql-config.yml | Expands paths-ignore to reduce irrelevant CodeQL scanning scope. |
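
For the "GHA outputs" part of parse_repos.py listed above, a Python script usually passes values to later workflow steps by appending `name=value` lines to the file named in the `GITHUB_OUTPUT` environment variable. The sketch below shows that mechanism only; the helper name `write_github_outputs` and the output name `repo_count` are hypothetical and not taken from the PR.

```python
from pathlib import Path
from typing import Mapping


def write_github_outputs(outputs: Mapping[str, str], output_file: Path) -> None:
    """Append `name=value` lines to the file that Actions reads step outputs from."""
    with output_file.open("a", encoding="utf-8") as handle:
        for name, value in outputs.items():
            # The simple name=value form only supports single-line values.
            if "\n" in value:
                raise ValueError(f"output {name!r} must be a single line")
            handle.write(f"{name}={value}\n")
```

In a workflow step this could be called as `write_github_outputs({"repo_count": "8"}, Path(os.environ["GITHUB_OUTPUT"]))`, after which a later step can read `steps.<id>.outputs.repo_count`.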

Comment thread scripts/tooling/cli/workflow/parse_repos.py Outdated
Comment thread scripts/tooling/requirements.in
Comment thread scripts/tooling/lib/git_operations.py
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread scripts/tooling/cli/workflow/recategorize_guidelines.py
Comment thread scripts/tooling/cli/workflow/recategorize_guidelines.py
Contributor

@PiotrKorkus PiotrKorkus left a comment

  1. Make the CodeQL analysis executable via a bazel command and use that command in the workflow.
  2. Raise errors directly instead of returning True/False and exiting from a different place in the code when there is no way to recover.
  3. Prints and logging are used inconsistently.
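
Points 2 and 3 could be addressed along the lines of the sketch below: helpers raise on unrecoverable problems, a single place in `main` turns the exception into an exit code, and all messages go through `logging`. The names `WorkflowError` and `load_config` are hypothetical and only illustrate the pattern; they are not from the PR.

```python
import logging
from pathlib import Path

logger = logging.getLogger("workflow")


class WorkflowError(Exception):
    """Unrecoverable failure in a workflow script (hypothetical name)."""


def load_config(path: Path) -> str:
    """Read a config file, raising instead of returning False on failure."""
    if not path.is_file():
        raise WorkflowError(f"config file not found: {path}")
    return path.read_text(encoding="utf-8")


def main(config_path: str) -> int:
    """Run the step and return a process exit code from one place."""
    logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
    try:
        content = load_config(Path(config_path))
    except WorkflowError as exc:
        logger.error("%s", exc)
        return 1
    logger.info("loaded %d characters of configuration", len(content))
    return 0
```

A script entry point would then end with `sys.exit(main(sys.argv[1]))`, so the exit status is decided in exactly one spot.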

Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment on lines +97 to +102
- name: Cleanup repository checkouts
  if: always()
  run: |
    echo "Cleaning up checked out repositories to free disk space"
    rm -rf repos/
    df -h
Contributor

Is it needed?

Contributor Author

  • Frees up several GB of disk space (8 large repositories)
  • Subsequent steps (recategorization, HTML generation) only need SARIF results, not source code

I added this step to free disk space.

Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread scripts/tooling/cli/workflow/checkout_repos.py Outdated
Comment thread scripts/tooling/cli/workflow/checkout_repos.py Outdated
Comment thread scripts/tooling/cli/workflow/parse_repos.py Outdated
Comment thread scripts/tooling/lib/git_operations.py Outdated
Comment thread scripts/tooling/lib/git_operations.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.


Comment thread scripts/tooling/cli/workflow/checkout_repos.py Outdated
Comment thread scripts/tooling/cli/workflow/checkout_repos.py
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread scripts/tooling/cli/workflow/recategorize_guidelines.py
Comment thread scripts/tooling/cli/workflow/recategorize_guidelines.py
Comment thread scripts/tooling/BUILD Outdated
Comment thread scripts/tooling/cli/workflow/checkout_repos.py
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 10 comments.


Comment thread scripts/tooling/requirements.txt Outdated
Comment thread scripts/tooling/requirements.txt Outdated
Comment on lines +34 to +63
def validate_paths():
    """
    Validate that required files exist.

    Note: Only validates files needed for recategorization if SARIF exists.
    Returns:
        True if validation passes or SARIF doesn't exist, False on critical errors
    """
    # First check if SARIF file exists - if not, nothing to recategorize
    if not Path(SARIF_FILE).exists():
        print(f"Info: SARIF file not found at {SARIF_FILE}", file=sys.stderr)
        return False  # Signal to skip recategorization

    # SARIF exists, check for recategorization dependencies
    optional_files = [
        RECATEGORIZE_SCRIPT,
        CODING_STANDARDS_SCHEMA,
        SARIF_SCHEMA,
    ]

    required_files = [
        CODING_STANDARDS_CONFIG,
    ]

    # Check required files (fail if missing)
    for file_path in required_files:
        if not Path(file_path).exists():
            print(f"Error: Required file not found: {file_path}", file=sys.stderr)
            return False


Copilot AI Apr 15, 2026

validate_paths() returns False both when the SARIF file is missing (a non-error “skip” condition) and when a required file is missing (a real error). In main() these are treated the same and exit 0, which can hide misconfiguration. Consider returning a tri-state (e.g. enum) or raising on missing required files so CI fails when SARIF exists but config is missing.
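
The tri-state suggestion could be sketched as follows. This is one possible shape, not the change that was merged: the `Validation` enum, the explicit parameters (the quoted function reads module-level constants instead), and the `exit_code` helper are assumptions for illustration.

```python
import enum
from pathlib import Path
from typing import Iterable


class Validation(enum.Enum):
    OK = "ok"        # SARIF and all required inputs are present
    SKIP = "skip"    # no SARIF file: nothing to recategorize, not an error
    ERROR = "error"  # SARIF exists but a required input is missing


def validate_paths(sarif_file: Path, required_files: Iterable[Path]) -> Validation:
    """Distinguish 'nothing to do' from 'misconfigured' instead of returning False for both."""
    if not sarif_file.exists():
        return Validation.SKIP
    missing = [path for path in required_files if not path.exists()]
    if missing:
        return Validation.ERROR
    return Validation.OK


def exit_code(result: Validation) -> int:
    """Only a real error should fail CI; OK and SKIP both exit 0."""
    return 1 if result is Validation.ERROR else 0
```

With this shape, `main()` can exit 0 on `SKIP`, continue on `OK`, and fail the job on `ERROR`, so a missing config no longer looks like a missing SARIF file.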

Comment thread scripts/tooling/cli/workflow/recategorize_guidelines.py Outdated
Comment thread scripts/tooling/requirements.txt Outdated
Comment thread scripts/tooling/requirements.txt Outdated
Comment thread scripts/tooling/requirements.txt Outdated
Comment thread scripts/tooling/cli/workflow/checkout_repos.py
Comment thread scripts/tooling/requirements.txt Outdated
Comment thread scripts/tooling/requirements.txt Outdated
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread .github/workflows/codeql-multiple-repo-scan.yml Outdated
Comment thread scripts/tooling/cli/workflow/checkout_repos.py Outdated
adding GitPython lib

cleaning the files for redundant

made Codeql executable with bazel

fixing bzel run issues

adding a version

removing timeout

addresses several issues related to environment-specific paths and execution context

removing the ref
@Saumya-R Saumya-R force-pushed the saumya_update_workflow branch from 16f4461 to edcaf22 Compare April 16, 2026 11:43
@PiotrKorkus PiotrKorkus merged commit 213cc8c into eclipse-score:main Apr 16, 2026
17 of 18 checks passed

Development

Successfully merging this pull request may close these issues.

codeql - optimize workflow
codeql - exclude from report

3 participants