Copilot AI commented Dec 31, 2025

The push_repo_memory job failed with "File does not match allowed patterns: history.jsonl" even though the glob filter ["*.json", "*.jsonl", "*.csv", "*.md"] was configured. The pattern matching logic itself is correct (*.jsonl -> ^[^/]*\.jsonl$, which matches history.jsonl), so the failure more likely stems from environment variable propagation, path formatting, or timing issues.

Changes

Environment diagnostics

  • Log FILE_GLOB_FILTER value, length, and pattern count at startup
  • Log all relevant env vars (MEMORY_ID, MAX_FILE_SIZE, MAX_FILE_COUNT)
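
As a rough illustration of the startup diagnostics listed above, the sketch below builds the log lines from an environment object. The environment variable names come from this PR; the function name `describeEnvironment`, the log wording, and the use of `console.log` are assumptions rather than the code that was merged.

```javascript
// Hypothetical startup diagnostics. Only the env var names are from the PR;
// the function name and message format are illustrative assumptions.
function describeEnvironment(env) {
  const filter = env.FILE_GLOB_FILTER || "";
  // Patterns are assumed to be whitespace-separated, as the PR describes.
  const patterns = filter.trim().split(/\s+/).filter(Boolean);
  return [
    `FILE_GLOB_FILTER: "${filter}" (length: ${filter.length}, patterns: ${patterns.length})`,
    `MEMORY_ID: ${env.MEMORY_ID ?? "(unset)"}`,
    `MAX_FILE_SIZE: ${env.MAX_FILE_SIZE ?? "(unset)"}`,
    `MAX_FILE_COUNT: ${env.MAX_FILE_COUNT ?? "(unset)"}`,
  ];
}

for (const line of describeEnvironment(process.env)) {
  console.log(line);
}
```

Quoting the value and printing its length makes stray whitespace or an empty variable visible, which is the kind of propagation problem suspected here.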

Pattern matching visibility

  • Debug-level logs showing per-file pattern matching with compiled regexes
  • Enhanced error messages displaying each pattern's regex and match result
  • File scan summary listing discovered files and sizes
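
The file scan summary could look roughly like the sketch below. A commit message later in this PR mentions showing the first 10 files, or all of them when 10 or fewer are found; the function name `summarizeFiles`, the `{ path, size }` record shape, and the message wording are assumptions made for illustration.

```javascript
// Hypothetical scan summary. The 10-file cap follows the PR's commit message;
// everything else (names, record shape, wording) is assumed.
function summarizeFiles(files, limit = 10) {
  const lines = [`Found ${files.length} file(s)`];
  for (const file of files.slice(0, limit)) {
    lines.push(`  ${file.path} (${file.size} bytes)`);
  }
  if (files.length > limit) {
    lines.push(`  ... and ${files.length - limit} more`);
  }
  return lines;
}

const sample = [
  { path: "history.jsonl", size: 2048 },
  { path: "metrics/daily.json", size: 512 },
];
console.log(summarizeFiles(sample).join("\n"));
```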

Code refactoring

  • Created separate glob_pattern_helpers.cjs module with three exported functions:
    • globPatternToRegex(pattern): Convert glob pattern to RegExp
    • parseGlobPatterns(fileGlobFilter): Parse space-separated glob patterns
    • matchesGlobPattern(filePath, fileGlobFilter): Check if file matches any pattern
  • Added comprehensive JSDoc documentation with examples
  • Updated push_repo_memory.cjs to import from the helper module
  • Module is automatically copied by setup.sh (copies all .cjs files)
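
For orientation, here is a minimal sketch of what a module with these three functions might contain. The function names and the `*` / `**` conversion follow this PR; the bodies are reconstructed from the description, so details such as returning compiled RegExp objects from `parseGlobPatterns` and treating an empty filter as matching nothing are assumptions.

```javascript
// Sketch of a glob_pattern_helpers-style module. Names are from the PR;
// the bodies below are assumptions reconstructed from its description.

/** Convert a glob pattern to an anchored RegExp. */
function globPatternToRegex(pattern) {
  const regexPattern = pattern
    .replace(/\\/g, "\\\\") // escape backslashes first so later escapes survive
    .replace(/\./g, "\\.") // dots match literally
    .replace(/\*\*/g, "<!DOUBLESTAR>") // park ** so the single-* rule skips it
    .replace(/\*/g, "[^/]*") // * matches within one path segment
    .replace(/<!DOUBLESTAR>/g, ".*"); // ** matches across slashes
  // Other regex metacharacters are left unescaped in this sketch.
  return new RegExp(`^${regexPattern}$`);
}

/** Parse a whitespace-separated filter string into compiled patterns. */
function parseGlobPatterns(fileGlobFilter) {
  return fileGlobFilter.trim().split(/\s+/).filter(Boolean).map(globPatternToRegex);
}

/** True when the path matches at least one pattern (none if the filter is empty). */
function matchesGlobPattern(filePath, fileGlobFilter) {
  return parseGlobPatterns(fileGlobFilter).some(regex => regex.test(filePath));
}

module.exports = { globPatternToRegex, parseGlobPatterns, matchesGlobPattern };

console.log(matchesGlobPattern("history.jsonl", "*.json *.jsonl *.csv *.md"));
```

Keeping the conversion in its own function is what lets the tests check the generated regex separately from the file-scanning code.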

Test coverage

  • Added 5 test cases for debug logging behavior
  • Added 37 comprehensive test cases in glob_pattern_helpers.test.cjs:
    • Basic pattern matching (exact filenames, *, **)
    • Special character escaping (dots, backslashes)
    • Real-world patterns (*.jsonl, metrics/**, campaign-specific patterns)
    • Edge cases (empty patterns, complex nested patterns, case sensitivity)
    • Security tests (ReDoS prevention, proper escaping)
    • Helper function tests (parseGlobPatterns, matchesGlobPattern)
    • Integration tests with push_repo_memory use cases
  • All 87 tests passing (50 push_repo_memory + 37 glob_pattern_helpers) ✓
  • Validated root-level file matching (daily-code-metrics scenario)
  • Verified error message formatting includes actionable details
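
A few of the listed categories can be expressed as a small table-driven check. This is a sketch using Node's built-in `assert` module, not the contents of glob_pattern_helpers.test.cjs; the test runner used in the repository is not shown in this PR, and the conversion function is re-declared here (following the PR's description) only to keep the example self-contained.

```javascript
const { strictEqual } = require("node:assert");

// Same conversion as described in the PR, repeated so this example runs alone.
function globPatternToRegex(pattern) {
  const regexPattern = pattern
    .replace(/\\/g, "\\\\")
    .replace(/\./g, "\\.")
    .replace(/\*\*/g, "<!DOUBLESTAR>")
    .replace(/\*/g, "[^/]*")
    .replace(/<!DOUBLESTAR>/g, ".*");
  return new RegExp(`^${regexPattern}$`);
}

// [pattern, file path, expected match]; the cases are illustrative.
const cases = [
  ["*.jsonl", "history.jsonl", true], // root-level file
  ["*.jsonl", "logs/history.jsonl", false], // * does not cross a slash
  ["*.json", "history.jsonl", false], // pattern is anchored at the end
  ["metrics/**", "metrics/2025/12/daily.json", true], // ** crosses slashes
  ["*.md", "README.MD", false], // matching is case sensitive
  ["*.md", "READMExmd", false], // the dot is literal, not a wildcard
];

for (const [pattern, file, expected] of cases) {
  strictEqual(globPatternToRegex(pattern).test(file), expected, `${pattern} vs ${file}`);
}
console.log(`${cases.length} cases passed`);
```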

Example output

When a file fails validation, the error now shows:

File does not match allowed patterns: history.jsonl
Allowed patterns: *.json *.jsonl *.csv *.md
Pattern test results:
  *.json -> regex: ^[^/]*\.json$ -> NO MATCH
  *.jsonl -> regex: ^[^/]*\.jsonl$ -> MATCH

Debug logs for each file:

Testing file: history.jsonl
  Pattern 2: "*.jsonl" -> ^[^/]*\.jsonl$ -> ✓ MATCH
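
The error text shown above could be assembled along the following lines. The output layout mirrors the example in this PR; the function name `formatPatternError` and the `{ pattern, regex }` input shape are hypothetical.

```javascript
// Hypothetical formatter. Only the output layout is taken from the PR example.
function formatPatternError(filePath, compiled) {
  const lines = [
    `File does not match allowed patterns: ${filePath}`,
    `Allowed patterns: ${compiled.map(entry => entry.pattern).join(" ")}`,
    "Pattern test results:",
  ];
  for (const { pattern, regex } of compiled) {
    const verdict = regex.test(filePath) ? "MATCH" : "NO MATCH";
    lines.push(`  ${pattern} -> regex: ${regex.source} -> ${verdict}`);
  }
  return lines.join("\n");
}

// Regex literals stand in for the output of the glob-to-regex conversion.
const compiled = [
  { pattern: "*.json", regex: /^[^/]*\.json$/ },
  { pattern: "*.csv", regex: /^[^/]*\.csv$/ },
];
console.log(formatPatternError("notes.txt", compiled));
```

Printing the compiled regex next to each glob lets a reader confirm from the job log alone whether the pattern or the file path is at fault.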

With these logs in place, the next workflow run should reveal the root cause.

Original prompt

Reference: https://github.com/githubnext/gh-aw/actions/runs/20610342090/job/59193817350#step:7:1

Investigate why the file history.jsonl did not match the glob filter. Add tests and logging.


- Add debug logging to show file glob filter configuration
- Add detailed pattern matching logs for each file
- Add enhanced error messages with regex patterns and test results
- Add new tests for debug logging functionality
- Add tests for root-level file matching (daily-code-metrics use case)

Co-authored-by: pelikhan <[email protected]>
.replace(/\*\*/g, "<!DOUBLESTAR>") // Temporarily replace **
.replace(/\*/g, "[^/]*") // Single * matches non-slash chars
.replace(/<!DOUBLESTAR>/g, ".*"); // ** matches everything including /
return new RegExp(`^${regexPattern}$`);

@copilot move pattern to regex helper to separate function and add tests to validate behavior

Done in commit 1e498c4. Extracted globPatternToRegex() as a standalone function with full JSDoc documentation and added 20 comprehensive test cases covering:

  • Basic patterns (exact, *, **)
  • Special character escaping
  • Real-world scenarios (*.jsonl, metrics/**, campaigns)
  • Edge cases (empty, complex nested patterns)
  • Regex output validation

All 50 tests passing ✓

- Log FILE_GLOB_FILTER environment variable value and length
- Log MEMORY_ID, MAX_FILE_SIZE, MAX_FILE_COUNT for debugging
- Add summary of files found after scanning
- Show first 10 files or all files if <= 10 found

Co-authored-by: pelikhan <[email protected]>
Copilot AI changed the title from "[WIP] Investigate file history.jsonl glob filter mismatch" to "Add debug logging for repo-memory glob filter validation failures" Dec 31, 2025
Copilot AI requested a review from pelikhan December 31, 2025 02:43
- Move glob-to-regex conversion logic to standalone function
- Export globPatternToRegex for testing and reusability
- Add 20 new test cases covering:
  - Basic pattern matching (exact, *, **)
  - Special character escaping (dots, backslashes)
  - Real-world patterns (*.jsonl, metrics/**, campaigns)
  - Edge cases (empty, wildcards only, complex nested)
  - Regex output format validation
- All 50 tests passing

Co-authored-by: pelikhan <[email protected]>
Copilot AI changed the title from "Add debug logging for repo-memory glob filter validation failures" to "Add debug logging and extract glob pattern helper for repo-memory validation" Dec 31, 2025
* - . is escaped to match literal dots
* - \ is escaped properly
*/
function globPatternToRegex(pattern) {

@copilot move to separate file, add more tests, update setup.sh

Done in commit 9be23c5. Created separate glob_pattern_helpers.cjs module with:

  • 3 exported functions: globPatternToRegex(), parseGlobPatterns(), matchesGlobPattern()
  • 37 comprehensive tests covering patterns, escaping, edge cases, security, and integration
  • Updated push_repo_memory.cjs to import from the helper
  • All 87 tests passing ✓

setup.sh already copies all .cjs files, so the new module is available at runtime without any changes to setup.sh.

- Created glob_pattern_helpers.cjs with 3 exported functions:
  - globPatternToRegex: Convert glob pattern to RegExp
  - parseGlobPatterns: Parse space-separated patterns
  - matchesGlobPattern: Check if file matches any pattern
- Added 37 comprehensive tests in glob_pattern_helpers.test.cjs:
  - Basic pattern matching (exact, *, **)
  - Special character escaping
  - Real-world patterns (daily-code-metrics, campaigns)
  - Edge cases and security tests
  - Integration tests with push_repo_memory
- Updated push_repo_memory.cjs to import from helper module
- Updated push_repo_memory.test.cjs to import from helper module
- All 87 tests passing (50 push_repo_memory + 37 glob_pattern_helpers)

Co-authored-by: pelikhan <[email protected]>
Copilot AI changed the title from "Add debug logging and extract glob pattern helper for repo-memory validation" to "Add debug logging and extract glob pattern helper module for repo-memory validation" Dec 31, 2025
Copilot AI requested a review from pelikhan December 31, 2025 03:14
@pelikhan pelikhan marked this pull request as ready for review December 31, 2025 03:27
@pelikhan

@copilot Lint js

Copilot AI commented Dec 31, 2025

All JavaScript files pass linting ✓

  • make fmt-cjs: All files properly formatted (no changes needed)
  • make lint-cjs: All files pass Prettier checks
  • All 87 tests passing (50 push_repo_memory + 37 glob_pattern_helpers)

@pelikhan pelikhan merged commit d9ae817 into main Dec 31, 2025
97 of 99 checks passed
@pelikhan pelikhan deleted the copilot/investigate-file-history-glob-issue branch December 31, 2025 03:34