Skip to content

Conversation

@robsyme
Copy link
Collaborator

@robsyme robsyme commented Dec 11, 2025

Summary

  • Transform leading **/ glob patterns to {,**/} in FileHelper.visitFiles() so they match files at the root level (zero directories) as well as in subdirectories

Test plan

  • Added test case should match files at root and subdirectories with double-star glob
  • Verified existing visitFiles tests still pass
  • Run full test suite

Closes #6653

@netlify
Copy link

netlify bot commented Dec 11, 2025

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 2ab4883
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/693b18a0583f9a000840c93e

@robsyme
Copy link
Collaborator Author

robsyme commented Dec 11, 2025

@claude review please

@claude
Copy link

claude bot commented Dec 11, 2025

Claude finished @robsyme's task —— View job


PR Review: Fix leading **/ glob to match zero-or-more directories

Todo List:

  • Read and analyze the modified FileHelper.groovy
  • Review the test changes in FileHelperTest.groovy
  • Analyze the glob pattern transformation logic
  • Check test coverage and edge cases
  • Provide comprehensive review feedback

Review Summary

This PR successfully addresses issue #6653 by transforming leading **/ glob patterns to {,**/} to match files at both the root level (zero directories) and in subdirectories. The implementation is clean, well-tested, and follows Java glob expansion conventions.

Code Analysis

Core Change (FileHelper.groovy:865-867)

final adjustedPattern = filePattern.startsWith('**/')
    ? "{,**/}" + filePattern.substring(3)
    : filePattern

✅ Strengths:

  • Clear and correct logic: Transforms **/foo.bin{,**/}foo.bin which expands to match both foo.bin and **/foo.bin
  • Minimal scope: Only affects patterns that start with **/, preserving existing behavior for other patterns
  • Efficient: Simple string manipulation with no regex overhead
  • Well-documented: Clear comment explains the Java glob behavior being addressed

Test Coverage (FileHelperTest.groovy:627-667)

✅ Comprehensive test case:

  • Tests files at root level: data.bin
  • Tests files in subdirectory: subdir/data.bin
  • Tests files in nested subdirectory: subdir/nested/data.bin
  • Tests complex patterns: **/InterOp/*.bin matching at multiple levels
  • Verifies expected behavior with proper assertions

Technical Assessment

Pattern Transformation Logic

The transformation **/pattern{,**/}pattern is mathematically sound:

  • {,**/} expands to two alternatives: empty string and **/
  • This creates the union of matches: files at root + files in subdirectories
  • Follows Java glob brace expansion semantics correctly

Edge Cases Considered

  • ✅ Only transforms patterns starting with **/
  • ✅ Preserves patterns like foo/**/bar (no leading **)
  • ✅ Handles complex nested patterns correctly
  • ✅ Maintains compatibility with existing maxDepth logic

Performance Impact

  • Minimal: Single string check and substring operation
  • No regression: Only affects the specific problematic patterns

Potential Concerns

  1. Pattern complexity: The transformation creates brace expansion which could theoretically impact some file systems, but this is standard Java glob syntax
  2. Documentation: Consider adding a note about this behavior in relevant documentation
  3. Backwards compatibility: This change is additive - existing working patterns continue to work, and previously broken patterns now work correctly

Recommendations

✅ APPROVE - This is a well-implemented fix that:

  • Solves the reported issue correctly
  • Has comprehensive test coverage
  • Uses efficient, readable code
  • Maintains backward compatibility
  • Follows Java glob conventions properly

The implementation is production-ready and addresses the core issue without introducing risks to existing functionality.

Java's glob pattern `**/foo` requires at least one directory component
before `foo`, but users expect it to also match `foo` at the root level.

This change transforms leading `**/` patterns to `{,**/}` which means
"either nothing OR any number of directories", correctly matching:
- `foo` (zero directories)
- `bar/foo` (one directory)
- `bar/baz/foo` (multiple directories)

This fixes issues where `files("path/**/subdir/*.txt")` would fail to
find files in `path/subdir/` while finding files in `path/x/subdir/`.

Fixes #5948

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Signed-off-by: Rob Syme <[email protected]>
@robsyme robsyme force-pushed the fix-double-star-glob branch from b2a0ab4 to 2ab4883 Compare December 11, 2025 19:16
robsyme added a commit to nf-core/demultiplex that referenced this pull request Dec 11, 2025
Extract InterOp files directly from the input channel rather than
emitting them from BCLCONVERT/BCL2FASTQ process outputs. This avoids
passing files through a process unchanged, allowing downstream consumers
to access InterOp data immediately without waiting for demultiplexing
to complete.

This also resolves an issue where Fusion could not collect InterOp files
nested within staged input directories (nextflow-io/nextflow#5948).

The {,**/} glob pattern is used instead of **/ because Java's glob
requires **/ to match at least one directory, but we need to match
InterOp directories at both the root level and in subdirectories.
Once nextflow-io/nextflow#6654 is merged, this can be simplified to **/.
robsyme added a commit to nf-core/modules that referenced this pull request Dec 11, 2025
…ficiency

Extract InterOp files directly from the input channel rather than
emitting them from BCLCONVERT/BCL2FASTQ process outputs. This avoids
passing files through a process unchanged, allowing downstream consumers
to access InterOp data immediately without waiting for demultiplexing
to complete.

This also resolves an issue where Fusion could not collect InterOp files
nested within staged input directories (nextflow-io/nextflow#5948).

The {,**/} glob pattern is used instead of **/ because Java's glob
requires **/ to match at least one directory, but we need to match
InterOp directories at both the root level and in subdirectories.
Once nextflow-io/nextflow#6654 is merged, this can be simplified to **/.

Changes:
- Remove interop output from BCLCONVERT and BCL2FASTQ modules
- Remove redundant cp commands that copied InterOp files
- Update meta.yml files to reflect removed outputs
- Extract InterOp files from input channel in BCL_DEMULTIPLEX subworkflow

Signed-off-by: Rob Syme <[email protected]>
@pditommaso
Copy link
Member

Closing since stating #6653 is not planned anymore

@pditommaso pditommaso closed this Dec 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Leading **/ glob pattern does not match files at root level

3 participants