Fix Job arrays exceeding queueSize (#5920) #6345

Merged: pditommaso merged 1 commit into master from fix-job-arrays-queue-size-5920 on Aug 14, 2025

pditommaso (Member) commented on Aug 14, 2025

Summary

Fixes issue where job arrays bypassed executor.queueSize limits, causing resource overallocation. Previously, a job array would count as only 1 task against the queue limit regardless of its actual array size.

Before: with process.array = 30 and executor.queueSize = 40 → up to 70 concurrent tasks
After: with process.array = 30 and executor.queueSize = 40 → at most 40 concurrent tasks (the queue limit is respected)

Root Cause

The original canSubmit() method only checked runningQueue.size() < capacity, treating job arrays as single tasks. This allowed arrays to exceed intended resource limits.
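
To make the root cause concrete, here is a minimal Python sketch of an entry-counting check. It is not the actual Groovy code: the function name `can_submit_old` and the list-of-sizes model of the running queue are illustrative assumptions, and any other conditions the real method evaluates are omitted.

```python
def can_submit_old(running_queue, capacity):
    """Entry-counting check: every queue entry counts as one task.

    A capacity of 0 is treated as unlimited, matching executor.queueSize = 0.
    """
    return len(running_queue) < capacity if capacity > 0 else True


# Hypothetical queue: each number is how many tasks an entry really represents.
# Ten regular tasks plus one 30-task job array is 11 entries but 40 tasks.
running_queue = [1] * 10 + [30]

entries = len(running_queue)        # what the entry-counting check sees
actual_tasks = sum(running_queue)   # what is really running

# The queue already holds 40 tasks, yet the check still admits more work.
admitted = can_submit_old(running_queue, capacity=40)
```

In this model the check compares 11 entries against a limit of 40, so it keeps admitting submissions even though the limit has already been reached in terms of real tasks.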

Solution

Core Changes

  • Queue Slot Accounting: Job arrays now consume slots equal to their array size
  • Validation: Throws IllegalArgumentException if array size exceeds total queue capacity
  • Clean Architecture: Extracted complex logic into checkQueueCapacity() helper while preserving original ternary structure

Implementation Details

The checkQueueCapacity() method:

  1. Calculates task slots (1 for regular tasks, array size for job arrays)
  2. Validates array size doesn't exceed total capacity
  3. Checks if remaining capacity can accommodate the task
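
The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Groovy implementation: the function signatures, the `used_slots` argument (how occupied slots are tallied is assumed here), and the use of `ValueError` as a stand-in for `IllegalArgumentException` are all hypothetical.

```python
def task_slots(array_size=None):
    """Step 1: a regular task takes 1 slot, a job array takes its array size."""
    return 1 if array_size is None else array_size


def check_queue_capacity(used_slots, capacity, array_size=None):
    """Steps 2 and 3 for a bounded queue (capacity > 0)."""
    slots = task_slots(array_size)
    # Step 2: an array larger than the whole queue can never be scheduled.
    if slots > capacity:
        raise ValueError(
            f"array size {slots} exceeds queue capacity {capacity}"
        )
    # Step 3: admit only if the remaining capacity covers the requested slots.
    return used_slots + slots <= capacity


def can_submit(used_slots, capacity, array_size=None):
    """Ternary wrapper: a capacity of 0 means unlimited, so always admit."""
    return (
        check_queue_capacity(used_slots, capacity, array_size)
        if capacity > 0
        else True
    )
```

Under these assumptions, with a capacity of 40 and 10 slots in use, a 30-task array fits exactly; with 11 slots in use it has to wait; and a 50-task array is rejected outright with an exception.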

Test Coverage

Added comprehensive tests using Spock's parameterized testing with 12 new test cases covering:

  • Exception validation for oversized arrays (3 test cases)
  • Queue capacity accounting with various scenarios (9 test cases)
  • Inheritance verification for ParallelPollingMonitor
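
The table-driven style described above might look like the following Python sketch. The rows are hypothetical examples chosen for illustration and are not the 12 cases added in this PR; `check_queue_capacity` is a simplified stand-in for the Groovy helper, with `ValueError` in place of `IllegalArgumentException`.

```python
def check_queue_capacity(used_slots, capacity, slots):
    """Simplified stand-in for the helper under test (assumed semantics)."""
    if slots > capacity:
        raise ValueError("array size exceeds queue capacity")
    return used_slots + slots <= capacity


# Hypothetical rows: (used_slots, capacity, slots, expected_result)
ACCOUNTING_CASES = [
    (0, 40, 1, True),     # empty queue, regular task
    (39, 40, 1, True),    # last free slot
    (40, 40, 1, False),   # queue full
    (10, 40, 30, True),   # array fits exactly
    (11, 40, 30, False),  # array would overflow the queue
]

# Hypothetical rows that must raise: (used_slots, capacity, slots)
OVERSIZED_CASES = [(0, 40, 41), (0, 10, 30)]


def run_cases():
    """Run every row and return the ones that did not behave as expected."""
    failures = []
    for used, cap, slots, expected in ACCOUNTING_CASES:
        if check_queue_capacity(used, cap, slots) != expected:
            failures.append((used, cap, slots))
    for used, cap, slots in OVERSIZED_CASES:
        try:
            check_queue_capacity(used, cap, slots)
            failures.append((used, cap, slots))  # no exception: a failure
        except ValueError:
            pass
    return failures
```

Keeping the exception rows separate from the accounting rows mirrors the split between the two groups of test cases listed above.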

Backward Compatibility

Fully backward compatible - no breaking changes:

  • Regular tasks continue to work unchanged (consume 1 slot)
  • Unlimited capacity (executor.queueSize = 0) works as before
  • All existing configurations remain valid
  • Only affects job arrays whose size exceeds executor.queueSize, a misconfiguration that previously went undetected and now raises IllegalArgumentException

Related

## Problem
Job arrays bypassed executor.queueSize limits, causing resource overallocation.
When `process.array = 30` and `executor.queueSize = 40`, up to 70 concurrent
tasks could run (40 regular + 30 array tasks) instead of the intended 40.

## Solution
- Modified TaskPollingMonitor.canSubmit() to account for array size in capacity calculations
- Job arrays now consume slots equal to their array size (not just 1)
- Added validation that throws IllegalArgumentException if array size exceeds queue capacity
- Extracted complex logic into checkQueueCapacity() helper method to maintain clean ternary structure

## Changes
- **TaskPollingMonitor.groovy**: Added getTaskSlots() and checkQueueCapacity() helper methods
- **TaskPollingMonitorTest.groovy**: Added comprehensive parameterized tests using Spock `@Unroll`
- **ParallelPollingMonitorTest.groovy**: Added inheritance test to verify validation works in subclass

## Test Coverage
- Exception validation for oversized arrays (3 test cases)
- Queue capacity accounting with various scenarios (9 test cases)
- Edge cases: unlimited capacity, process constraints, task readiness
- Inheritance verification for ParallelPollingMonitor

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

netlify bot commented Aug 14, 2025

Deploy Preview for nextflow-docs-staging canceled.

| Name | Link |
|------|------|
| 🔨 Latest commit | 85d275a |
| 🔍 Latest deploy log | https://app.netlify.com/projects/nextflow-docs-staging/deploys/689e0ef858134c0008f1dbb3 |

pditommaso merged commit ea748de into master on Aug 14, 2025
22 checks passed
pditommaso deleted the fix-job-arrays-queue-size-5920 branch on August 14, 2025 17:31
Development

Successfully merging this pull request may close these issues.

Job Array with queueSize exceeds queueSize