@ofir-frd
Benchmark PR PrefectHQ#19489

Type: Clean (correct implementation)

Original PR Title: Add client-side configuration for deployment concurrency grace period
Original PR Description:

Overview

Adds client-side configuration for deployment concurrency grace periods, allowing users to control how long infrastructure has to start before concurrency slots are released. This addresses #19410 and brings OSS feature parity with Prefect Cloud.

Previously, the grace period was hardcoded server-side (300s in OSS). Now users can configure it per-deployment via:

  • prefect.yaml deployments
  • flow.deploy() / flow.serve() with ConcurrencyLimitConfig
  • Python deployment APIs

Changes

Schema & Models

  • Added grace_period_seconds field to ConcurrencyOptions (server schema) and ConcurrencyLimitConfig (client schema)
  • Field is optional with default of 600 seconds (10 minutes)
  • Validated range: 60-86400 seconds (1 minute to 1 day)
  • No database migration required - concurrency_options is already a JSON column
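The validation rules above can be sketched with a standard-library dataclass. This is only an illustration under stated assumptions: the actual schemas are Pydantic models, and the class name `GracePeriodOptions` and the constant names are hypothetical. Only the field name, the 600-second default, and the 60-86400 range come from this description.

```python
from dataclasses import dataclass

# Bounds and default are taken from the PR description.
# The constant names themselves are hypothetical.
MIN_GRACE_PERIOD_SECONDS = 60
MAX_GRACE_PERIOD_SECONDS = 86_400
DEFAULT_GRACE_PERIOD_SECONDS = 600


@dataclass(frozen=True)
class GracePeriodOptions:
    """Hypothetical stand-in for the Pydantic schemas; shows only the range check."""

    grace_period_seconds: int = DEFAULT_GRACE_PERIOD_SECONDS

    def __post_init__(self) -> None:
        value = self.grace_period_seconds
        # Both bounds are inclusive: 60 and 86400 are accepted.
        if not MIN_GRACE_PERIOD_SECONDS <= value <= MAX_GRACE_PERIOD_SECONDS:
            raise ValueError(
                "grace_period_seconds must be between "
                f"{MIN_GRACE_PERIOD_SECONDS} and {MAX_GRACE_PERIOD_SECONDS}, "
                f"got {value}"
            )
```

Under these assumptions, `GracePeriodOptions()` yields the 600-second default, while `GracePeriodOptions(59)` and `GracePeriodOptions(86_401)` raise `ValueError`.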

Orchestration

  • Updated SecureFlowConcurrencySlots policy to use configured grace period when creating concurrency leases
  • Handles concurrency_options as both dict and Pydantic model for backward compatibility
  • Falls back to server setting (initial_deployment_lease_duration, default 300s) when concurrency_options is None
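The fallback order described above can be sketched as a small helper. This is not the code in `SecureFlowConcurrencySlots`; the function name `resolve_grace_period_seconds` and its `server_default_seconds` parameter are hypothetical, and the sketch assumes that an unset field on an existing `concurrency_options` resolves to the 600-second model default.

```python
from typing import Any, Mapping, Optional

# Model default per the PR description; the constant name is hypothetical.
MODEL_DEFAULT_GRACE_PERIOD_SECONDS = 600


def resolve_grace_period_seconds(
    concurrency_options: Optional[Any],
    server_default_seconds: float = 300,
) -> float:
    """Pick the lease duration for a deployment's concurrency slot.

    Accepts either a plain dict or an object exposing a
    `grace_period_seconds` attribute (e.g. a Pydantic model instance).
    """
    # No options at all: fall back to the server-wide setting.
    if concurrency_options is None:
        return server_default_seconds

    # Options stored as raw JSON arrive as a dict; otherwise read the attribute.
    if isinstance(concurrency_options, Mapping):
        configured = concurrency_options.get("grace_period_seconds")
    else:
        configured = getattr(concurrency_options, "grace_period_seconds", None)

    # Options exist but the field is unset: use the model default.
    if configured is None:
        return MODEL_DEFAULT_GRACE_PERIOD_SECONDS
    return configured
```

Keeping the two fallbacks separate makes the 300s-versus-600s distinction explicit, which is the behavior the review points ask readers to verify.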

CLI & Deployment

  • Added grace_period_seconds to ConcurrencyLimitSpec model
  • Updated _run_single_deploy to extract and pass through grace period from YAML
  • Updated RunnerDeployment to map grace period from ConcurrencyLimitConfig to deployment payload
  • Fixed YAML serialization to exclude None values from concurrency_limit dict
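The `None`-filtering fix can be illustrated with a one-line dict comprehension. This is a sketch rather than the PR's actual serialization code; the helper name `drop_none_values` is hypothetical.

```python
from typing import Any, Dict


def drop_none_values(concurrency_limit: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the dict without keys whose value is None.

    Only None is removed; falsy values such as 0 or False are kept,
    so explicitly configured fields are not dropped by accident.
    """
    return {key: value for key, value in concurrency_limit.items() if value is not None}
```

For example, `drop_none_values({"limit": 1, "grace_period_seconds": None})` returns `{"limit": 1}`, so an unset grace period never appears in the written YAML.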

Documentation

  • Updated deployment concepts docs with grace period explanation and examples
  • Updated prefect.yaml reference with grace period field description

Test Coverage

Added 9 tests covering:

  • Boundary validation (5 tests): Min/max/default values for grace_period_seconds
  • YAML deployment mapping (2 tests): Verify grace period flows from YAML to deployment, and None values aren't serialized
  • Orchestration behavior (3 tests): Server setting fallback, model default, and custom grace period

All tests pass locally and in CI (49 of 51 checks passed; 2 were cancelled due to infrastructure issues).

Important Review Points

  1. Fallback behavior: When concurrency_options is None, the server setting (300s) is used. When concurrency_options exists but grace_period_seconds is not set, the model default (600s) is used. This difference is intentional but worth verifying.

  2. Type handling: Orchestration code handles concurrency_options as both dict and Pydantic model instance. This is necessary for backward compatibility but adds complexity.

  3. YAML serialization: Now filters out None values from concurrency_limit dict to avoid cluttering configs. Verify this doesn't affect other optional fields.

  4. Range validation: 60-86400 seconds (1 minute to 1 day). Confirm this range is appropriate for all use cases.


Link to Devin run: https://app.devin.ai/sessions/6a429e264eee42ffa97cffcdd6e0eada
Requested by: Nate Nowack (@zzstoatzz)

@github-actions github-actions bot added the docs label Dec 15, 2025