Fix dead config: scheduler.ignore_first_depends_on_past_by_default#68180
Open
kyungjunleeme wants to merge 4 commits into
…ignored

The `[scheduler] ignore_first_depends_on_past_by_default` option (added in 2.3.0 via apache#22491) became a dead config in Airflow 3: it is still declared in config.yml with default "True", but no code reads it. The Task SDK hardcoded `DEFAULT_IGNORE_FIRST_DEPENDS_ON_PAST = False`, so the regression that apache#22491 fixed came back: adding a new task to an existing DAG whose default_args set `depends_on_past=True` leaves the task in no-status forever (PrevDagrunDep: "previous task instance has not run yet") and the DAG run never completes.

Wire the default back to the config, matching how 2.10.5 read it (and how DEFAULT_RETRIES and friends in the same module still read conf). The value flows through OPERATOR_DEFAULTS into the serialization client_defaults, so both regular and mapped operators pick up the configured default on the scheduler side.
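To make the dead-config shape concrete, here is a minimal sketch that contrasts a hardcoded default with one read from configuration. It uses the standard-library `configparser` as a stand-in for Airflow's `conf` object. The helper names (`load_conf`, `hardcoded_default`, `wired_default`) are hypothetical and are not taken from the Airflow code base.

```python
import configparser

# Stand-in for the declaration in config.yml: the option exists with default "True".
DECLARED_DEFAULTS = {
    "scheduler": {"ignore_first_depends_on_past_by_default": "True"},
}


def load_conf(overrides=None):
    """Build a parser from the declared defaults plus optional user overrides."""
    parser = configparser.ConfigParser()
    parser.read_dict(DECLARED_DEFAULTS)
    if overrides:
        parser.read_dict(overrides)
    return parser


def hardcoded_default(conf):
    # Dead-config shape: the option is declared, but nothing reads it,
    # so the user's setting can never change the result.
    return False


def wired_default(conf):
    # Fixed shape: the default is read from the [scheduler] section.
    return conf.getboolean("scheduler", "ignore_first_depends_on_past_by_default")


if __name__ == "__main__":
    conf = load_conf()
    print("hardcoded:", hardcoded_default(conf))
    print("wired:", wired_default(conf))
```

With the hardcoded variant, the declared default of "True" has no effect. With the wired variant, both the declared default and a user override to "False" are honored.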
The OpenLineage task facet now reports ignore_first_depends_on_past=True on Airflow 3 (matching the AF2 expectations), since the scheduler config default is honored again.
- Add the new ignore_first_depends_on_past=True entry to the serialized DAG client_defaults ground truth (it now differs from the schema default).
- Derive the expected ignore_first_depends_on_past in the OpenLineage test_task_info_af3 from the operator instead of hardcoding, so it passes against both older 3.x cores (default False) and 3.2+ (default True).
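The reason a new `client_defaults` entry appears can be sketched with a simplified model: an entry is emitted only when the client-side default differs from the server-side schema default. This is an illustration under that assumption, not Airflow's serialization code. `SCHEMA_DEFAULTS` and `build_client_defaults` are hypothetical names.

```python
# Hypothetical server-side schema defaults (only the field discussed here).
SCHEMA_DEFAULTS = {"ignore_first_depends_on_past": False}


def build_client_defaults(operator_defaults, schema_defaults=SCHEMA_DEFAULTS):
    """Keep only entries whose client-side value differs from the schema default."""
    return {
        key: value
        for key, value in operator_defaults.items()
        if key in schema_defaults and schema_defaults[key] != value
    }


if __name__ == "__main__":
    # Config default True differs from the schema default, so an entry is emitted.
    print(build_client_defaults({"ignore_first_depends_on_past": True}))
    # Config set to False matches the schema default, so nothing is emitted.
    print(build_client_defaults({"ignore_first_depends_on_past": False}))
```

Under this model, setting the config to `False` produces no entry at all, which is why the schema default can stay `False` while the config default is `True`.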
a94f2c1 to 7bc1f96
`[scheduler] ignore_first_depends_on_past_by_default` became a dead config in Airflow 3: it is still declared in `config.yml` with default `"True"`, but no code reads it. The Task SDK hardcoded the default as `DEFAULT_IGNORE_FIRST_DEPENDS_ON_PAST = False`, so the regression that #22491 fixed in 2.3.0 came back. Adding a new task to an existing DAG whose `default_args` set `depends_on_past=True` leaves the new task in no-status forever (PrevDagrunDep: "previous task instance has not run yet") and the DAG run never leaves the running state. A `grep` for `ignore_first_depends_on_past_by_default` across `main` matches only the `config.yml` declaration, with zero readers. (2.10.5 read it at `airflow/models/abstractoperator.py`.)

Fix

Wire the default back to the config, the same way 2.10.5 did and the way `DEFAULT_RETRIES` and friends in the same module still do. The value flows through `OPERATOR_DEFAULTS` into the serialization `client_defaults`, so both regular and mapped operators pick up the configured default on the scheduler side. The server-side schema default stays `False`, so setting the config to `False` continues to disable the behavior (no `client_defaults` entry is emitted in that case).

Tests

- task-sdk: default is `True`, explicit override wins, and the default follows the config (reload guard against the dead-config regression).
- airflow-core `PrevDagrunDep`: a new task with `depends_on_past=True` passes its first run using only the config default.
- Updated `test_dag_serialization` (operator defaults and the serialized-DAG `client_defaults` ground truth, which now carries `ignore_first_depends_on_past=True`) and `test_mappedoperator`.
- OpenLineage: the task facet reports `ignore_first_depends_on_past=True` on Airflow 3.2+. `test_task_info_af3` derives the expected value from the operator so it stays correct across core versions (the `Compat 3.0.x` provider jobs run it against older cores where the default is still `False`); the AF3 `expected_events` system fixtures are updated to match.

related: #17585, #22491
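For readers unfamiliar with the dependency check involved, the scenario can be modeled in a few lines. This is a simplified sketch of the previous-run dependency, assuming it only considers the flags and states shown here. `TaskModel` and `prev_dagrun_dep_passes` are hypothetical names, and the real `PrevDagrunDep` handles more cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskModel:
    depends_on_past: bool = False
    ignore_first_depends_on_past: bool = False


def prev_dagrun_dep_passes(
    task: TaskModel,
    previous_ti_state: Optional[str],
    task_has_run_before: bool,
) -> bool:
    """Return True if the simplified previous-run dependency is satisfied.

    previous_ti_state is None when the previous DAG run has no task instance
    for this task, e.g. because the task was just added to the DAG.
    """
    if not task.depends_on_past:
        return True
    if task.ignore_first_depends_on_past and not task_has_run_before:
        # First-ever instance of this task: skip the depends_on_past check.
        return True
    if previous_ti_state is None:
        # Corresponds to "previous task instance has not run yet".
        return False
    return previous_ti_state in {"success", "skipped"}


if __name__ == "__main__":
    stuck = TaskModel(depends_on_past=True, ignore_first_depends_on_past=False)
    fixed = TaskModel(depends_on_past=True, ignore_first_depends_on_past=True)
    print(prev_dagrun_dep_passes(stuck, None, task_has_run_before=False))
    print(prev_dagrun_dep_passes(fixed, None, task_has_run_before=False))
```

With the hardcoded `False` default, a newly added task never satisfies the check because there is no earlier instance to succeed. With the config default of `True`, its first instance is allowed to run.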
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (Opus 4.8) following the guidelines