Releases: apache/airflow
Apache Airflow 2.7.2
Significant Changes
No significant changes
Bug Fixes
- Check if the lower of provided values are sensitives in config endpoint (#34712)
- Add support for ZoneInfo and generic UTC to fix datetime serialization (#34683, #34804)
- Fix AttributeError: 'Select' object has no attribute 'count' during the airflow db migrate command (#34348)
- Make dry run optional for patch task instance (#34568)
- Fix non deterministic datetime deserialization (#34492)
- Use iterative loop to look for mapped parent (#34622)
- Fix is_parent_mapped value by checking if any of the parent taskgroup is mapped (#34587)
- Avoid top-level airflow import to avoid circular dependency (#34586)
- Add more exemptions to lengthy metric list (#34531)
- Fix dag warning endpoint permissions (#34355)
- Fix task instance access issue in the batch endpoint (#34315)
- Correcting wrong time showing in grid view (#34179)
- Fix www cluster_activity view not loading due to standaloneDagProcessor templating (#34274)
- Set loglevel=DEBUG in 'Not syncing DAG-level permissions' (#34268)
- Make param validation consistent for DAG validation and triggering (#34248)
- Ensure details panel is shown when any tab is selected (#34136)
- Fix issues related to access_control={} (#34114)
- Fix not found ab_user table in the CLI session (#34120)
- Fix FAB-related logging format interpolation (#34139)
- Fix query bug in next_run_datasets_summary endpoint (#34143)
- Fix for TaskGroup toggles for duplicated labels (#34072)
- Fix the required permissions to clear a TI from the UI (#34123)
- Reuse _run_task_session in mapped render_template_fields (#33309)
- Fix scheduler logic to plan new dag runs by ignoring manual runs (#34027)
- Add missing audit logs for Flask actions add, edit and delete (#34090)
- Hide Irrelevant Dag Processor from Cluster Activity Page (#33611)
- Remove infinite animation for pinwheel, spin for 1.5s (#34020)
- Restore rendering of provider configuration with version_added (#34011)
Doc Only Changes
- Clarify audit log permissions (#34815)
- Add explanation for Audit log users (#34814)
- Import AUTH_REMOTE_USER from FAB in WSGI middleware example (#34721)
- Add information about drop support MsSQL as DB Backend in the future (#34375)
- Document how to use the system's timezone database (#34667)
- Clarify what landing time means in doc (#34608)
- Fix screenshot in dynamic task mapping docs (#34566)
- Fix class reference in Public Interface documentation (#34454)
- Clarify var.value.get and var.json.get usage (#34411)
- Schedule default value description (#34291)
- Docs for triggered_dataset_event (#34410)
- Add DagRun events (#34328)
- Provide tabular overview about trigger form param types (#34285)
- Add link to Amazon Provider Configuration in Core documentation (#34305)
- Add "security infrastructure" paragraph to security model (#34301)
- Change links to SQLAlchemy 1.4 (#34288)
- Add SBOM entry in security documentation (#34261)
- Added more example code for XCom push and pull (#34016)
- Add state utils to Public Airflow Interface (#34059)
- Replace markdown style link with rst style link (#33990)
- Fix broken link to the "UPDATING.md" file (#33583)
Misc/Internal
- Update min-sqlalchemy version to account for latest features used (#34293)
- Fix SesssionExemptMixin spelling (#34696)
- Restrict astroid version < 3 (#34658)
- Fail dag test if defer without triggerer (#34619)
- Fix connections exported output (#34640)
- Don't run isort when creating new alembic migrations (#34636)
- Deprecate numeric type python version in PythonVirtualEnvOperator (#34359)
- Refactor os.path.splitext to Path.* (#34352, #33669)
- Replace = by is for type comparison (#33983)
- Refactor integer division (#34180)
- Refactor: Simplify comparisons (#34181)
- Refactor: Simplify string generation (#34118)
- Replace unnecessary dict comprehension with dict() in core (#33858)
- Change "not all" to "any" for ease of readability (#34259)
- Replace assert by if...raise in code (#34250, #34249)
- Move default timezone to except block (#34245)
- Combine similar if logic in core (#33988)
- Refactor: Consolidate import and usage of random (#34108)
- Consolidate importing of os.path.* (#34060)
- Replace sequence concatenation by unpacking in Airflow core (#33934)
- Refactor unneeded 'continue' jumps around the repo (#33849, #33845, #33846, #33848, #33839, #33844, #33836, #33842)
- Remove [project] section from pyproject.toml (#34014)
- Move the try outside the loop when this is possible in Airflow core (#33975)
- Replace loop by any when looking for a positive value in core (#33985)
- Do not create lists we don't need (#33519)
- Remove useless string join from core (#33969)
- Add TCH001 and TCH002 rules to pre-commit to detect and move type checking modules (#33865)
- Add cancel_trigger_ids to to_cancel dequeue in batch (#33944)
- Avoid creating unnecessary list when parsing stats datadog tags (#33943)
- Replace dict.items by dict.values when key is not used in core (#33940)
- Replace lambdas with comprehensions (#33745)
- Improve modules import in Airflow core by moving some of them into a type-checking block (#33755)
- Refactor: remove unused state - SHUTDOWN (#33746, #34063, #33893)
- Refactor: Use in-place .sort() (#33743)
- Use literal dict instead of calling dict() in Airflow core (#33762)
- remove unnecessary map and rewrite it using list in Airflow core (#33764)
- Replace lambda by a def method in Airflow core (#33758)
- Replace type func by isinstance in fab_security manager (#33760)
- Replace single quotes by double quotes in all Airflow modules (#33766)
- Merge multiple isinstance calls for the same object in a single call (#33767)
- Use a single statement with multiple contexts instead of nested statements in core (#33769)
- Refactor: Use f-strings (#33734, #33455)
- Refactor: Use random.choices (#33631)
- Use str.splitlines() to split lines (#33592)
- Refactor: Remove useless str() calls (#33629)
- Refactor: Improve detection of duplicates and list sorting (#33675)
- Simplify conditions on len() (#33454)
Apache Airflow Helm Chart 1.11.0
Significant Changes
Support naming customization on helm chart resources, some resources may be renamed during upgrade (#31066)
This is a new opt-in switch useStandardNaming, for backwards compatibility, to leverage the standard naming convention, which allows full use of fullnameOverride and nameOverride in all resources.
The following resources will be renamed using default of useStandardNaming=false when upgrading to 1.11.0 or a higher version.
- ConfigMap {release}-airflow-config to {release}-config
- Secret {release}-airflow-metadata to {release}-metadata
- Secret {release}-airflow-result-backend to {release}-result-backend
- Ingress {release}-airflow-ingress to {release}-ingress
For existing installations, all your resources will be recreated with a new name and Helm will delete the previous resources.
This won't delete existing PVCs for logs used by StatefulSet/Deployments, but it will recreate them with brand new PVCs.
If you do want to preserve logs history you'll need to manually copy the data of these volumes into the new volumes after
deployment. Depending on what storage backend/class you're using this procedure may vary. If you don't mind starting
with fresh logs/redis volumes, you can just delete the old PVCs that will be named, for example:
kubectl delete pvc -n airflow logs-gta-triggerer-0
kubectl delete pvc -n airflow logs-gta-worker-0
kubectl delete pvc -n airflow redis-db-gta-redis-0
If you do not change useStandardNaming or fullnameOverride after upgrade, you can proceed as usual and no unexpected behaviours will be presented.
bitnami/postgresql subchart updated to 12.10.0 (#33747)
The PostgreSQL subchart that is used with the Chart is now 12.10.0, previously it was 12.1.9.
Default git-sync image is updated to 3.6.9 (#33748)
The default git-sync image that is used with the Chart is now 3.6.9, previously it was 3.6.3.
Default Airflow image is updated to 2.7.1 (#34186)
The default Airflow image that is used with the Chart is now 2.7.1, previously it was 2.6.2.
New Features
- Add support for scheduler name to PODs templates (#33843)
- Support KEDA scaling for triggerer (#32302)
- Add support for container lifecycle hooks (#32349, #34677)
- Support naming customization on helm chart resources (#31066)
- Adding startupProbe to scheduler and webserver (#33107)
- Allow disabling token mounts using automountServiceAccountToken (#32808)
- Add support for defining custom priority classes (#31615)
- Add support for runtimeClassName (#31868)
- Add support for custom query in workers KEDA trigger (#32308)
Improvements
- Add containerSecurityContext for cleanup job (#34351)
- Add existing secret support for PGBouncer metrics exporter (#32724)
- Allow templating in webserver ingress hostnames (#33142)
- Allow templating in flower ingress hostnames (#33363)
- Add configmap annotations to StatsD and webserver (#33340)
- Add pod security context to PgBouncer (#32662)
- Add an option to use a direct DB connection in KEDA when PgBouncer is enabled (#32608)
- Allow templating in cleanup.schedule (#32570)
- Template dag processor waitformigration containers extraVolumeMounts (#32100)
- Ability to inject extra containers into PgBouncer (#33686)
- Allowing ability to add custom env into PgBouncer container (#33438)
- Add support for env variables in the StatsD container (#33175)
Bug Fixes
- Add airflow db migrate command to database migration job (#34178)
- Pass workers.terminationGracePeriodSeconds into KubeExecutor pod template (#33514)
- CeleryExecutor namespace depends on Airflow version (#32753)
- Fix dag processor not including webserver config volume (#32644)
- Dag processor liveness probe include --local and --job-type args (#32426)
- Revising flower_url_prefix considering default value (#33134)
Doc only changes
- Add more explicit "embedded postgres" exclusion for production (#33034)
- Update git-sync description (#32181)
Misc
- Default Airflow version to 2.7.1 (#34186)
- Update PostgreSQL subchart to 12.10.0 (#33747)
- Update git-sync to 3.6.9 (#33748)
- Remove unnecessary loops to load env from helm values (#33506)
- Replace common.tplvalues.render with tpl in ingress template files (#33384)
- Remove K8S 1.23 support (#32899)
- Fix chart named template comments (#32681)
- Remove outdated comment from chart values in the workers KEDA conf section (#32300)
- Remove unnecessary or function in template files (#34415)
Apache Airflow 2.7.1
Significant Changes
CronTriggerTimetable is now less aggressive when trying to skip a run (#33404)
When setting catchup=False, CronTriggerTimetable no longer skips a run if the scheduler does not query the timetable immediately after the previous run has been triggered.
This should not affect scheduling in most cases, but can change the behaviour if a DAG is paused-unpaused to manually skip a run. Previously, the timetable (with catchup=False) would only start a run after a DAG is unpaused, but with this change, the scheduler would try to look a little bit back to schedule the previous run that covers a part of the period when the DAG was paused. This means you will need to keep a DAG paused longer (namely, for the entire cron period to pass) to really skip a run.
Note that this is also the behaviour exhibited by various other cron-based scheduling tools, such as anacron.
conf.set() becomes case insensitive to match conf.get() behavior (#33452)
Also, conf.get() will now break if used with non-string parameters.
conf.set(section, key, value) used to be case sensitive, i.e. conf.set("SECTION", "KEY", value) and conf.set("section", "key", value) were stored as two distinct configurations.
This was inconsistent with the behavior of conf.get(section, key), which was always converting the section and key to lower case.
As a result, configuration options set with upper case characters in the section or key were unreachable.
That's why we are now converting section and key to lower case in conf.set too.
We also slightly changed the behavior of conf.get(). It used to allow objects that are not strings in the section or key.
Doing this will now result in an exception. For instance, conf.get("section", 123) needs to be replaced with conf.get("section", "123").
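The behaviour described above can be illustrated with a small standalone sketch. It does not import Airflow: CaseInsensitiveConf is a hypothetical toy class, and the use of TypeError is an assumption, since these notes do not say which exception Airflow raises.

```python
class CaseInsensitiveConf:
    """Toy stand-in for a config store; NOT Airflow's actual implementation."""

    def __init__(self):
        self._values = {}

    @staticmethod
    def _normalize(section, key):
        # Reject non-string section/key (the exception type is an assumption).
        if not isinstance(section, str) or not isinstance(key, str):
            raise TypeError(
                f"section and key must be strings, got {section!r} and {key!r}"
            )
        # Lower-case both parts so that set() and get() agree on the lookup key.
        return section.lower(), key.lower()

    def set(self, section, key, value):
        self._values[self._normalize(section, key)] = value

    def get(self, section, key):
        return self._values[self._normalize(section, key)]


conf = CaseInsensitiveConf()
conf.set("SECTION", "KEY", "value")
print(conf.get("section", "key"))  # value
```

With this normalization, a value stored under upper-case names stays reachable through a lower-case lookup, which is the inconsistency the change removes.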
Bug Fixes
- Ensure that tasks wait for running indirect setup (#33903)
- Respect "soft_fail" for core async sensors (#33403)
- Differentiate 0 and unset as a default param values (#33965)
- Raise 404 from Variable PATCH API if variable is not found (#33885)
- Fix MappedTaskGroup tasks not respecting upstream dependency (#33732)
- Add limit 1 if required first value from query result (#33672)
- Fix UI DAG counts including deleted DAGs (#33778)
- Fix cleaning zombie RESTARTING tasks (#33706)
- SECURITY_MANAGER_CLASS should be a reference to class, not a string (#33690)
- Add back get_url_for_login in security manager (#33660)
- Fix 2.7.0 db migration job errors (#33652)
- Set context inside templates (#33645)
- Treat dag-defined access_control as authoritative if defined (#33632)
- Bind engine before attempting to drop archive tables (#33622)
- Add a fallback in case no first name and last name are set (#33617)
- Sort data before groupby in TIS duration calculation (#33535)
- Stop adding values to rendered templates UI when there is no dagrun (#33516)
- Set strict to True when parsing dates in webserver views (#33512)
- Use dialect.name in custom SA types (#33503)
- Do not return ongoing dagrun when a end_date is less than utcnow (#33488)
- Fix a bug in formatDuration method (#33486)
- Make conf.set case insensitive (#33452)
- Allow timetable to slightly miss catchup cutoff (#33404)
- Respect soft_fail argument when poke is called (#33401)
- Create a new method used to resume the task in order to implement specific logic for operators (#33424)
- Fix DagFileProcessor interfering with dags outside its processor_subdir (#33357)
- Remove the unnecessary <br> text in Provider's view (#33326)
- Respect soft_fail argument when ExternalTaskSensor runs in deferrable mode (#33196)
- Fix handling of default value and serialization of Param class (#33141)
- Check if the dynamically-added index is in the table schema before adding (#32731)
- Fix rendering the mapped parameters when using expand_kwargs method (#32272)
- Fix dependencies for celery and opentelemetry for Python 3.8 (#33579)
Misc/Internal
- Bring back Pydantic 1 compatibility (#34081, #33998)
- Use a trimmed version of README.md for PyPI (#33637)
- Upgrade to Pydantic 2 (#33956)
- Reorganize devel_only extra in Airflow's setup.py (#33907)
- Bumping FAB to 4.3.4 in order to fix issues with filters (#33931)
- Add minimum requirement for sqlalchemy to 1.4.24 (#33892)
- Update version_added field for configs in config file (#33509)
- Replace OrderedDict with plain dict (#33508)
- Consolidate import and usage of itertools (#33479)
- Static check fixes (#33462)
- Import utc from datetime and normalize its import (#33450)
- D401 Support (#33352, #33339, #33337, #33336, #33335, #33333, #33338)
- Fix some missing type hints (#33334)
- D205 Support - Stragglers (#33301, #33298, #33297)
- Refactor: Simplify code (#33160, #33270, #33268, #33267, #33266, #33264, #33292, #33453, #33476, #33567, #33568, #33480, #33753, #33520, #33623)
- Fix Pydantic warning about orm_mode rename (#33220)
- Add MySQL 8.1 to supported versions. (#33576)
- Remove Pydantic limitation for version < 2 (#33507)
Doc only changes
- Add documentation explaining template_ext (and how to override it) (#33735)
- Explain how users can check if python code is top-level (#34006)
- Clarify that DAG authors can also run code in DAG File Processor (#33920)
- Fix broken link in Modules Management page (#33499)
- Fix secrets backend docs (#33471)
- Fix config description for base_log_folder (#33388)
Apache Airflow 2.7.0
Significant Changes
Remove Python 3.7 support (#30963)
As of now, Python 3.7 is no longer supported by the Python community.
Therefore, to use Airflow 2.7.0, you must ensure your Python version is
either 3.8, 3.9, 3.10, or 3.11.
Old Graph View is removed (#32958)
The old Graph View is removed. The new Graph View is the default view now.
The trigger UI form is skipped in web UI if no parameters are defined in a DAG (#33351)
If you are using dag_run.conf dictionary and web UI JSON entry to run your DAG you should either:
- Add params to your DAG (https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/params.html#use-params-to-provide-a-trigger-ui-form)
- Enable the new configuration show_trigger_form_if_no_params to bring back old behaviour
The "db init", "db upgrade" commands and "[database] load_default_connections" configuration options are deprecated (#33136).
Instead, you should use "airflow db migrate" command to create or upgrade database. This command will not create default connections.
In order to create default connections you need to run "airflow connections create-default-connections" explicitly,
after running "airflow db migrate".
In case of SMTP SSL connection, the context now uses the "default" context (#33070)
The "default" context is Python's default_ssl_context instead of previously used "none". The default_ssl_context provides a balance between security and compatibility but in some cases, when certificates are old, self-signed or misconfigured, it might not work. This can be configured by setting "ssl_context" in "email" configuration of Airflow.
Setting it to "none" brings back the "none" setting that was used in Airflow 2.6 and before, but it is not recommended due to security reasons as this setting disables validation of certificates and allows MITM attacks.
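The difference between the two settings can be sketched with the standard library alone. This is an illustration under assumptions, not Airflow's code: resolve_ssl_context is a hypothetical helper, and mapping "default" to ssl.create_default_context() is this sketch's reading of "Python's default context".

```python
import ssl
from typing import Optional


def resolve_ssl_context(setting: str) -> Optional[ssl.SSLContext]:
    """Map an ssl_context setting value to an SSL context (hypothetical helper)."""
    if setting == "default":
        # Verifies certificates and checks host names.
        return ssl.create_default_context()
    if setting == "none":
        # No verifying context is supplied; certificate validation is skipped.
        return None
    raise ValueError(f"unsupported ssl_context setting: {setting!r}")


ctx = resolve_ssl_context("default")
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
print(ctx.check_hostname)  # True
```

A context obtained this way could then be passed as the context argument of smtplib.SMTP_SSL when opening the connection.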
Disable default allowing the testing of connections in UI, API and CLI (#32052)
For security reasons, the test connection functionality is disabled by default across Airflow UI, API and CLI. The availability of the functionality can be controlled by the test_connection flag in the core section of the Airflow configuration (airflow.cfg). It can also be controlled by the environment variable AIRFLOW__CORE__TEST_CONNECTION.
The following values are accepted for this config param:
1. Disabled: Disables the test connection functionality and disables the Test Connection button in the UI. This is also the default value set in the Airflow configuration.
2. Enabled: Enables the test connection functionality and activates the Test Connection button in the UI.
3. Hidden: Disables the test connection functionality and hides the Test Connection button in UI.
For more information on capabilities of users, see the documentation:
https://airflow.apache.org/docs/apache-airflow/stable/security/security_model.html#capabilities-of-authenticated-ui-users
It is strongly advised to not enable the feature until you make sure that only
highly trusted UI/API users have "edit connection" permissions.
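The environment variable named above follows Airflow's AIRFLOW__{SECTION}__{KEY} naming pattern. A minimal sketch of deriving the name and preparing an environment for a child process follows; airflow_env_var is a hypothetical helper, not an Airflow function.

```python
import os


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name for a config option (hypothetical helper)."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


name = airflow_env_var("core", "test_connection")
print(name)  # AIRFLOW__CORE__TEST_CONNECTION

# Copy of the current environment with the option set to one of the
# accepted values listed above; pass it to a subprocess that starts Airflow.
env = dict(os.environ)
env[name] = "Hidden"
```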
The xcomEntries API disables support for the deserialize flag by default (#32176)
For security reasons, the /dags/*/dagRuns/*/taskInstances/*/xcomEntries/* API endpoint now disables the deserialize option to deserialize arbitrary XCom values in the webserver. For backward compatibility, server admins may set the [api] enable_xcom_deserialize_support config to True to enable the flag and restore backward compatibility.
However, it is strongly advised to not enable the feature, and perform
deserialization at the client side instead.
Change of the default Celery application name (#32526)
Default name of the Celery application changed from airflow.executors.celery_executor to airflow.providers.celery.executors.celery_executor.
You should change both your configuration and Health check command to use the new name:
- in configuration (celery_app_name configuration in celery section) use airflow.providers.celery.executors.celery_executor
- in your Health check command use airflow.providers.celery.executors.celery_executor.app
The default value for scheduler.max_tis_per_query is changed from 512 to 16 (#32572)
This change is expected to make the Scheduler more responsive.
scheduler.max_tis_per_query needs to be lower than core.parallelism.
If both were left to their default value previously, the effective default value of scheduler.max_tis_per_query was 32 (because it was capped at core.parallelism).
To keep the behavior as close as possible to the old config, one can set scheduler.max_tis_per_query = 0, in which case it'll always use the value of core.parallelism.
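The capping rule described above can be written out as a small calculation. This is a sketch of the arithmetic only, assuming the cap is a plain minimum; effective_max_tis_per_query is a hypothetical name, not an Airflow function.

```python
def effective_max_tis_per_query(max_tis_per_query: int, parallelism: int) -> int:
    """Return the value the scheduler would effectively use (illustrative only)."""
    if max_tis_per_query == 0:
        # 0 means "always use core.parallelism".
        return parallelism
    # Otherwise the setting is capped at core.parallelism.
    return min(max_tis_per_query, parallelism)


print(effective_max_tis_per_query(512, 32))  # 32 (old default, capped)
print(effective_max_tis_per_query(16, 32))   # 16 (new default)
print(effective_max_tis_per_query(0, 32))    # 32 (follows core.parallelism)
```

The first and third calls give the same result, which is why setting the option to 0 keeps behaviour closest to the old defaults.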
Some executors have been moved to corresponding providers (#32767)
In order to use the executors, you need to install the providers:
- for Celery executors you need to install apache-airflow-providers-celery package >= 3.3.0
- for Kubernetes executors you need to install apache-airflow-providers-cncf-kubernetes package >= 7.4.0
- For Dask executors you need to install apache-airflow-providers-daskexecutor package in any version
You can achieve it also by installing airflow with [celery], [cncf.kubernetes], [daskexecutor] extras respectively.
Users who base their images on the apache/airflow reference image (not slim) should be unaffected - the base reference image comes with all the three providers installed.
Improvement Changes
PostgreSQL only improvement: Added index on taskinstance table (#30762)
This index seems to have a great positive effect in a setup with tens of millions of such rows.
New Features
- Add OpenTelemetry to Airflow AIP-49
- Trigger Button - Implement Part 2 of AIP-50 (#31583)
- Removing Executor Coupling from Core Airflow AIP-51
- Automatic setup and teardown tasks AIP-52
- OpenLineage in Airflow AIP-53
- Experimental: Add a cache to Variable and Connection when called at dag parsing time (#30259)
- Enable pools to consider deferred tasks (#32709)
- Allows to choose SSL context for SMTP connection (#33070)
- New gantt tab (#31806)
- Load plugins from providers (#32692)
- Add BranchExternalPythonOperator (#32787, #33360)
- Add option for storing configuration description in providers (#32629)
- Introduce Heartbeat Parameter to Allow Per-LocalTaskJob Configuration (#32313)
- Add Executors discovery and documentation (#32532)
- Add JobState for job state constants (#32549)
- Add config to disable the 'deserialize' XCom API flag (#32176)
- Show task instance in web UI by custom operator name (#31852)
- Add default_deferrable config (#31712)
- Introducing AirflowClusterPolicySkipDag exception (#32013)
- Use reactflow for datasets graph (#31775)
- Add an option to load the dags from db for command tasks run (#32038)
- Add version of chain which doesn't require matched lists (#31927)
- Use operator_name instead of task_type in UI (#31662)
- Add --retry and --retry-delay to airflow db check (#31836)
- Allow skipped task state task_instance_schema.py (#31421)
- Add a new config for celery result_backend engine options (#30426)
- UI Add Cluster Activity Page (#31123, #32446)
- Adding keyboard shortcuts to common actions (#30950)
- Adding more information to kubernetes executor logs (#29929)
- Add support for configuring custom alembic file (#31415)
- Add running and failed status tab for DAGs on the UI (#30429)
- Add multi-select, proposals and labels for trigger form (#31441)
- Making webserver config customizable (#29926)
- Render DAGCode in the Grid View as a tab (#31113)
- Add rest endpoint to get option of configuration (#31056)
- Add section query param in get config rest API (#30936)
- Create metrics to track Scheduled->Queued->Running task state transition times (#30612)
- Mark Task Groups as Success/Failure (#30478)
- Add CLI command to list the provider trigger info (#30822)
- Add Fail Fast feature for DAGs (#29406)
Improvements
- Improve graph nesting logic (#33421)
- Configurable health check threshold for triggerer (#33089, #33084)
- add dag_run_ids and task_ids filter for the batch task instance API endpoint (#32705)
- Ensure DAG-level references are filled on unmap (#33083)
- Add support for arrays of different data types in the Trigger Form UI (#32734)
- Always show gantt and code tabs (#33029)
- Move listener success hook to after SQLAlchemy commit (#32988)
- Rename db upgrade to db migrate and add connections create-default-connections (#32810, #33136)
- Remove old gantt chart and redirect to grid views gantt tab (#32908)
- Adjust graph zoom based on selected task (#32792)
- Call listener on_task_instance_running after rendering templates (#32716)
- Display execution_date in graph view task instance tooltip. (#32527)
- Allow configuration to be contributed by providers (#32604, #32755, #32812)
- Reduce default for max TIs per query, enforce <= parallelism (#32572)
- Store config description in Airflow configuration object (#32669)
- Use isdisjoint instead of not intersection (#32616)
- Speed up calculation of leaves and roots for task groups (#32592)
- Kubernetes Executor Load Time Optimizations (#30727)
- Save DAG parsing time if dag is not schedulable ...
Apache Airflow 2.6.3
Bug Fixes
- Use linear time regular expressions (#32303)
- Fix triggerers alive check and add a new conf for triggerer heartbeat rate (#32123)
- Catch the exception that triggerer initialization failed (#31999)
- Hide sensitive values from extra in connection edit form (#32309)
- Sanitize DagRun.run_id and allow flexibility (#32293)
- Add triggerer canceled log (#31757)
- Fix try number shown in the task view (#32361)
- Retry transactions on occasional deadlocks for rendered fields (#32341)
- Fix behaviour of LazyDictWithCache when import fails (#32248)
- Remove executor_class from Job - fixing backfill for custom executors (#32219)
- Fix bugged singleton implementation (#32218)
- Use mapIndex to display extra links per mapped task. (#32154)
- Ensure that main triggerer thread exits if the async thread fails (#32092)
- Use re2 for matching untrusted regex (#32060)
- Render list items in rendered fields view (#32042)
- Fix hashing of dag_dependencies in serialized dag (#32037)
- Return None if an XComArg fails to resolve in a multiple_outputs Task (#32027)
- Check for DAG ID in query param from url as well as kwargs (#32014)
- Flash an error message instead of failure in rendered-templates when map index is not found (#32011)
- Fix ExternalTaskSensor when there is no task group TIs for the current execution date (#32009)
- Fix number param html type in trigger template (#31980, #31946)
- Fix masking nested variable fields (#31964)
- Fix operator_extra_links property serialization in mapped tasks (#31904)
- Decode old-style nested Xcom value (#31866)
- Add a check for trailing slash in webserver base_url (#31833)
- Fix connection uri parsing when the host includes a scheme (#31465)
- Fix database session closing with xcom_pull and inlets (#31128)
- Fix DAG's on_failure_callback is not invoked when task failed during testing dag. (#30965)
- Fix airflow module version check when using ExternalPythonOperator and debug logging level (#30367)
Misc/Internal
- Fix task.sensor annotation in type stub (#31954)
- Limit Pydantic to < 2.0.0 until we solve 2.0.0 incompatibilities (#32312)
- Fix Pydantic 2 pickiness about model definition (#32307)
Doc only changes
- Add explanation about tag creation and cleanup (#32406)
- Minor updates to docs (#32369, #32315, #32310, #31794)
- Clarify Listener API behavior (#32269)
- Add information for users who ask for requirements (#32262)
- Add links to DAGRun / DAG / Task in Templates Reference (#32245)
- Add comment to warn off a potential wrong fix (#32230)
- Add a note that we'll need to restart triggerer to reflect any trigger change (#32140)
- Adding missing hyperlink to the tutorial documentation (#32105)
- Added difference between Deferrable and Non-Deferrable Operators (#31840)
- Add comments explaining need for special "trigger end" log message (#31812)
- Documentation update on Plugin updates. (#31781)
- Fix SemVer link in security documentation (#32320)
- Update security model of Airflow (#32098)
- Update references to restructured documentation from Airflow core (#32282)
- Separate out advanced logging configuration (#32131)
- Add ™ to Airflow in prominent places (#31977)
Apache Airflow Helm Chart 1.10.0
Significant Changes
Default Airflow image is updated to 2.6.2 (#31979)
The default Airflow image that is used with the Chart is now 2.6.2, previously it was 2.5.3.
New Features
- Add support for container security context (#31043)
Improvements
- Validate executor and config.core.executor match (#30693)
- Support minAvailable property for PodDisruptionBudget (#30603)
- Add volumeMounts to dag processor waitForMigrations (#30990)
- Template extra volumes (#30773)
Bug Fixes
- Fix webserver probes timeout and period (#30609)
- Add missing waitForMigrations for workers (#31625)
- Add missing priorityClassName to K8S worker pod template (#31328)
- Adding log groomer sidecar to dag processor (#30726)
- Do not propagate global security context to statsd and redis (#31865)
Misc
Apache Airflow 2.6.2
Bug Fixes
- Cascade update of TaskInstance to TaskMap table (#31445)
- Fix Kubernetes executors detection of deleted pods (#31274)
- Use keyword parameters for migration methods for mssql (#31309)
- Control permissibility of driver config in extra from airflow.cfg (#31754)
- Fixing broken links in openapi/v1.yaml (#31619)
- Hide old alert box when testing connection with different value (#31606)
- Add TriggererStatus to OpenAPI spec (#31579)
- Resolving issue where Grid won't un-collapse when Details is collapsed (#31561)
- Fix sorting of tags (#31553)
- Add the missing map_index to the xcom key when skipping downstream tasks (#31541)
- Fix airflow users delete CLI command (#31539)
- Include triggerer health status in Airflow /health endpoint (#31529)
- Remove dependency already registered for this task warning (#31502)
- Use kube_client over default CoreV1Api for deleting pods (#31477)
- Ensure min backoff in base sensor is at least 1 (#31412)
- Fix max_active_tis_per_dagrun for Dynamic Task Mapping (#31406)
- Fix error handling when pre-importing modules in DAGs (#31401)
- Fix dropdown default and adjust tutorial to use 42 as default for proof (#31400)
- Fix crash when clearing run with task from normal to mapped (#31352)
- Make BaseJobRunner a generic on the job class (#31287)
- Fix url_for_asset fallback and 404 on DAG Audit Log (#31233)
- Don't present an undefined execution date (#31196)
- Added spinner activity while the logs load (#31165)
- Include rediss to the list of supported URL schemes (#31028)
- Optimize scheduler by skipping "non-schedulable" DAGs (#30706)
- Save scheduler execution time during search for queued dag_runs (#30699)
- Fix ExternalTaskSensor to work correctly with task groups (#30742)
- Fix DAG.access_control can't sync when clean access_control (#30340)
- Fix failing get_safe_url tests for latest Python 3.8 and 3.9 (#31766)
- Fix typing for POST user endpoint (#31767)
- Fix wrong update for nested group default args (#31776)
- Fix overriding default_args in nested task groups (#31608)
- Mark [secrets] backend_kwargs as a sensitive config (#31788)
- Executor events are not always "exited" here (#30859)
- Validate connection IDs (#31140)
Misc/Internal
- Add Python 3.11 support (#27264)
- Replace unicodecsv with standard csv library (#31693)
- Bring back unicodecsv as dependency of Airflow (#31814)
- Remove found_descendents param from get_flat_relative_ids (#31559)
- Fix typing in external task triggers (#31490)
- Wording the next and last run DAG columns better (#31467)
- Skip auto-document things with :meta private: (#31380)
- Add an example for sql_alchemy_connect_args conf (#31332)
- Convert dask upper-binding into exclusion (#31329)
- Upgrade FAB to 4.3.1 (#31203)
- Added metavar and choices to --state flag in airflow dags list-jobs CLI for suggesting valid state arguments. (#31308)
- Use only one line for tmp dir log (#31170)
- Rephrase comment in setup.py (#31312)
- Add fullname to owner on logging (#30185)
- Make connection id validation consistent across interface (#31282)
- Use single source of truth for sensitive config items (#31820)
Doc only changes
- Add docstring and signature for _read_remote_logs (#31623)
- Remove note about triggerer being 3.7+ only (#31483)
- Fix version support information (#31468)
- Add missing BashOperator import to documentation example (#31436)
- Fix task.branch error caused by incorrect initial parameter (#31265)
- Update callbacks documentation (errors and context) (#31116)
- Add an example for dynamic task mapping with non-TaskFlow operator (#29762)
- Few doc fixes - links, grammar and wording (#31719)
- Add description in a few more places about adding airflow to pip install (#31448)
- Fix table formatting in docker build documentation (#31472)
- Update documentation for constraints installation (#31882)
Apache Airflow 2.6.1
Significant Changes
Clarifications of the external Health Check mechanism and using Job classes (#31277).
In the past, SchedulerJob and other *Job classes are known to have been used to perform
external health checks for Airflow components. Those are, however, Airflow DB ORM related classes.
The DB models and database structure of Airflow are considered an internal implementation detail, following the
public interface (https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html).
Therefore, they should not be used for external health checks. Instead, you should use the
airflow jobs check CLI command (introduced in Airflow 2.1) for that purpose.
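As a rough illustration of an external probe built on the CLI rather than on the ORM classes, the sketch below shells out to airflow jobs check and treats a zero exit status as healthy. The helper names (build_jobs_check_command, is_healthy) are hypothetical, and the sketch assumes the airflow executable is on PATH in the environment where the probe runs.

```python
import subprocess
from typing import List, Optional


def build_jobs_check_command(job_type: str = "SchedulerJob",
                             hostname: Optional[str] = None) -> List[str]:
    """Build the argv for an `airflow jobs check` health probe (hypothetical helper)."""
    cmd = ["airflow", "jobs", "check", "--job-type", job_type]
    if hostname is not None:
        # Restrict the check to jobs running on one host.
        cmd += ["--hostname", hostname]
    return cmd


def is_healthy(cmd: List[str], timeout: float = 30.0) -> bool:
    """Return True only when the command runs and exits with status 0."""
    try:
        return subprocess.run(cmd, timeout=timeout, check=False).returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        # A missing executable or a hung command counts as unhealthy.
        return False
```

A container health check could call is_healthy(build_jobs_check_command()) and map the boolean to its own exit code.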
Bug Fixes
- Fix calculation of health check threshold for SchedulerJob (#31277)
- Fix timestamp parse failure for k8s executor pod tailing (#31175)
- Make sure that DAG processor job row has filled value in job_type column (#31182)
- Fix section name reference for api_client_retry_configuration (#31174)
- Ensure the KPO runs pod mutation hooks correctly (#31173)
- Remove worrying log message about redaction from the OpenLineage plugin (#31149)
- Move interleave_timestamp_parser config to the logging section (#31102)
- Ensure that we check worker for served logs if no local or remote logs found (#31101)
- Fix MappedTaskGroup import in taskinstance file (#31100)
- Format DagBag.dagbag_report() Output (#31095)
- Mask task attribute on task detail view (#31125)
- Fix template error when iterating None value and fix params documentation (#31078)
- Fix apache-hive extra so it installs the correct package (#31068)
- Fix issue with zip files in DAGs folder when pre-importing Airflow modules (#31061)
- Move TaskInstanceKey to a separate file to fix circular import (#31033, #31204)
- Fix deleting DagRuns and TaskInstances that have a note (#30987)
- Fix airflow providers get command output (#30978)
- Fix Pool schema in the OpenAPI spec (#30973)
- Add support for dynamic tasks with template fields that contain pandas.DataFrame (#30943)
- Use the Task Group explicitly passed to 'partial' if any (#30933)
- Fix order_by request in list DAG rest api (#30926)
- Include node height/width in center-on-task logic (#30924)
- Remove print from dag trigger command (#30921)
- Improve task group UI in new graph (#30918)
- Fix mapped states in grid view (#30916)
- Fix problem with displaying graph (#30765)
- Fix backfill KeyError when try_number out of sync (#30653)
- Re-enable clear and setting state in the TaskInstance UI (#30415)
- Prevent DagRun's state and start_date from being reset when clearing a task in a running DagRun (#30125)
Misc/Internal
- Upper bind dask until they solve a side effect in their test suite (#31259)
- Show task instances affected by clearing in a table (#30633)
- Fix missing models in API documentation (#31021)
Doc only changes
- Improve description of the dag_processing.processes metric (#30891)
- Improve Quick Start instructions (#30820)
- Add section about missing task logs to the FAQ (#30717)
- Mount the config directory in docker compose (#30662)
- Update version_added config field for might_contain_dag and metrics_allow_list (#30969)
Apache Airflow 2.6.0
Significant Changes
Default permissions of file task handler log directories and files have been changed to "owner + group" writeable (#29506).
The default setting handles the case where impersonation is needed and both users (airflow and the impersonated user)
have the same group set as their main group. Previously the default was also other-writeable; users can restore
the other-writeable setting if they wish by configuring file_task_handler_new_folder_permissions
and file_task_handler_new_file_permissions in the logging section.
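To make the difference between "owner + group" writeable and other-writeable concrete, here is a small sketch that inspects permission bits with the standard stat module. The octal values 0o775 and 0o664 are assumed to be the new defaults for folders and files; check them against the configuration reference for your version. The helper names are hypothetical.

```python
import stat

# Assumed new defaults: writeable by owner and group, read-only for others.
NEW_FOLDER_PERMISSIONS = 0o775
NEW_FILE_PERMISSIONS = 0o664


def is_group_writeable(mode: int) -> bool:
    """True when the group write bit is set in the mode."""
    return bool(mode & stat.S_IWGRP)


def is_other_writeable(mode: int) -> bool:
    """True when the 'other' write bit is set in the mode."""
    return bool(mode & stat.S_IWOTH)
```

Under these assumptions, a value such as 0o777 for folders or 0o666 for files in the logging section would bring back other-writeable behaviour.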
SLA callbacks no longer add files to the dag processor manager's queue (#30076)
This stops SLA callbacks from keeping the dag processor manager permanently busy. It means reduced CPU,
and fixes issues where SLAs stop the system from seeing changes to existing dag files. Additional metrics added to help track queue state.
The cleanup() method in BaseTrigger is now defined as asynchronous (following the async/await pattern) (#30152).
This is potentially a breaking change for any custom trigger implementations that override the cleanup()
method and use synchronous code. However, using synchronous operations in cleanup was technically wrong,
because the method was executed in the main loop of the Triggerer and introduced unnecessary delays
impacting other triggers. The change is unlikely to affect any existing trigger implementations.
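A minimal sketch of what an asynchronous cleanup() override might look like. SketchTrigger is a hypothetical stand-in so the example runs without Airflow installed; a real trigger would subclass BaseTrigger and also implement serialize() and run(). The _close_connection helper is likewise made up for illustration.

```python
import asyncio


class SketchTrigger:
    """Hypothetical stand-in for a custom trigger (not Airflow code)."""

    def __init__(self) -> None:
        self.closed = False

    async def _close_connection(self) -> None:
        # Placeholder for non-blocking teardown, e.g. closing an async client.
        await asyncio.sleep(0)
        self.closed = True

    async def cleanup(self) -> None:
        # Declared with `async def` and awaiting its teardown, so the event
        # loop stays free to service other triggers while this one finishes.
        await self._close_connection()


trigger = SketchTrigger()
asyncio.run(trigger.cleanup())
```

An override written as a plain def with blocking calls is the case the note above describes as affected.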
The gauge scheduler.tasks.running no longer exists (#30374)
The gauge has never been working and its value has always been 0. Having an accurate
value for this metric is complex so it has been decided that removing this gauge makes
more sense than fixing it with no certainty of the correctness of its value.
Consolidate handling of tasks stuck in queued under new task_queued_timeout config (#30375)
Logic for handling tasks stuck in the queued state has been consolidated, and all configurations
responsible for timing out stuck queued tasks have been deprecated and merged into
[scheduler] task_queued_timeout. The configurations that have been deprecated are
[kubernetes] worker_pods_pending_timeout, [celery] stalled_task_timeout, and
[celery] task_adoption_timeout. If any of these configurations are set, the longest timeout will be
respected. For example, if [celery] stalled_task_timeout is 1200, and [scheduler] task_queued_timeout
is 600, Airflow will set [scheduler] task_queued_timeout to 1200.
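The "longest timeout wins" rule can be sketched as follows. This is an illustration of the rule as described above, not Airflow's implementation; the function name and the (section, key) tuple representation of settings are assumptions made for the example.

```python
from typing import Mapping, Optional, Tuple

ConfigKey = Tuple[str, str]  # (section, option)

NEW_KEY: ConfigKey = ("scheduler", "task_queued_timeout")
DEPRECATED_KEYS = (
    ("kubernetes", "worker_pods_pending_timeout"),
    ("celery", "stalled_task_timeout"),
    ("celery", "task_adoption_timeout"),
)


def effective_task_queued_timeout(settings: Mapping[ConfigKey, float]) -> Optional[float]:
    """Return the longest timeout among the new option and any deprecated ones that are set."""
    candidates = [settings[key] for key in (NEW_KEY, *DEPRECATED_KEYS) if key in settings]
    return max(candidates) if candidates else None


# The example from the text: 1200 (deprecated celery option) beats 600.
example = effective_task_queued_timeout({
    ("celery", "stalled_task_timeout"): 1200,
    ("scheduler", "task_queued_timeout"): 600,
})
```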
Improvement Changes
Display only the running configuration in configurations view (#28892)
The configurations view now only displays the running configuration. Previously, the default configuration
was displayed at the top but it was not obvious whether this default configuration was overridden or not.
Subsequently, the non-documented endpoint /configuration?raw=true is deprecated and will be removed in
Airflow 3.0. The HTTP response now returns an additional Deprecation header. The /config endpoint on
the REST API is the standard way to fetch Airflow configuration programmatically.
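For programmatic access, a request to the REST API's config endpoint could be prepared along these lines. The sketch assumes the stable REST API is served under /api/v1, that basic authentication is enabled, and that the deployment allows exposing configuration; build_config_request is a hypothetical helper, and the request is only built here, not sent.

```python
import base64
import urllib.request


def build_config_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare a GET request for the REST API config endpoint (assumes basic auth)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/config",
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="GET",
    )


request = build_config_request("http://localhost:8080/", "admin", "admin")
```

Passing the prepared request to urllib.request.urlopen would perform the call against a running webserver.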
Explicit skipped states list for ExternalTaskSensor (#29933)
ExternalTaskSensor now has an explicit skipped_states list.
Miscellaneous Changes
Handle OverflowError on exponential backoff in next_run_calculation (#28172)
Maximum retry task delay is set to be 24h (86400s) by default. You can change it globally via the
core.max_task_retry_delay parameter.
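The idea behind the cap can be shown with a simplified sketch: exponential backoff grows until it either exceeds the maximum or overflows, and in both cases the maximum is used. This is not Airflow's actual retry calculation (which involves more than plain doubling); the function name and doubling formula are assumptions for illustration.

```python
MAX_TASK_RETRY_DELAY = 86400.0  # 24h, the default stated above


def capped_backoff(base_delay: float, try_number: int,
                   max_delay: float = MAX_TASK_RETRY_DELAY) -> float:
    """Double the delay for each retry, never exceeding max_delay."""
    try:
        delay = base_delay * (2.0 ** (try_number - 1))
    except OverflowError:
        # Float exponentiation overflows for very large try numbers;
        # fall back to the cap instead of failing.
        return max_delay
    return min(delay, max_delay)
```

With a base delay of 300 seconds, the first retries wait 300, 600 and 1200 seconds, and very high try numbers settle at the cap.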
Move Hive macros to the provider (#28538)
The Hive Macros (hive.max_partition, hive.closest_ds_partition) are available only when Hive Provider is
installed. Please install Hive Provider > 5.1.0 when using those macros.
New Features
- Skip PythonVirtualenvOperator task when it returns a provided exit code (#30690)
- rename skip_exit_code to skip_on_exit_code and allow providing multiple codes (#30692)
- Add skip_on_exit_code also to ExternalPythonOperator (#30738)
- Add max_active_tis_per_dagrun for Dynamic Task Mapping (#29094)
- Add serializer for pandas dataframe (#30390)
- Deferrable TriggerDagRunOperator (#30292)
- Add command to get DAG Details via CLI (#30432)
- Adding ContinuousTimetable and support for @continuous schedule_interval (#29909)
- Allow customized rules to check if a file has dag (#30104)
- Add a new Airflow conf to specify a SSL ca cert for Kubernetes client (#30048)
- Bash sensor has an explicit retry code (#30080)
- Add filter task upstream/downstream to grid view (#29885)
- Add testing a connection via Airflow CLI (#29892)
- Support deleting the local log files when using remote logging (#29772)
- Blocklist to disable specific metric tags or metric names (#29881)
- Add a new graph inside of the grid view (#29413)
- Add database check_migrations config (#29714)
- add output format arg for cli.dags.trigger (#29224)
- Make json and yaml available in templates (#28930)
- Enable tagged metric names for existing Statsd metric publishing events | influxdb-statsd support (#29093)
- Add arg --yes to db export-archived command. (#29485)
- Make the policy functions pluggable (#28558)
- Add airflow db drop-archived command (#29309)
- Enable individual trigger logging (#27758)
- Implement new filtering options in graph view (#29226)
- Add triggers for ExternalTask (#29313)
- Add command to export purged records to CSV files (#29058)
- Add FileTrigger (#29265)
- Emit DataDog statsd metrics with metadata tags (#28961)
- Add some statsd metrics for dataset (#28907)
- Add --overwrite option to connections import CLI command (#28738)
- Add general-purpose "notifier" concept to DAGs (#28569)
- Add a new conf to wait past_deps before skipping a task (#27710)
- Add Flink on K8s Operator (#28512)
- Allow Users to disable SwaggerUI via configuration (#28354)
- Show mapped task groups in graph (#28392)
- Log FileTaskHandler to work with KubernetesExecutor's multi_namespace_mode (#28436)
- Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs (#28239)
- List specific config section and its values using the cli (#28334)
- KubernetesExecutor multi_namespace_mode can use namespace list to avoid requiring cluster role (#28047)
- Automatically save and allow restore of recent DAG run configs (#27805)
- Added exclude_microseconds to cli (#27640)
Improvements
- Rename most pod_id usage to pod_name in KubernetesExecutor (#29147)
- Update the error message for invalid use of poke-only sensors (#30821)
- Update log level in scheduler critical section edge case (#30694)
- AIP-51 Removing Executor Coupling from Core Airflow (AIP-51: https://github.com/apache/airflow/pulls?q=is%3Apr+is%3Amerged+label%3AAIP-51+milestone%3A%22Airflow+2.6.0%22)
- Add multiple exit code handling in skip logic for BashOperator (#30739)
- Updated app to support configuring the caching hash method for FIPS v2 (#30675)
- Preload airflow imports before dag parsing to save time (#30495)
- Improve task & run actions UX in grid view (#30373)
- Speed up TaskGroups with caching property of group_id (#30284)
- Use the engine provided in the session (#29804)
- Type related import optimization for Executors (#30361)
- Add more type hints to the code base (#30503)
- Always use self.appbuilder.get_session in security managers (#30233)
- Update SQLAlchemy select() to new style (#30515)
- Refactor out xcom constants from models (#30180)
- Add exception class name to DAG-parsing error message (#30105)
- Rename statsd_allow_list and statsd_block_list to metrics_*_list (#30174)
- Improve serialization of tuples and sets (#29019)
- Make cleanup method in trigger an async one (#30152)
- Lazy load serialization modules (#30094)
- SLA callbacks no longer add files to the dag_processing manager queue (#30076)
- Add task.trigger rule to grid_data (#30130)
- Speed up log template sync by avoiding ORM (#30119)
- Separate cli_parser.py into two modules (#29962)
- Explicit skipped states list for ExternalTaskSensor (#29933)
- Add task state hover highlighting to new graph (#30100)
- Store grid tabs in url params (#29904)
- Use custom Connexion resolver to load lazily (#29992)
- Delay Kubernetes import in secret masker (#29993)
- Delay ConnectionModelView init until it's accessed (#29946)
- Scheduler, make stale DAG deactivation threshold configurable instead of using dag processing timeout (#29446)
- Improve grid view height calculations (#29563)
- Avoid importing executor during conf validation (#29569)
- Make permissions for FileTaskHandler group-writeable and configurable (#29506)
- Add colors in help outputs of Airflow CLI commands #28789 (#29116)
- Add a param for get_dags endpoint to list only unpaused dags (#28713)
- Expose updated_at filter for dag run and task instance endpoints (#28636)
- Increase length of user identifier columns (#29061)
- Update gantt chart UI to display queued state of tasks (#28686)
- Add index on log.dttm (#28944)
- Display only the running configuration in configurations view (#28892)
- Cap dropdown menu size dynamically (#28736)
- Added JSON linter to connection edit / add UI for field extra. On connection edit screen, existing extra data will be displayed indented (#28583)
- Use labels instead of pod name for pod log read in k8s exec (#28546)
- Use time not tries for queued & running re-checks. (#28586)
- CustomTTYColoredFormatter should inherit TimezoneAware formatter (#28439)
- Improve past depends handling in Airflow CLI tasks.run command (#28113)
- Support using a list of callbacks in on_*_callback/sla_miss_callbacks (#28469)
- Better table name validation for db clean (#28246)
- Use object instead of array in config.yml for config template (#28417)
- Add markdown rendering for task notes. (#28245)
- Show mapped task groups in grid view (#28208)
- Add renamed and ``pre...
Apache Airflow Helm Chart 1.9.0
Significant Changes
Default PgBouncer and PgBouncer Exporter images have been updated (#29919)
The PgBouncer and PgBouncer Exporter images are based on newer software/os. They are also multi-platform AMD/ARM images:
- pgbouncer: 1.16.1 based on alpine 3.14 (airflow-pgbouncer-2023.02.24-1.16.1)
- pgbouncer-exporter: 0.14.0 based on alpine 3.17 (apache/airflow:airflow-pgbouncer-exporter-2023.02.21-0.14.0)
Default Airflow image is updated to 2.5.3 (#30411)
The default Airflow image that is used with the Chart is now 2.5.3, previously it was 2.5.1.
New Features
- Add support for hostAliases for Airflow webserver and scheduler (#30051)
- Add support for annotations on StatsD Deployment and cleanup CronJob (#30126)
- Add support for annotations in logs PVC (#29270)
- Add support for annotations in extra ConfigMap and Secrets (#30303)
- Add support for pod annotations to PgBouncer (#30168)
- Add support for ttlSecondsAfterFinished on migrateDatabaseJob and createUserJob (#29314)
- Add support for using SHA digest of Docker images (#30214)
Improvements
- Template extra volumes in Helm Chart (#29357)
- Make Liveness/Readiness Probe timeouts configurable for PgBouncer Exporter (#29752)
- Enable individual trigger logging (#29482)
Bug Fixes
- Add config.kubernetes_executor to values (#29818)
- Block extra properties in image config (#30217)
- Remove replicas if KEDA is enabled (#29838)
- Mount kerberos.keytab to worker when enabled (#29526)
- Fix adding annotations for dag persistence PVC (#29622)
- Fix bitnami/postgresql default username and password (#29478)
- Add global volumes in pod template file (#29295)
- Add log groomer sidecar to triggerer service (#29392)
- Helm deployment fails when postgresql.nameOverride is used (#29214)
Doc only changes
- Add gitSync optional env description (#29378)
- Add webserver NodePort example (#29460)
- Include Rancher in Helm chart install instructions (#28416)
- Change RSA SSH host key to reflect update from Github (#30286)
Misc
- Update Airflow version to 2.5.3 (#30411)
- Switch to newer versions of PgBouncer and PgBouncer Exporter in chart (#29919)
- Reformat chart templates (#29917)
- Reformat chart templates part 2 (#29941)
- Reformat chart templates part 3 (#30312)
- Replace deprecated k8s registry references (#29938)
- Fix airflow_dags_mount formatting (#29296)
- Fix webserver.service.ports formatting (#29297)