Releases: apache/airflow
Apache Airflow 2.1.1
Bug Fixes
- Don't crash attempting to mask secrets in dict with non-string keys (#16601)
- Always install sphinx_airflow_theme from `PyPI` (#16594)
- Remove limitation for elasticsearch library (#16553)
- Adding extra requirements for build and runtime of the PROD image. (#16170)
- Cattrs 1.7.0 released by the end of May 2021 break lineage usage (#16173)
- Removes unnecessary packages from setup_requires (#16139)
- Pins docutils to <0.17 until breaking behaviour is fixed (#16133)
- Improvements for Docker Image docs (#14843)
- Ensure that `dag_run.conf` is a dict (#15057)
- Fix CLI connections import and migrate logic from secrets to Connection model (#15425)
- Fix Dag Details start date bug (#16206)
- Fix DAG run state not updated while DAG is paused (#16343)
- Allow null value for operator field in task_instance schema(REST API) (#16516)
- Avoid recursion going too deep when redacting logs (#16491)
- Backfill: Don't create a DagRun if no tasks match task regex (#16461)
- Tree View UI for larger DAGs & more consistent spacing in Tree View (#16522)
- Correctly handle None returns from Query.scalar() (#16345)
- Adding `only_active` parameter to /dags endpoint (#14306)
- Don't show stale Serialized DAGs if they are deleted in DB (#16368)
- Make REST API List DAGs endpoint consistent with UI/CLI behaviour (#16318)
- Support remote logging in elasticsearch with `filebeat 7` (#14625)
- Queue tasks with higher priority and earlier execution_date first. (#15210)
- Make task ID on legend have enough width and width of line chart to be 100%. (#15915)
- Fix normalize-url vulnerability (#16375)
- Validate retries value on init for better errors (#16415)
- add num_runs query param for tree refresh (#16437)
- Fix templated default/example values in config ref docs (#16442)
- Add `passphrase` and `private_key` to default sensitive field names (#16392)
- Fix tasks in an infinite slots pool were never scheduled (#15247)
- Fix Orphaned tasks stuck in CeleryExecutor as running (#16550)
- Don't fail to log if we can't redact something (#16118)
- Set max tree width to 1200 pixels (#16067)
- Fill the "job_id" field for `airflow task run` without `--local`/`--raw` for KubeExecutor (#16108)
- Fixes problem where conf variable was used before initialization (#16088)
- Fix apply defaults for task decorator (#16085)
- Parse recently modified files even if just parsed (#16075)
- Ensure that we don't try to mask empty string in logs (#16057)
- Don't die when masking `log.exception` when there is no exception (#16047)
- Restores apply_defaults import in base_sensor_operator (#16040)
- Fix auto-refresh in tree view When webserver ui is not in `/` (#16018)
- Fix dag.clear() to set multiple dags to running when necessary (#15382)
- Fix Celery executor getting stuck randomly because of reset_signals in multiprocessing (#15989)
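
Several of the fixes above concern log redaction: dicts with non-string keys (#16601), recursion going too deep (#16491), and empty strings (#16057). The sketch below illustrates those three guards in plain Python. It is not Airflow's masker; `redact`, `SENSITIVE_KEYS`, and `MAX_DEPTH` are hypothetical names chosen for this example.

```python
from typing import Any, Set

SENSITIVE_KEYS = {"password", "passphrase", "private_key"}  # illustrative subset
MAX_DEPTH = 5  # hypothetical cap on how deep the sketch descends


def redact(value: Any, secrets: Set[str], depth: int = 0) -> Any:
    """Return a copy of ``value`` with sensitive data replaced by '***'."""
    if depth > MAX_DEPTH:
        # Stop descending rather than risk a RecursionError. This sketch
        # returns the value unchanged; a real masker may choose differently.
        return value
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            # Keys may be ints, tuples, etc.; only string keys are name-checked.
            if isinstance(key, str) and key.lower() in SENSITIVE_KEYS:
                out[key] = "***"
            else:
                out[key] = redact(item, secrets, depth + 1)
        return out
    if isinstance(value, list):
        return [redact(item, secrets, depth + 1) for item in value]
    if isinstance(value, tuple):
        return tuple(redact(item, secrets, depth + 1) for item in value)
    if isinstance(value, str):
        for secret in secrets:
            if secret:  # never try to mask the empty string
                value = value.replace(secret, "***")
        return value
    return value
```

Skipping the empty string matters because `str.replace("", "***")` would insert the mask between every character of the log line.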
Apache Airflow Upgrade Check 1.4.0
- Add `conf` not importable from airflow rule (#14400)
- Upgrade rule to suggest rename `[scheduler] max_threads` to `[scheduler] parsing_processes` (#14913)
- Fix running "upgrade_check" command in a PTY. (#14977)
- Skip `DatabaseVersionCheckRule` check if invalid version is detected (#15122)
- Fix too specific parsing of `False` in `LegacyUIDeprecated` (#14967)
- Fix false positives when inheriting classes that inherit `DbApiHook` (#16543)
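
The rename rule from #14913 can be pictured as a lookup over a parsed config file. The following is a minimal sketch using only the standard library, not the upgrade-check package's own code; `RENAMED` and `find_renamed_options` are hypothetical names.

```python
import configparser
from typing import List

# Old (section, option) -> new (section, option); only the rename named above.
RENAMED = {("scheduler", "max_threads"): ("scheduler", "parsing_processes")}


def find_renamed_options(cfg_text: str) -> List[str]:
    """Return one suggestion per deprecated option present in ``cfg_text``."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    suggestions = []
    for (old_sec, old_opt), (new_sec, new_opt) in RENAMED.items():
        # has_option() returns False when the section itself is missing.
        if parser.has_option(old_sec, old_opt):
            suggestions.append(
                f"[{old_sec}] {old_opt} -> [{new_sec}] {new_opt}"
            )
    return suggestions
```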
Apache Airflow Helm Chart 1.0.0
This is the first release of the Official Helm Chart.
Apache Airflow 2.1.0rc2
- Add ``PythonVirtualenvDecorator`` to Taskflow API (#14761)
- Add ``Taskgroup`` decorator (#15034)
- Create a DAG Calendar View (#15423)
- Create cross-DAG dependencies view (#13199)
- Add rest API to query for providers (#13394)
- Mask passwords and sensitive info in task logs and UI (#15599)
- Add ``SubprocessHook`` for running commands from operators (#13423)
- Add DAG Timeout in UI page "DAG Details" (#14165)
- Add ``WeekDayBranchOperator`` (#13997)
- Add JSON linter to DAG Trigger UI (#13551)
- Add DAG Description Doc to Trigger UI Page (#13365)
- Add airflow webserver URL into SLA miss email. (#13249)
- Add read only REST API endpoints for users (#14735)
- Add files to generate Airflow's Python SDK (#14739)
- Add dynamic fields to snowflake connection (#14724)
- Add read only REST API endpoint for roles and permissions (#14664)
- Add new datetime branch operator (#11964)
- Add Google leveldb hook and operator (#13109) (#14105)
- Add plugins endpoint to the REST API (#14280)
- Add ``worker_pod_pending_timeout`` support (#15263)
- Add support for labeling DAG edges (#15142)
- Add CUD REST API endpoints for Roles (#14840)
- Import connections from a file (#15177)
- A bunch of ``template_fields_renderers`` additions (#15130)
- Add REST API query sort and order to some endpoints (#14895)
- Add timezone context in new ui (#15096)
- Add query mutations to new UI (#15068)
- Add different modes to sort dag files for parsing (#15046)
- Auto refresh on Tree View (#15474)
- BashOperator to raise ``AirflowSkipException`` on exit code 99 (by default, configurable) (#13421) (#14963)
- Clear tasks by task ids in REST API (#14500)
- Support jinja2 native Python types (#14603)
- Allow celery workers without gossip or mingle modes (#13880)
- Add ``airflow jobs check`` CLI command to check health of jobs (Scheduler etc) (#14519)
- Rename ``DateTimeBranchOperator`` to ``BranchDateTimeOperator`` (#14720)
- Add optional result handler callback to ``DbApiHook`` (#15581)
- Update Flask App Builder limit to recently released 3.3 (#15792)
- Prevent creating flask sessions on REST API requests (#15295)
- Sync DAG specific permissions when parsing (#15311)
- Increase maximum length of pool name on Tasks to 256 characters (#15203)
- Enforce READ COMMITTED isolation when using mysql (#15714)
- Auto-apply ``apply_default`` to subclasses of ``BaseOperator`` (#15667)
- Emit error on duplicated DAG ID (#15302)
- Update ``KubernetesExecutor`` pod templates to allow access to IAM permissions (#15669)
- More verbose logs when running ``airflow db check-migrations`` (#15662)
- When one_success mark task as failed if no success (#15467)
- Add an option to trigger a dag w/o changing conf (#15591)
- Add Airflow UI instance_name configuration option (#10162)
- Add a decorator to retry functions with DB transactions (#14109)
- Add return to PythonVirtualenvOperator's execute method (#14061)
- Add verify_ssl config for kubernetes (#13516)
- Add description about ``secret_key`` when Webserver > 1 (#15546)
- Add Traceback in LogRecord in ``JSONFormatter`` (#15414)
- Add support for arbitrary json in conn uri format (#15100)
- Adds description field in variable (#12413) (#15194)
- Add logs to show last modified in SFTP, FTP and Filesystem sensor (#15134)
- Execute ``on_failure_callback`` when SIGTERM is received (#15172)
- Allow hiding of all edges when highlighting states (#15281)
- Display explicit error in case UID has no actual username (#15212)
- Serve logs with Scheduler when using Local or Sequential Executor (#15557)
- Deactivate trigger, refresh, and delete controls on dag detail view. (#14144)
- Turn off autocomplete for connection forms (#15073)
- Increase default ``worker_refresh_interval`` to ``6000`` seconds (#14970)
- Only show User's local timezone if it's not UTC (#13904)
- Suppress LOG/WARNING for a few tasks CLI for better CLI experience (#14567)
- Configurable API response (CORS) headers (#13620)
- Allow viewers to see all docs links (#14197)
- Update Tree View date ticks (#14141)
- Make the tooltip to Pause / Unpause a DAG clearer (#13642)
- Warn about precedence of env var when getting variables (#13501)
- Move ``[celery] default_queue`` config to ``[operators] default_queue`` to re-use between executors (#14699)
- Fix 500 error from ``updateTaskInstancesState`` API endpoint when ``dry_run`` not passed (#15889)
- Ensure that task preceding a PythonVirtualenvOperator doesn't fail (#15822)
- Prevent mixed case env vars from crashing processes like worker (#14380)
- Fixed type annotations in DAG decorator (#15778)
- Fix on_failure_callback when task receive SIGKILL (#15537)
- Fix dags table overflow (#15660)
- Fix changing the parent dag state on subdag clear (#15562)
- Fix reading from zip package to default to text (#13962)
- Fix wrong parameter for ``drawDagStatsForDag`` in dags.html (#13884)
- Fix QueuedLocalWorker crashing with EOFError (#13215)
- Fix typo in ``NotPreviouslySkippedDep`` (#13933)
- Fix parallelism after KubeExecutor pod adoption (#15555)
- Fix kube client on mac with keepalive enabled (#15551)
- Fixes wrong limit for dask for python>3.7 (should be <3.7) (#15545)
- Fix Task Adoption in ``KubernetesExecutor`` (#14795)
- Fix timeout when using XCom with ``KubernetesPodOperator`` (#15388)
- Fix deprecated provider aliases in "extras" not working (#15465)
- Fixed default XCom deserialization. (#14827)
- Fix used_group_ids in ``dag.partial_subset`` (#13700) (#15308)
- Further fix trimmed ``pod_id`` for ``KubernetesPodOperator`` (#15445)
- Bugfix: Invalid name when trimmed `pod_id` ends with hyphen in ``KubernetesPodOperator`` (#15443)
- Fix incorrect slots stats when TI ``pool_slots > 1`` (#15426)
- Fix DAG last run link (#15327)
- Fix ``sync-perm`` to work correctly when update_fab_perms = False (#14847)
- Fixes limits on Arrow for plexus test (#14781)
- Fix UI bugs in tree view (#14566)
- Fix AzureDataFactoryHook failing to instantiate its connection (#14565)
- Fix permission error on non-POSIX filesystem (#13121)
- Fix spelling in "ignorable" (#14348)
- Fix get_context_data doctest import (#14288)
- Correct typo in ``GCSObjectsWtihPrefixExistenceSensor`` (#14179)
- Fix order of failed deps (#14036)
- Fix critical ``CeleryKubernetesExecutor`` bug (#13247)
- Fix four bugs in ``StackdriverTaskHandler`` (#13784)
- ``func.sum`` may return ``Decimal`` that break rest APIs (#15585)
- Persist tags params in pagination (#15411)
- API: Raise ``AlreadyExists`` exception when the ``execution_date`` is same (#15174)
- Remove duplicate call to ``sync_metadata`` inside ``DagFileProcessorManager`` (#15121)
- Extra ``docker-py`` update to resolve docker op issues (#15731)
- Ensure executors end method is called (#14085)
- Remove ``user_id`` from API schema (#15117)
- Prevent clickable bad links on disabled pagination (#15074)
- Acquire lock on db for the time of migration (#10151)
- Skip SLA check only if SLA is None (#14064)
- Print right version in airflow info command (#14560)
- Make ``airflow info`` work with pipes (#14528)
- Rework client-side script for connection form. (#14052)
- API: Add ``CollectionInfo`` in all Collections that have ``total_entries`` (#14366)
- Fix ``task_instance_mutation_hook`` when importing airflow.models.dagrun (#15851)
- Fix docstring of SqlSensor (#15466)
- Small changes on "DAGs and Tasks documentation" (#14853)
- Add note on changes to configuration options (#15696)
- Add docs to the ``markdownlint`` and ``yamllint`` config files (#15682)
- Rename old "Experimental" API to deprecated in the docs. (#15653)
- Fix documentation error in `git_sync_template.yaml` (#13197)
- Fix doc link permission name (#14972)
- Fix link to Helm chart docs (#14652)
- Fix docstrings for Kubernetes code (#14605)
- docs: Capitalize & minor fixes (#14283) (#14534)
- Fixed reading from zip package to default to text. (#13984)
- An initial rework of the "Concepts" docs (#15444)
- Improve docstrings for various modules (#15047)
- Add documentation on database connection URI (#14124)
- Add Helm Chart logo to docs index (#14762)
- Create a new documentation package for Helm Chart (#14643)
- Add docs about supported logging levels (#14507)
- Update docs about tableau and salesforce provider (#14495)
- Replace deprecated doc links to the correct one (#14429)
- Refactor redundant doc url logic to use utility (#14080)
- docs: NOTICE: Updated 2016-2019 to 2016-now (#14248)
- Skip DAG perm sync during parsing if possible (#15464)
- Add picture and examples for Edge Labels (#15310)
- Add example DAG & how-to guide for sqlite (#13196)
- Add links to new modules for deprecated modules (#15316)
- Add note in Updating.md about FAB data model change (#14478)
- Fix ``logging.exception`` redundancy (#14823)
- Bump ``stylelint`` to remove vulnerable sub-dependency (#15784)
- Add resolution to force dependencies to use patched version of lodash (#15777)
- Update croniter to 1.0.x series (#15769)
- Get rid of Airflow 1.10 in Breeze (#15712)
- Run helm chart tests in parallel (#15706)
- Bump ``ssri`` from 6.0.1 to 6.0.2 in /airflow/www (#15437)
- Remove the limit on Gunicorn dependency (#15611)
- Better "dependency already registered" warning message for tasks #14613 (#14860)
- Pin pandas-gbq to <0.15.0 (#15114)
- Use Pip 21.* to install airflow officially (#15513)
- Bump mysqlclient to support the 1.4.x and 2.x series (#14978)
- Finish refactor of DAG resource name helper (#15511)
- Refactor/Cleanup Presentation of Graph Task and Path Highlighting (#15257)
- Standardize default fab perms (#14946)
- Remove ``datepicker`` for task instance detail view (#15284)
- Turn provider's import warnings into debug logs (#14903)
- Remove left-over fields from required in provider_info schema. (#14119)
- Deprecate ``tableau`` extra (#13595)
- Use built-in `cached_property` on Python 3.8 where possible (#14606)
- Clean-up JS code in UI templates (#14019)
- Bump elliptic from 6.5.3 to 6.5.4 in /airflow/www (#14668)
- Switch to f-strings using ``flynt``. (#13732)
- use ``jquery`` ready instead of vanilla js (#15258)
- Migrate task instance log (ti_log) js (#15309)
- Migrate graph js (#15307)
- Migrate dags.html javascript (#14692)
- Removes unnecessary AzureContainerInstance connection type (#15514)
- Separate Kubernetes pod_launcher from core airflow (#15165)
- update remaining old import paths of operators (#15127)
- Remove broken and undocumented "demo mode" feature (#14601)
- Simplify configuration/legibility of ``Webpack`` entries (#14551)
- remove inline tree js (#14552)
- Js linting and inline migration for simple scripts (#14215)
- Remove use of repeated constant in AirflowConfigParser (#14023)
- Deprecate email credentials from environment variables. (#13601)
- Remove unused 'context' variable in task_instance.py (#14049)
- Disable suppress_logs_and_warning in cli when debugging (#13180)
Apache Airflow 2.1.0rc1
New Features
- Add `PythonVirtualenvDecorator` to Taskflow API (#14761)
- Add `Taskgroup` decorator (#15034)
- Create a DAG Calendar View (#15423)
- Create cross-DAG dependencies view (#13199)
- Add rest API to query for providers (#13394)
- Mask passwords and sensitive info in task logs and UI (#15599)
- Add `SubprocessHook` for running commands from operators (#13423)
- Add DAG Timeout in UI page "DAG Details" (#14165)
- Add `WeekDayBranchOperator` (#13997)
- Add JSON linter to DAG Trigger UI (#13551)
- Add DAG Description Doc to Trigger UI Page (#13365)
- Add airflow webserver URL into SLA miss email. (#13249)
- Add read only REST API endpoints for users (#14735)
- Add files to generate Airflow's Python SDK (#14739)
- Add dynamic fields to snowflake connection (#14724)
- Add read only REST API endpoint for roles and permissions (#14664)
- Add new datetime branch operator (#11964)
- Add Google leveldb hook and operator (#13109) (#14105)
- Add plugins endpoint to the REST API (#14280)
- Add `worker_pod_pending_timeout` support (#15263)
- Add support for labeling DAG edges (#15142)
- Add CUD REST API endpoints for Roles (#14840)
- Import connections from a file (#15177)
- A bunch of `template_fields_renderers` additions (#15130)
- Add REST API query sort and order to some endpoints (#14895)
- Add timezone context in new ui (#15096)
- Add query mutations to new UI (#15068)
- Add different modes to sort dag files for parsing (#15046)
- Auto refresh on Tree View (#15474)
- BashOperator to raise `AirflowSkipException` on exit code 99 (by default, configurable) (#13421) (#14963)
- Clear tasks by task ids in REST API (#14500)
- Support jinja2 native Python types (#14603)
- Allow celery workers without gossip or mingle modes (#13880)
- Add `airflow jobs check` CLI command to check health of jobs (Scheduler etc) (#14519)
- Rename `DateTimeBranchOperator` to `BranchDateTimeOperator` (#14720)
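
The "skip on exit code 99" behaviour listed above can be illustrated without Airflow. In this minimal sketch, `SkipTask`, `TaskFailed`, `run_command`, and the `skip_exit_code` parameter are hypothetical stand-ins and not the operator's real implementation: a designated exit code means "skip", any other non-zero code means "fail".

```python
import subprocess
import sys
from typing import List, Optional


class SkipTask(Exception):
    """Hypothetical stand-in for a 'mark this task as skipped' signal."""


class TaskFailed(Exception):
    """Hypothetical stand-in for a task failure."""


def run_command(args: List[str], skip_exit_code: Optional[int] = 99) -> int:
    """Run ``args`` and translate the exit code into skip, fail, or success."""
    result = subprocess.run(args)
    if skip_exit_code is not None and result.returncode == skip_exit_code:
        raise SkipTask(f"command exited with {result.returncode}; skipping")
    if result.returncode != 0:
        raise TaskFailed(f"command exited with {result.returncode}")
    return result.returncode


if __name__ == "__main__":
    try:
        run_command([sys.executable, "-c", "import sys; sys.exit(99)"])
    except SkipTask as exc:
        print(f"skipped: {exc}")
```

Passing `skip_exit_code=None` in this sketch disables the special case, so exit code 99 is treated like any other failure.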
Improvements
- Add optional result handler callback to `DbApiHook` (#15581)
- Update Flask App Builder limit to recently released 3.3 (#15792)
- Prevent creating flask sessions on REST API requests (#15295)
- Sync DAG specific permissions when parsing (#15311)
- Increase maximum length of pool name on Tasks to 256 characters (#15203)
- Enforce READ COMMITTED isolation when using mysql (#15714)
- Auto-apply `apply_default` to subclasses of `BaseOperator` (#15667)
- Emit error on duplicated DAG ID (#15302)
- Update `KubernetesExecutor` pod templates to allow access to IAM permissions (#15669)
- More verbose logs when running `airflow db check-migrations` (#15662)
- When one_success mark task as failed if no success (#15467)
- Add an option to trigger a dag w/o changing conf (#15591)
- Add Airflow UI instance_name configuration option (#10162)
- Add a decorator to retry functions with DB transactions (#14109)
- Add return to PythonVirtualenvOperator's execute method (#14061)
- Add verify_ssl config for kubernetes (#13516)
- Add description about `secret_key` when Webserver > 1 (#15546)
- Add Traceback in LogRecord in `JSONFormatter` (#15414)
- Add support for arbitrary json in conn uri format (#15100)
- Adds description field in variable (#12413) (#15194)
- Add logs to show last modified in SFTP, FTP and Filesystem sensor (#15134)
- Execute `on_failure_callback` when SIGTERM is received (#15172)
- Allow hiding of all edges when highlighting states (#15281)
- Display explicit error in case UID has no actual username (#15212)
- Serve logs with Scheduler when using Local or Sequential Executor (#15557)
- Deactivate trigger, refresh, and delete controls on dag detail view. (#14144)
- Turn off autocomplete for connection forms (#15073)
- Increase default `worker_refresh_interval` to `6000` seconds (#14970)
- Only show User's local timezone if it's not UTC (#13904)
- Suppress LOG/WARNING for a few tasks CLI for better CLI experience (#14567)
- Configurable API response (CORS) headers (#13620)
- Allow viewers to see all docs links (#14197)
- Update Tree View date ticks (#14141)
- Make the tooltip to Pause / Unpause a DAG clearer (#13642)
- Warn about precedence of env var when getting variables (#13501)
- Move `[celery] default_queue` config to `[operators] default_queue` to re-use between executors (#14699)
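
The retry decorator for DB transactions (#14109) follows a common pattern. A generic sketch is shown here; `retry` and its parameters are hypothetical, and a real decorator for database work would presumably also roll back the session between attempts.

```python
import functools
import time
from typing import Callable, Tuple, Type


def retry(
    times: int = 3,
    delay: float = 0.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
) -> Callable:
    """Retry the wrapped function up to ``times`` attempts on ``exceptions``."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == times:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)

        return wrapper

    return decorator
```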
Bug Fixes
- Fix 500 error from `updateTaskInstancesState` API endpoint when `dry_run` not passed (#15889)
- Ensure that task preceding a PythonVirtualenvOperator doesn't fail (#15822)
- Prevent mixed case env vars from crashing processes like worker (#14380)
- Fixed type annotations in DAG decorator (#15778)
- Fix on_failure_callback when task receive SIGKILL (#15537)
- Fix dags table overflow (#15660)
- Fix changing the parent dag state on subdag clear (#15562)
- Fix reading from zip package to default to text (#13962)
- Fix wrong parameter for `drawDagStatsForDag` in dags.html (#13884)
- Fix QueuedLocalWorker crashing with EOFError (#13215)
- Fix typo in `NotPreviouslySkippedDep` (#13933)
- Fix parallelism after KubeExecutor pod adoption (#15555)
- Fix kube client on mac with keepalive enabled (#15551)
- Fixes wrong limit for dask for python>3.7 (should be <3.7) (#15545)
- Fix Task Adoption in `KubernetesExecutor` (#14795)
- Fix timeout when using XCom with `KubernetesPodOperator` (#15388)
- Fix deprecated provider aliases in "extras" not working (#15465)
- Fixed default XCom deserialization. (#14827)
- Fix used_group_ids in `dag.partial_subset` (#13700) (#15308)
- Further fix trimmed `pod_id` for `KubernetesPodOperator` (#15445)
- Bugfix: Invalid name when trimmed `pod_id` ends with hyphen in `KubernetesPodOperator` (#15443)
- Fix incorrect slots stats when TI `pool_slots > 1` (#15426)
- Fix DAG last run link (#15327)
- Fix `sync-perm` to work correctly when update_fab_perms = False (#14847)
- Fixes limits on Arrow for plexus test (#14781)
- Fix UI bugs in tree view (#14566)
- Fix AzureDataFactoryHook failing to instantiate its connection (#14565)
- Fix permission error on non-POSIX filesystem (#13121)
- Fix spelling in "ignorable" (#14348)
- Fix get_context_data doctest import (#14288)
- Correct typo in `GCSObjectsWtihPrefixExistenceSensor` (#14179)
- Fix order of failed deps (#14036)
- Fix critical `CeleryKubernetesExecutor` bug (#13247)
- Fix four bugs in `StackdriverTaskHandler` (#13784)
- `func.sum` may return `Decimal` that break rest APIs (#15585)
- Persist tags params in pagination (#15411)
- API: Raise `AlreadyExists` exception when the `execution_date` is same (#15174)
- Remove duplicate call to `sync_metadata` inside `DagFileProcessorManager` (#15121)
- Extra `docker-py` update to resolve docker op issues (#15731)
- Ensure executors end method is called (#14085)
- Remove `user_id` from API schema (#15117)
- Prevent clickable bad links on disabled pagination (#15074)
- Acquire lock on db for the time of migration (#10151)
- Skip SLA check only if SLA is None (#14064)
- Print right version in airflow info command (#14560)
- Make `airflow info` work with pipes (#14528)
- Rework client-side script for connection form. (#14052)
- API: Add `CollectionInfo` in all Collections that have `total_entries` (#14366)
- Fix `task_instance_mutation_hook` when importing airflow.models.dagrun (#15851)
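
The `Decimal` entry above (#15585) reflects a general Python pitfall: the standard `json` module refuses to serialize `decimal.Decimal`. The sketch below shows the failure and one possible conversion. How the fix was actually implemented is not stated in these notes, and `decimal_default` and `dumps` are hypothetical helpers.

```python
import json
from decimal import Decimal
from typing import Any


def decimal_default(obj: Any) -> Any:
    """Fallback for json.dumps: convert Decimal to float, reject the rest."""
    if isinstance(obj, Decimal):
        return float(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def dumps(payload: Any) -> str:
    """Serialize ``payload``, tolerating Decimal values such as SQL SUM results."""
    return json.dumps(payload, default=decimal_default)


if __name__ == "__main__":
    row = {"occupied_slots": Decimal("3")}
    try:
        json.dumps(row)  # plain json.dumps cannot handle Decimal
    except TypeError as exc:
        print(f"without a default hook: {exc}")
    print(dumps(row))
```

Converting to `float` loses precision for very large or very precise values; for integer counts, converting with `int` would be the safer choice.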
Doc only changes
- Fix docstring of SqlSensor (#15466)
- Small changes on "DAGs and Tasks documentation" (#14853)
- Add note on changes to configuration options (#15696)
- Add docs to the `markdownlint` and `yamllint` config files (#15682)
- Rename old "Experimental" API to deprecated in the docs. (#15653)
- Fix documentation error in `git_sync_template.yaml` (#13197)
- Fix doc link permission name (#14972)
- Fix link to Helm chart docs (#14652)
- Fix docstrings for Kubernetes code (#14605)
- docs: Capitalize & minor fixes (#14283) (#14534)
- Fixed reading from zip package to default to text. (#13984)
- An initial rework of the "Concepts" docs (#15444)
- Improve docstrings for various modules (#15047)
- Add documentation on database connection URI (#14124)
- Add Helm Chart logo to docs index (#14762)
- Create a new documentation package for Helm Chart (#14643)
- Add docs about supported logging levels (#14507)
- Update docs about tableau and salesforce provider (#14495)
- Replace deprecated doc links to the correct one (#14429)
- Refactor redundant doc url logic to use utility (#14080)
- docs: NOTICE: Updated 2016-2019 to 2016-now (#14248)
- Skip DAG perm sync during parsing if possible (#15464)
- Add picture and examples for Edge Labels (#15310)
- Add example DAG & how-to guide for sqlite (#13196)
- Add links to new modules for deprecated modules (#15316)
- Add note in Updating.md about FAB data model change (#14478)
Misc/Internal
- Fix `logging.exception` redundancy (#14823)
- Bump `stylelint` to remove vulnerable sub-dependency (#15784)
- Add resolution to force dependencies to use patched version of lodash (#15777)
- Update croniter to 1.0.x series (#15769)
- Get rid of Airflow 1.10 in Breeze (#15712)
- Run helm chart tests in parallel (#15706)
- Bump `ssri` from 6.0.1 to 6.0.2 in /airflow/www (#15437)
- Remove the limit on Gunicorn dependency (#15611)
- Better "dependency already registered" warning message for tasks #14613 (#14860)
- Pin pandas-gbq to <0.15.0 (#15114)
- Use Pip 21.* to install airflow officially (#15513)
- Bump mysqlclient to support the 1.4.x and 2.x series (#14978)
- Finish refactor of DAG resource name helper (#15511)
- Refactor/Cleanup Presentation of Graph Task and Path Highlighting (#15257)
- Standardize default fab perms (#14946)
- Remove `datepicker` for task instance detail view (#15284)
- Turn provider's import warnings into debug logs (#14903)
...
Apache Airflow v2.0.2
Bug Fixes
- Bugfix: `TypeError` when Serializing & sorting iterable properties of DAGs (#15395)
- Fix missing `on_load` trigger for folder-based plugins (#15208)
- `kubernetes cleanup-pods` subcommand will only clean up Airflow-created Pods (#15204)
- Fix password masking in CLI action_logging (#15143)
- Fix url generation for TriggerDagRunOperatorLink (#14990)
- Restore base lineage backend (#14146)
- Unable to trigger backfill or manual jobs with Kubernetes executor. (#14160)
- Bugfix: Task docs are not shown in the Task Instance Detail View (#15191)
- Bugfix: Fix overriding `pod_template_file` in KubernetesExecutor (#15197)
- Bugfix: resources in `executor_config` breaks Graph View in UI (#15199)
- Fix celery executor bug trying to call len on map (#14883)
- Fix bug in airflow.stats timing that broke dogstatsd mode (#15132)
- Avoid scheduler/parser manager deadlock by using non-blocking IO (#15112)
- Re-introduce `dagrun.schedule_delay` metric (#15105)
- Compare string values, not if strings are the same object in Kube executor (#14942)
- Pass queue to BaseExecutor.execute_async like in airflow 1.10 (#14861)
- Scheduler: Remove TIs from starved pools from the critical path. (#14476)
- Remove extra/needless deprecation warnings from airflow.contrib module (#15065)
- Fix support for long dag_id and task_id in KubernetesExecutor (#14703)
- Sort lists, sets and tuples in Serialized DAGs (#14909)
- Simplify cleaning string passed to origin param (#14738) (#14905)
- Fix error when running tasks with Sentry integration enabled. (#13929)
- Webserver: Sanitize string passed to origin param (#14738)
- Fix losing duration < 1 secs in tree (#13537)
- Pin SQLAlchemy to <1.4 due to breakage of sqlalchemy-utils (#14812)
- Fix KubernetesExecutor issue with deleted pending pods (#14810)
- Default to Celery Task model when backend model does not exist (#14612)
- Bugfix: Plugins endpoint was unauthenticated (#14570)
- BugFix: fix DAG doc display (especially for TaskFlow DAGs) (#14564)
- BugFix: TypeError in airflow.kubernetes.pod_launcher's monitor_pod (#14513)
- Bugfix: Fix wrong output of tags and owners in dag detail API endpoint (#14490)
- Fix logging error with task error when JSON logging is enabled (#14456)
- Fix statsd metrics not sending when using daemon mode (#14454)
- Gracefully handle missing start_date and end_date for DagRun (#14452)
- BugFix: Serialize max_retry_delay as a timedelta (#14436)
- Fix crash when user clicks on "Task Instance Details" caused by start_date being None (#14416)
- BugFix: Fix TaskInstance API call fails if a task is removed from running DAG (#14381)
- Scheduler should not fail when invalid `executor_config` is passed (#14323)
- Fix bug allowing task instances to survive when dagrun_timeout is exceeded (#14321)
- Fix bug where DAG timezone was not always shown correctly in UI tooltips (#14204)
- Use `Lax` for `cookie_samesite` when empty string is passed (#14183)
- [AIRFLOW-6076] fix `dag.cli()` KeyError (#13647)
- Fix running child tasks in a subdag after clearing a successful subdag (#14776)
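
Two entries in this list touch on sorting iterables during serialization (#14909) and a `TypeError` raised while doing so (#15395). The general issue can be sketched as follows: sets have no stable order, and Python 3 cannot sort mixed types directly. The `repr` fallback here is an assumption made for illustration, not necessarily what Airflow does, and `stable_list` is a hypothetical helper.

```python
import json
from typing import Any, Iterable, List


def stable_list(items: Iterable[Any]) -> List[Any]:
    """Return ``items`` as a deterministically ordered list."""
    items = list(items)
    try:
        return sorted(items)
    except TypeError:
        # Mixed, non-comparable types (e.g. str and int): order by repr instead.
        return sorted(items, key=repr)


if __name__ == "__main__":
    # The same set always serializes to the same JSON text.
    print(json.dumps(stable_list({"b", "a", "c"})))
```

A deterministic order means two serializations of an unchanged object compare equal, which is presumably why the serialized form is sorted.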
Improvements
- Remove unused JS packages causing false security alerts (#15383)
- Change default of `[kubernetes] enable_tcp_keepalive` for new installs to `True` (#15338)
- Fixed #14270: Add error message in OOM situations (#15207)
- Better compatibility/diagnostics for arbitrary UID in docker image (#15162)
- Updates 3.6 limits for latest versions of a few libraries (#15209)
- Adds Blinker dependency which is missing after recent changes (#15182)
- Remove 'conf' from search_columns in DagRun View (#15099)
- More proper default value for namespace in K8S cleanup-pods CLI (#15060)
- Faster default role syncing during webserver start (#15017)
- Speed up webserver start when there are many DAGs (#14993)
- Much easier to use and better documented Docker image (#14911)
- Use `libyaml` C library when available. (#14577)
- Don't create unittest.cfg when not running in unit test mode (#14420)
- Webserver: Allow Filtering TaskInstances by queued_dttm (#14708)
- Update Flask-AppBuilder dependency to allow 3.2 (and all 3.x series) (#14665)
- Remember expanded task groups in browser local storage (#14661)
- Add plain format output to cli tables (#14546)
- Make `airflow dags show` command display TaskGroups (#14269)
- Increase maximum size of `extra` connection field. (#12944)
- Speed up clear_task_instances by doing a single sql delete for TaskReschedule (#14048)
- Add more flexibility with FAB menu links (#13903)
- Add better description and guidance in case of sqlite version mismatch (#14209)
Doc only changes
- Add documentation create/update community providers (#15061)
- Fix mistake and typos in airflow.utils.timezone docstrings (#15180)
- Replace new url for Stable Airflow Docs (#15169)
- Docs: Clarify behavior of delete_worker_pods_on_failure (#14958)
- Create a documentation package for Docker image (#14846)
- Multiple minor doc (OpenAPI) fixes (#14917)
- Replace Graph View Screenshot to show Auto-refresh (#14571)
Misc/Internal
Apache Airflow 1.10.15, 2021-03-17
Bug Fixes
- Fix `airflow db upgrade` to upgrade db as intended (#13267)
- Moved boto3 limitation to snowflake (#13286)
- `KubernetesExecutor` should accept images from `executor_config` (#13074)
- Scheduler should acknowledge active runs properly (#13803)
- Bugfix: Unable to import Airflow plugins on Python 3.8 (#12859)
- Include `airflow/contrib/executors` in the dist package
- Pin Click version for Python 2.7 users
- Ensure all statsd timers use millisecond values. (#10633)
- [`kubernetes_generate_dag_yaml`] - Fix dag yaml generate function (#13816)
- Fix `airflow tasks clear` cli command with `--yes` (#14188)
- Fix permission error on non-POSIX filesystem (#13121) (#14383)
- Fixed deprecation message for "variables" command (#14457)
- BugFix: fix the `delete_dag` function of json_client (#14441)
- Fix merging of secrets and configmaps for `KubernetesExecutor` (#14090)
- Fix webserver exiting when gunicorn master crashes (#13470)
- Bump ini from 1.3.5 to 1.3.8 in `airflow/www_rbac`
- Bump datatables.net from 1.10.21 to 1.10.23 in `airflow/www_rbac`
- Webserver: Sanitize string passed to origin param (#14738)
- Make `rbac_app`'s `db.session` use the same timezone with `@provide_session` (#14025)
Improvements
- Adds airflow as viable docker command in official image (#12878)
- `StreamLogWriter`: Provide (no-op) close method (#10885)
- Add 'airflow variables list' command for 1.10.x transition version (#14462)
Doc only changes
Airflow 2.0.1, 2021-02-08
Bug Fixes
- Bugfix: Return XCom Value in the XCom Endpoint API (#13684)
- Bugfix: Import error when using custom backend and `sql_alchemy_conn_secret` (#13260)
- Allow PID file path to be relative when daemonize a process (scheduler, kerberos, etc) (#13232)
- Bugfix: no generic `DROP CONSTRAINT` in MySQL during `airflow db upgrade` (#13239)
- Bugfix: Sync Access Control defined in DAGs when running `sync-perm` (#13377)
- Stop sending Callback Requests if no callbacks are defined on DAG (#13163)
- BugFix: Dag-level Callback Requests were not run (#13651)
- Stop creating duplicate Dag File Processors (#13662)
- Filter DagRuns with Task Instances in removed State while Scheduling (#13165)
- Bump `datatables.net` from 1.10.21 to 1.10.22 in /airflow/www (#13143)
- Bump `datatables.net` JS to 1.10.23 (#13253)
- Bump `dompurify` from 2.0.12 to 2.2.6 in /airflow/www (#13164)
- Update minimum `cattrs` version (#13223)
- Remove inapplicable arg 'output' for CLI pools import/export (#13071)
- Webserver: Fix the behavior to deactivate the authentication option and add docs (#13191)
- Fix: add support for no-menu plugin views (#11742)
- Add `python-daemon` limit for python 3.8+ to fix daemon crash (#13540)
- Change the default celery `worker_concurrency` to 16 (#13612)
- Audit Log records View should not contain link if `dag_id` is None (#13619)
- Fix invalid `continue_token` for cleanup list pods (#13563)
- Switches to latest version of snowflake connector (#13654)
- Fix backfill crash on task retry or reschedule (#13712)
- Setting
max_tis_per_query
to0
now correctly removes the limit (#13512) - Fix race conditions in task callback invocations (#10917)
- Fix webserver exiting when gunicorn master crashes (#13518)(#13780)
- Fix SQL syntax to check duplicate connections (#13783)
- BaseBranchOperator will push to xcom by default (#13704) (#13763)
- Fix Deprecation for configuration.getsection (#13804)
- Fix TaskNotFound in log endpoint (#13872)
- Fix race condition when using Dynamic DAGs (#13893)
- Fix: Linux/Chrome window bouncing in Webserver
- Fix db shell for sqlite (#13907)
- Only compare updated time when Serialized DAG exists (#13899)
- Fix dag run type enum query for mysqldb driver (#13278)
- Add authentication to lineage endpoint for experimental API (#13870)
- Do not add User role perms to custom roles. (#13856)
- Do not add Website.can_read access to default roles. (#13923)
- Fix invalid value error caused by long Kubernetes pod name (#13299)
- Fix DB Migration for SQLite to upgrade to 2.0 (#13921)
- Bugfix: Manual DagRun trigger should not skip scheduled runs (#13963)
- Stop loading Extra Operator links in Scheduler (#13932)
- Added missing return parameter in read function of FileTaskHandler (#14001)
- Bugfix: Do not try to create a duplicate Dag Run in Scheduler (#13920)
- Make v1/config endpoint respect webserver expose_config setting (#14020)
- Disable row level locking for Mariadb and MySQL <8 (#14031)
- Bugfix: Fix permissions to triggering only specific DAGs (#13922)
- Fix broken SLA Mechanism (#14056)
- Bugfix: Scheduler fails if task is removed at runtime (#14057)
- Remove permissions to read Configurations for User and Viewer roles (#14067)
- Fix DB Migration from 2.0.1rc1
Improvements
- Increase the default min_file_process_interval to decrease CPU Usage (#13664)
- Dispose connections when running tasks with os.fork & CeleryExecutor (#13265)
- Make function purpose clearer in example_kubernetes_executor example dag (#13216)
- Remove unused libraries - flask-swagger, funcsigs (#13178)
- Display alternative tooltip when a Task has yet to run (no TI) (#13162)
- Use werkzeug's own type conversion for request args (#13184)
- UI: Add queued_by_job_id & external_executor_id Columns to TI View (#13266)
- Make json-merge-patch an optional library and unpin it (#13175)
- Adds missing LDAP "extra" dependencies to ldap provider. (#13308)
- Refactor setup.py to better reflect changes in providers (#13314)
- Pin pyjwt and Add integration tests for Apache Pinot (#13195)
- Removes provider-imposed requirements from setup.cfg (#13409)
- Replace deprecated decorator (#13443)
- Streamline & simplify __eq__ methods in models Dag and BaseOperator (#13449)
- Additional properties should be allowed in provider schema (#13440)
- Remove unused dependency - contextdecorator (#13455)
- Remove 'typing' dependency (#13472)
- Log migrations info in consistent way (#13458)
- Unpin mysql-connector-python to allow 8.0.22 (#13370)
- Remove thrift as a core dependency (#13471)
- Add NotFound response for DELETE methods in OpenAPI YAML (#13550)
- Stop Log Spamming when [core] lazy_load_plugins is False (#13578)
- Display message and docs link when no plugins are loaded (#13599)
- Unpin restriction for colorlog dependency (#13176)
- Add missing Dag Tag for Example DAGs (#13665)
- Support tables in DAG docs (#13533)
- Add python3-openid dependency (#13714)
- Add __repr__ for Executors (#13753)
- Add description to hint if conn_type is missing (#13778)
- Upgrade azure blob to v12 (#12188)
- Add extra field to get_connnection REST endpoint (#13885)
- Make Smart Sensors DB Migration idempotent (#13892)
- Improve the error when DAG does not exist when running dag pause command (#13900)
- Update airflow_local_settings.py to fix an error message (#13927)
- Only allow passing JSON Serializable conf to TriggerDagRunOperator (#13964)
- Bugfix: Allow getting details of a DAG with null start_date (REST API) (#13959)
- Add params to the DAG details endpoint (#13790)
- Make the role assigned to anonymous users customizable (#14042)
- Retry critical methods in Scheduler loop in case of OperationalError (#14032)
Doc only changes
- Add Missing Statsd Metrics in Docs (#13708)
- Add Missing Email configs in Configuration doc (#13709)
- Add quick start for Airflow on Docker (#13660)
- Describe which Python versions are supported (#13259)
- Add note block to 2.x migration docs (#13094)
- Add documentation about webserver_config.py (#13155)
- Add missing version information to recently added configs (#13161)
- API: Use generic information in UpdateMask component (#13146)
- Add Airflow 2.0.0 to requirements table (#13140)
- Avoid confusion in doc for CeleryKubernetesExecutor (#13116)
- Update docs link in REST API spec (#13107)
- Add link to PyPI Repository to provider docs (#13064)
- Fix link to Airflow master branch documentation (#13179)
- Minor enhancements to Sensors docs (#13381)
- Use 2.0.0 in Airflow docs & Breeze (#13379)
- Improves documentation regarding providers and custom connections (#13375)(#13410)
- Fix malformed table in production-deployment.rst (#13395)
- Update celery.rst to fix broken links (#13400)
- Remove reference to scheduler run_duration param in docs (#13346)
- Set minimum SQLite version supported (#13412)
- Fix installation doc (#13462)
- Add docs about mocking variables and connections (#13502)
- Add docs about Flask CLI (#13500)
- Fix Upgrading to 2 guide to use rbac UI (#13569)
- Make docs clear that Auth can not be disabled for Stable API (#13568)
- Remove archived links from docs & add link for AIPs (#13580)
- Minor fixes in upgrading-to-2.rst (#13583)
- Fix Link in Upgrading to 2.0 guide (#13584)
- Fix heading for Mocking section in best-practices.rst (#13658)
- Add docs on how to use custom operators within plugins folder (#13186)
- Update docs to register Operator Extra Links (#13683)
- Improvements for database setup docs (#13696)
- Replace module path to Class with just Class Name (#13719)
- Update DAG Serialization docs (#13722)
- Fix link to Apache Airflow docs in webserver (#13250)
- Clarifies differences between extras and provider packages (#13810)
- Add information about all access methods to the environment (#13940)
- Docs: Fix FAQ on scheduler latency (#13969)
- Updated taskflow api doc to show dependency with sensor (#13968)
- Add deprecated config options to docs (#13883)
- Added a FAQ section to the Upgrading to 2 doc (#13979)
Airflow 2.0.0, 2020-12-17
The full changelog is about 3,000 lines long (already excluding everything backported to 1.10), so for now I’ll simply share some of the major features in 2.0.0 compared to 1.10.14:
A new way of writing dags: the TaskFlow API (AIP-31)
(Known in 2.0.0alphas as Functional DAGs.)
DAGs are now much nicer to author, especially when using PythonOperator. Dependencies are handled more clearly and XCom is nicer to use.
A quick teaser of what DAGs can now look like:
from airflow.decorators import dag, task
from airflow.utils.dates import days_ago


@dag(default_args={'owner': 'airflow'}, schedule_interval=None, start_date=days_ago(2))
def tutorial_taskflow_api_etl():
    @task
    def extract():
        return {"1001": 301.27, "1002": 433.21, "1003": 502.22}

    @task
    def transform(order_data_dict: dict) -> dict:
        total_order_value = 0
        for value in order_data_dict.values():
            total_order_value += value
        return {"total_order_value": total_order_value}

    @task()
    def load(total_order_value: float):
        print("Total order value is: %.2f" % total_order_value)

    order_data = extract()
    order_summary = transform(order_data)
    load(order_summary["total_order_value"])


tutorial_etl_dag = tutorial_taskflow_api_etl()
Fully specified REST API (AIP-32)
We now have a fully supported, no-longer-experimental API with a comprehensive OpenAPI specification. Read more in the REST API Documentation.
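As a rough illustration of what calling the stable API can look like, the sketch below builds (but does not send) a request that lists DAGs via `/api/v1/dags`. The base URL, the `admin`/`admin` credentials, and the helper name `build_list_dags_request` are assumptions for the example. It also assumes the webserver has been configured with the basic-auth API backend, which is not the default.

```python
import base64
import urllib.request


def build_list_dags_request(base_url: str, username: str, password: str,
                            limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) a GET request for the stable API's DAG list."""
    # HTTP basic auth: base64 of "user:password" in the Authorization header.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    url = f"{base_url.rstrip('/')}/api/v1/dags?limit={limit}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Hypothetical local deployment and credentials, for illustration only.
request = build_list_dags_request("http://localhost:8080", "admin", "admin")
print(request.get_method(), request.full_url)

# To send it against a running webserver:
# import json
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["dags"])
```

Keeping request construction separate from sending makes the URL and headers easy to inspect before pointing the code at a real deployment.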
Massive Scheduler performance improvements
As part of AIP-15 (Scheduler HA+performance) and other work Kamil did, we significantly improved the performance of the Airflow Scheduler. It now starts tasks much, MUCH quicker.
Over at Astronomer.io we’ve benchmarked the scheduler, and it’s fast (we had to triple-check the numbers as we didn’t quite believe them at first!).
Scheduler is now HA compatible (AIP-15)
It’s now possible and supported to run more than a single scheduler instance. This is super useful for both resiliency (in case a scheduler goes down) and scheduling performance.
To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL 5 and MariaDB won’t work with more than one scheduler, I’m afraid).
There’s no config or other set up required to run more than one scheduler—just start up a scheduler somewhere else (ensuring it has access to the DAG files) and it will cooperate with your existing schedulers through the database.
For more information, read the Scheduler HA documentation.
Task Groups (AIP-34)
SubDAGs were commonly used for grouping tasks in the UI, but they had many drawbacks in their execution behaviour (primarily that they only executed a single task in parallel!). To improve this experience, we’ve introduced “Task Groups”: a method for organizing tasks which provides the same grouping behaviour as a SubDAG without any of the execution-time drawbacks.
SubDAGs will still work for now, but we think that any previous use of SubDAGs can now be replaced with Task Groups. If you find an example where this isn’t the case, please let us know by opening an issue on GitHub.
For more information, check out the Task Group documentation.
Refreshed UI
We’ve given the Airflow UI a visual refresh and updated some of the styling. Check out the UI section of the docs for screenshots.
We have also added an option to auto-refresh task states in Graph View so you no longer need to continuously press the refresh button :).
Smart Sensors for reduced load from sensors (AIP-17)
If you make heavy use of sensors in your Airflow cluster, you might find that sensor execution takes up a significant proportion of your cluster even with “reschedule” mode. To improve this, we’ve added a new mode called “Smart Sensors”.
This feature is in “early-access”: it’s been well-tested by AirBnB and is “stable”/usable, but we reserve the right to make backwards-incompatible changes to it in a future release (if we have to. We’ll try very hard not to!)
Simplified KubernetesExecutor
For Airflow 2.0, we have re-architected the KubernetesExecutor in a fashion that is simultaneously faster, easier to understand, and more flexible for Airflow users. Users will now be able to access the full Kubernetes API to create a .yaml pod_template_file instead of specifying parameters in their airflow.cfg.
We have also replaced the executor_config dictionary with the pod_override parameter, which takes a Kubernetes V1Pod object for a 1:1 setting override. These changes have removed over three thousand lines of code from the KubernetesExecutor, which makes it run faster and creates fewer potential errors.
Airflow core and providers: Splitting Airflow into 60+ packages
Airflow 2.0 is not a monolithic “one to rule them all” package. We’ve split Airflow into core and 61 (for now) provider packages. Each provider package is for either a particular external service (Google, Amazon, Microsoft, Snowflake), a database (Postgres, MySQL), or a protocol (HTTP/FTP). Now you can create a custom Airflow installation from “building” blocks and choose only what you need, plus add whatever other requirements you might have. Some of the common providers are installed automatically (ftp, http, imap, sqlite) as they are commonly used. Other providers are automatically installed when you choose appropriate extras when installing Airflow.
The provider architecture should make it much easier to get a fully customized, yet consistent runtime with the right set of Python dependencies.
But that’s not all: you can write your own custom providers and add things like custom connection types, customizations of the Connection Forms, and extra links to your operators in a manageable way. You can build your own provider and install it as a Python package and have your customizations visible right in the Airflow UI.
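To see which provider packages a given environment actually ended up with, one option is to inspect the installed distributions. The sketch below relies only on the `apache-airflow-providers-` naming prefix used by the provider packages; the helper name `installed_providers` is an assumption for the example, and the result is simply empty in an environment with no providers installed.

```python
from importlib import metadata

# Provider distributions are published as apache-airflow-providers-<name>.
PROVIDER_PREFIX = "apache-airflow-providers-"


def installed_providers() -> list:
    """Return sorted (name, version) pairs for installed provider distributions."""
    found = set()
    for dist in metadata.distributions():
        # Normalise case and underscores so differently-built wheels compare equal.
        name = (dist.metadata["Name"] or "").lower().replace("_", "-")
        if name.startswith(PROVIDER_PREFIX):
            found.add((name, dist.version or ""))
    return sorted(found)


for name, version in installed_providers():
    print(f"{name}=={version}")
```

The output is in `name==version` form, so it can be compared against a requirements or constraints file.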
Security
As part of the Airflow 2.0 effort, there has been a conscious focus on security and on reducing areas of exposure. This shows up in different forms across different functional areas. For example, in the new REST API, all operations now require authorization. Similarly, in the configuration settings, the Fernet key is now required to be specified.
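Since a Fernet key has to be supplied, here is a minimal sketch of generating one with the standard library only. A Fernet key is the URL-safe base64 encoding of 32 random bytes, the same format that `Fernet.generate_key()` in the `cryptography` package produces. The helper name `generate_fernet_key` is an assumption for the example, and the printed `export` line assumes the key is supplied through the `AIRFLOW__CORE__FERNET_KEY` environment variable rather than in airflow.cfg.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return a URL-safe base64 encoding of 32 random bytes (the Fernet key format)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


key = generate_fernet_key()
# Airflow reads config overrides from AIRFLOW__{SECTION}__{KEY} environment variables.
print(f"export AIRFLOW__CORE__FERNET_KEY={key}")
```

Whichever way the key is generated, it should be treated as a secret and kept identical across all components that need to decrypt connections and variables.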
Configuration
Configuration in the form of the airflow.cfg file has been rationalized further into distinct sections, specifically around “core”. Additionally, a significant number of configuration options have been deprecated or moved to individual component-specific configuration files, such as the pod-template-file for Kubernetes execution-related configuration.
We’ve tried to make as few breaking changes as possible and to provide a deprecation path in the code, especially in the case of anything called in the DAG. That said, please read through UPDATING.md to check what might affect you. For example, we re-organized the layout of operators (they now all live under airflow.providers.*), but the old names should continue to work; you’ll just notice a lot of DeprecationWarnings that need to be fixed up.
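Python's default warning filters hide most DeprecationWarnings, so it can help to surface them explicitly while migrating. The sketch below is not Airflow's code: `old_operator_factory` and `new_operator_factory` are hypothetical stand-ins for a legacy name and its new location, used only to show the general shim pattern and one way to collect the resulting warnings.

```python
import warnings


def new_operator_factory():
    """Stand-in for something at its new location (hypothetical)."""
    return "operator"


def old_operator_factory():
    """Stand-in for the legacy name: warns, then delegates to the new location."""
    warnings.warn(
        "old_operator_factory is deprecated; use new_operator_factory instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not the shim
    )
    return new_operator_factory()


# Record warnings instead of letting the default filters hide them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    result = old_operator_factory()

deprecations = [
    str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
]
print(result, deprecations)
```

For a whole DAG file, running the interpreter with `-W error::DeprecationWarning` turns each such warning into an exception, which makes the remaining old names easy to find.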
Airflow 1.10.14, 2020-12-10
Bug Fixes
- BugFix: Tasks with depends_on_past or task_concurrency are stuck (#12663)
- Fix issue with empty Resources in executor_config (#12633)
- Fix: Deprecated config force_log_out_after was not used (#12661)
- Fix empty asctime field in JSON formatted logs (#10515)
- [AIRFLOW-2809] Fix security issue regarding Flask SECRET_KEY (#3651)
- [AIRFLOW-2884] Fix Flask SECRET_KEY security issue in www_rbac (#3729)
- [AIRFLOW-2886] Generate random Flask SECRET_KEY in default config (#3738)
- Add missing comma in setup.py (#12790)
- Bugfix: Unable to import Airflow plugins on Python 3.8 (#12859)
- Fix setup.py missing comma in setup_requires (#12880)
- Don't emit first_task_scheduling_delay metric for only-once dags (#12835)
Improvements
- Update setup.py to get non-conflicting set of dependencies (#12636)
- Rename [scheduler] max_threads to [scheduler] parsing_processes (#12605)
- Add metric for scheduling delay between first run task & expected start time (#9544)
- Add new-style 2.0 command names for Airflow 1.10.x (#12725)
- Add Kubernetes cleanup-pods CLI command for Helm Chart (#11802)
- Don't let webserver run with dangerous config (#12747)
- Replace pkg_resources with importlib.metadata to avoid VersionConflict errors (#12694)
Doc only changes
- Clarified information about supported Databases