-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Add DAG importer abstraction layer #60127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
cc @IKholopov |
0fa63e5 to
925bf9e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a pluggable DAG importer architecture that enables support for multiple DAG file formats (e.g., YAML) while maintaining backwards compatibility with existing Python DAG files. The implementation follows the AIP-85 Extendable DAG parsing controls proposal.
Key Changes:
- Introduces an abstract
AbstractDagImporterbase class andDagImporterRegistryfor managing importers - Refactors Python DAG import logic into a dedicated
PythonDagImporterclass - Modifies
DagBag.process_file()to use the new importer registry instead of directly importing modules
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
airflow-core/src/airflow/dag_processing/importers/__init__.py |
Exports public API for the new importers module |
airflow-core/src/airflow/dag_processing/importers/base.py |
Defines abstract importer interface, registry singleton, and result/error data structures |
airflow-core/src/airflow/dag_processing/importers/python_importer.py |
Implements Python DAG importer by extracting and refactoring logic from DagBag |
airflow-core/src/airflow/dag_processing/dagbag.py |
Refactored to use the new importer registry pattern instead of direct module loading |
airflow-core/tests/unit/dag_processing/importers/__init__.py |
Test package initialization file |
airflow-core/tests/unit/dag_processing/importers/test_registry.py |
Comprehensive tests for the importer registry functionality |
airflow-core/tests/unit/dag_processing/test_dagbag.py |
Updated mock paths to reflect refactored code structure |
airflow-core/.pre-commit-config.yaml |
Updated pre-commit exclusion patterns for new importer files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
airflow-core/src/airflow/dag_processing/importers/python_importer.py
Outdated
Show resolved
Hide resolved
55d10bd to
6dee754
Compare
ikholopov-omni
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the point of AIP-85 - LGTM as long as we will have the assumption "at most 1 importer per file extension" well documented and handled.
One deviation from AIP-85 is that filepaths are still handled by DagBag (and list_py_file_paths only lists py/zip paths). Do we have a definitive plan on how we want to handle it? Should it be handled by importer (somewhat makes sense, as it knows what extensions to look for) as was proposed in AIP-85 as well or by something else?
082d59c to
a8fc43c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
5c760ef to
7a490bf
Compare
Introduce a pluggable importer architecture for DAG files: - Add AbstractDagImporter base class and DagImporterRegistry - Add PythonDagImporter for .py and .zip files - Refactor DagBag.process_file() to use importer registry - Add DagImportResult/Error/Warning dataclasses for structured results This enables future support for non-Python DAG formats (YAML, etc.) while maintaining full backwards compatibility with existing behavior. All existing DagBag tests pass without modification.
b86cd92 to
e1d8848
Compare
ikholopov-omni
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this!
Introduces a pluggable importer architecture for DAG files. This enables future support for non-Python DAG formats (YAML, etc.) while maintaining full backwards compatibility with existing behavior.
This should make it easier for AIP-85 Extendable DAG parsing controls. So not documenting it purposely until then.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in airflow-core/newsfragments.