Description
Currently, file finding/reading is tightly coupled with DAG building in load_yaml_dags. I propose looser coupling, or a hook to allow customizing the DAG via Python code instead of directly translating the YAML.
DagFactory.__init__ can take either a YAML file path or a Python dictionary. However, load_yaml_dags only allows passing file paths to DagFactory. It'd be nice to have a way to hook into it to pre-process the YAML file and pass a modified dict to DagFactory.
Basically, I see 3 parts here:
- given a list of directories and a recursive flag, find the YAML files
- read and parse each file into a Python dict
- pass the dict to DagFactory to build the DAGs
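A minimal sketch of that decoupling, assuming DagFactory(config=...) and generate_dags() behave as described above; load_yaml_dags_with_hook, find_yaml_files, and the preprocess parameter are hypothetical names, not existing dag-factory API:

```python
from pathlib import Path
from typing import Any, Callable, Dict, List

import yaml
from dagfactory import DagFactory


def find_yaml_files(dags_folder: str, recursive: bool = True) -> List[Path]:
    # Part 1: given a directory and a recursive flag, find candidate files.
    pattern = "**/*" if recursive else "*"
    return [p for p in Path(dags_folder).glob(pattern) if p.suffix in (".yaml", ".yml")]


def load_yaml_dags_with_hook(
    globals_dict: Dict[str, Any],
    dags_folder: str,
    preprocess: Callable[[Dict[str, Any]], Dict[str, Any]] = lambda cfg: cfg,
    recursive: bool = True,
) -> None:
    for path in find_yaml_files(dags_folder, recursive):
        # Part 2: read and parse the YAML into a plain Python dict.
        with open(path) as f:
            config = yaml.safe_load(f)
        # The hook lets callers rewrite the dict before dag-factory sees it.
        config = preprocess(config)
        # Part 3: hand the dict to DagFactory, which (as noted above)
        # already accepts a dictionary, and build the DAGs.
        factory = DagFactory(config=config)
        factory.generate_dags(globals_dict)
```

With the default identity hook this behaves like a plain loader; the use case below shows a hook that actually rewrites the config.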
Use case/motivation
I'd like to be able to tweak the "YAML DAG DSL" a bit for my application, instead of directly translating to Airflow DAG semantics. The goal is to make it easier and less verbose for non-technical users, and to add certain features on top of base Airflow semantics without writing new operators and requiring users to understand them.
Basically, I only allow BashOperator (may eventually swap to DockerOperator), and I treat dag-factory as analogous to a Makefile. I'd like to minimize the boilerplate required, as well as add additional task parameters (e.g. to specify which Bash runtime environment to use: setting PATH/PYTHONPATH, etc.).
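As a sketch of the kind of hook this implies, assuming dag-factory's usual layout where each DAG config holds a tasks mapping: every task is forced to be a BashOperator, and an invented runtime_env key is translated into BashOperator's real env parameter. The environment names and paths here are made up for illustration:

```python
from typing import Any, Dict

# Invented runtime environments for illustration only.
RUNTIME_ENVS = {
    "py311": {"PATH": "/opt/envs/py311/bin:/usr/bin", "PYTHONPATH": "/opt/envs/py311"},
}


def expand_bash_dsl(config: Dict[str, Any]) -> Dict[str, Any]:
    for dag_config in config.values():
        if not isinstance(dag_config, dict):
            continue
        for task in dag_config.get("tasks", {}).values():
            # Users never spell out an operator; BashOperator is implied.
            task.setdefault("operator", "airflow.operators.bash.BashOperator")
            # Translate the custom runtime_env key into BashOperator's
            # existing env argument (a dict of environment variables).
            env_name = task.pop("runtime_env", None)
            if env_name is not None:
                task["env"] = RUNTIME_ENVS[env_name]
    return config
```

Wired into the loader sketched above, this would be load_yaml_dags_with_hook(globals(), dags_folder, preprocess=expand_bash_dsl).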
Hi, @wearpants. You have some great ideas for improving the DAG factory. Would you like to have a call to discuss them next week? Thirty minutes may be good enough to brainstorm.
To follow up from our call: the kind of DSL I am imagining would allow me to abstract away/hide Airflow-specific constructs. dag-factory feels like it could be the basis for a generic DAG specification portable between different orchestrators, or even transpiled to a Makefile or casey/just. For my purposes, the less my semi-technical users need to understand Airflow, the better.
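A toy illustration of the transpilation idea, reusing the BashOperator-only config shape from the sketches above (to_makefile is a hypothetical helper, not anything in dag-factory):

```python
from typing import Any, Dict


def to_makefile(dag_config: Dict[str, Any]) -> str:
    # Emit one Make rule per task: dag-factory's "dependencies" list maps
    # naturally onto Make prerequisites, and bash_command onto the recipe.
    rules = []
    for name, task in dag_config.get("tasks", {}).items():
        deps = " ".join(task.get("dependencies", []))
        rules.append(f"{name}: {deps}\n\t{task['bash_command']}")
    return "\n\n".join(rules)
```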
Description
Currently, file finding/reading is tightly coupled with DAG building in
load_yaml_dags
. I propose looser coupling or a hook to allow customizing the DAG via Python code, instead of directly translating the yaml.DagFactory.__init__
can take either a yaml file path or a python dictionary. However,load_yaml_dags
only allows passing file paths toDagFactory
. It'd be nice to have a way to hook into it to pre-process the yaml file and pass a modified dict toDagFactory
.Basically, I see 3 parts here:
DagFactory
to build DAGsUse case/motivation
I'd like to be able to tweak the "yaml DAG DSL" a bit for my application, instead of directly translating to Airflow DAG semantics - goal is to make it easier/less verbose for non-technical users & add certain features on top of base Airflow semantics without writing new operators & requiring users to understand them.
Basically, I only allow
BashOperator
(may eventually swap toDockerOperator
) and I treatdag-factory
as analogous to a Makefile, and I'd like to minimize the boilerplate required, as well as add additional task parameters (to specify which Bash runtime environment to use: ie, setting PATH/PYTHONPATH, etc).Related issues
#289, #290, #297
Are you willing to submit a PR?