airflow db upgrade hangs against aws #27919
Replies: 9 comments
-
It looks like the upgrade failed also, so I didn't need db upgrade: [airflow@ ~]$ /data01/airflow/.pyenv/versions/3.9.12/bin/pip install 'apache-airflow[odbc]' --constraint file:///data01/airflow/airflow/constraints.txt [airflow@ ~]$ airflow version And its still 2.4.2, it seems. I succeeded in upgrading dev environment to 2.4.3 using same method a week ago. |
Beta Was this translation helpful? Give feedback.
-
I managed to upgrade to 2.4.3 by modifying constraint on apache-airflow-providers-common-sql from 1.2.0 to 1.3.0 and adding ==2.4.3 to the pip command, and after that the db upgrade also worked as intended |
Beta Was this translation helpful? Give feedback.
-
@Frank-T-Johansen That is a bit strange. |
Beta Was this translation helpful? Give feedback.
-
I guess someone should check if that gets called regardless of being configured or not. Or maybe there is no such local config file, and the test to see if it is configured is to talk to AWS SSM backend |
Beta Was this translation helpful? Give feedback.
-
This is a part of unit tests which as far as I know pass successfully. From your side you might check your airflow configuration by
|
Beta Was this translation helpful? Give feedback.
-
That gives no output, I guess because its not configured |
Beta Was this translation helpful? Give feedback.
-
Unfortunetly I've unable to reproduce this. version: '3'
networks:
no-internet-access:
driver: bridge
internal: true
volumes:
postgres-db-volume:
services:
postgres:
image: postgres:13
environment:
POSTGRES_USER: airflow
POSTGRES_PASSWORD: insecurepassword
POSTGRES_DB: airflow
volumes:
- postgres-db-volume:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "airflow", "-d", "airflow"]
interval: 5s
retries: 5
restart: on-failure
networks:
- no-internet-access
airflow-upgrade:
image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.4.3}
environment:
CONNECTION_CHECK_MAX_COUNT: "0"
AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:insecurepassword@postgres/airflow
# AIRFLOW__SECRETS__BACKEND: 'airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend'
# AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_SECRET: boom
command: ["db", "upgrade"]
networks:
- no-internet-access
depends_on:
postgres:
condition: service_healthy
airflow-version:
image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.4.3}
environment:
AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:insecurepassword@postgres/airflow
command: version
networks:
- no-internet-access
depends_on:
airflow-upgrade:
condition: service_completed_successfully |
Beta Was this translation helpful? Give feedback.
-
I guess its near impossible to reproduce, as it went away when upgrade completed. It might have happened due to a state where it had patched other packages to what constraint demanded out of 2.4.3 while still having 2.4.2, so basically a half-upgraded state. |
Beta Was this translation helpful? Give feedback.
-
Converting to discussion. What You can raise the issue to Interesting what constraints you have - but they do not seem to be the constraints of airlfow - they are your local modified version so I guess some conflicts there could have caused it (but if you install all packages there a few optional extras that trigger similar warnings - but if I think SSM might be some internal mechanism of boto library for example (but might also be one of other dependencies you added or your custom code in plugins or local settings, there are plenty of possibilities. Any package of yours or ours can perform any intiialisation it wants in As @Taragolis mentioned - airflow does not reach out to SSM if not told so, but any of the dependent libraries (or your libraries, or your code) could do it. It can be possibly triggered by some environment variables of yours so one way you could check it is to clean all the variables related to AWS and try again. If you run Python packages without configuration for AWS and without credentials, there is little probability any of them wil try to connect. I suggest you experiment with vars and let us know here if you find anything. |
Beta Was this translation helpful? Give feedback.
-
Apache Airflow version
2.4.3
What happened
I tried to run "airflow db upgrade" after ugprading airflow from 2.4.2 to 2.4.3, but the command just hung.
When I did strace it was trying to connect to AWS ssm, which is odd as my airflow installation is on a local server.
[root@ ~]# ps -ef|grep "db upgrade"
airflow 819024 819023 0 09:43 pts/1 00:00:02 /data01/airflow/.pyenv/versions/3.9.12/bin/python3.9 /data01/airflow/.pyenv/versions/3.9.12/bin/airflow db upgrade
root@ ~]# strace -f -p 819024
strace: Process 819024 attached with 8 threads
[...]
[pid 819024] connect(6, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("52.95.123.56")}, 16) = -1 EINPROGRESS (Operation now in progress)
[airflow@ ~]$ host ssm.eu-west-1.amazonaws.com
ssm.eu-west-1.amazonaws.com has address 52.95.123.56
Anyone have any clue as to why airflow would try to talk to AWS SSM from a local server? Could hackers have injected code to steal our secrets?
Btw, AWS is configured on this local server on the account where airflow is running, as it is being used for sending logs to s3.
But we don't use SSM for anything related to Airflow.
What you think should happen instead
"airflow db upgrade" shouldn't talk to AWS, and should not hang.
How to reproduce
Install 2.4.2, upgrade to 2.4.3, run airflow db upgrade
Operating System
RedHat 8.6
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==6.1.0
apache-airflow-providers-common-sql==1.3.0
apache-airflow-providers-ftp==3.2.0
apache-airflow-providers-hashicorp==3.2.0
apache-airflow-providers-http==4.1.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-odbc==3.1.2
apache-airflow-providers-oracle==3.5.0
apache-airflow-providers-slack==7.0.0
apache-airflow-providers-snowflake==4.0.0
apache-airflow-providers-sqlite==3.2.1
apache-airflow-providers-ssh==3.3.0
Deployment
Virtualenv installation
Deployment details
Python 3.9.12, using usual constraint file for airflow
Anything else
No response
Are you willing to submit PR?
Code of Conduct
Beta Was this translation helpful? Give feedback.
All reactions