fix(google): extract Cloud Logging labels from AF3 log path when task_instance is missing in supervisor context#68246

Open
goingforstudying-ctrl wants to merge 1 commit into apache:main from goingforstudying-ctrl:fix/stackdriver-labels-from-path

@goingforstudying-ctrl

Ran into this while debugging a Cloud Logging setup on GKE: log entries were landing in Stackdriver with empty labels, making it impossible to filter by dag_id or task_id.

Turns out the proc() closure in StackdriverRemoteLogIO.processors reads record.task_instance to populate labels, but in AF3's supervisor model the REMOTE_TASK_LOG handler runs in the supervisor process, where that attribute is never set. As a result, every log entry shipped from the supervisor gets empty labels.

This PR parses dag_id, task_id, and try_number from the log path instead. AF3's log path template is dag_id=<x>/run_id=<x>/task_id=<x>/attempt=<N>.log, so all four fields (with attempt supplying try_number) are already in the path and no DB access is needed.
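For illustration, the path parsing could look roughly like the sketch below. This is not the code from the diff: the function name `labels_from_log_path`, the regex, and the optional `map_index=<N>/` segment (which may appear for mapped tasks) are assumptions made for the example. Values stay strings because Cloud Logging labels are string key/value pairs.

```python
import re
from typing import Optional

# Hypothetical pattern for the path template quoted above.
# The optional map_index segment is an assumption about mapped-task paths.
_LOG_PATH_RE = re.compile(
    r"dag_id=(?P<dag_id>[^/]+)"
    r"/run_id=(?P<run_id>[^/]+)"
    r"/task_id=(?P<task_id>[^/]+)"
    r"/(?:map_index=(?P<map_index>-?\d+)/)?"
    r"attempt=(?P<try_number>\d+)\.log$"
)


def labels_from_log_path(path: str) -> Optional[dict[str, str]]:
    """Return dag_id/task_id/try_number labels parsed from a log path.

    Returns None when the path does not follow the expected template,
    so the caller can decide how to handle unrecognized paths.
    """
    match = _LOG_PATH_RE.search(path)
    if match is None:
        return None
    # run_id is parsed but left out, mirroring the existing label set.
    return {
        "dag_id": match["dag_id"],
        "task_id": match["task_id"],
        "try_number": match["try_number"],
    }
```

Returning None for a non-matching path keeps the helper free of side effects; the caller can then ship the entry with whatever labels it already has.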

The fallback only kicks in when task_instance is genuinely missing, so the task-subprocess code path (where task_instance is available) is untouched.
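A minimal sketch of that fallback order, assuming a log record object with an optional `task_instance` attribute. The helper name `resolve_labels` and the label keys are hypothetical, and the real proc() closure may be shaped differently.

```python
from types import SimpleNamespace
from typing import Any


def resolve_labels(record: Any, path_labels: dict[str, str]) -> dict[str, str]:
    """Prefer labels from record.task_instance; otherwise use path-derived ones."""
    ti = getattr(record, "task_instance", None)
    if ti is None:
        # Supervisor context: task_instance was never set on the record.
        return dict(path_labels)
    # Task-subprocess context: behavior is unchanged, labels come from the TI.
    return {
        "dag_id": str(ti.dag_id),
        "task_id": str(ti.task_id),
        "try_number": str(ti.try_number),
    }


# Stand-in objects for demonstration only; these are not Airflow classes.
supervisor_record = SimpleNamespace()
task_record = SimpleNamespace(
    task_instance=SimpleNamespace(dag_id="etl", task_id="load", try_number=2)
)
from_path = {"dag_id": "etl", "task_id": "load", "try_number": "1"}

print(resolve_labels(supervisor_record, from_path))  # uses the path labels
print(resolve_labels(task_record, from_path))        # uses task_instance
```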

Not sure if run_id should also be turned into a label here — left it out for now since the existing label set doesn't include it and the read-side filtering (bug 2) will need its own fix anyway. Happy to add it if maintainers think it belongs.

relates to #68240

@boring-cyborg boring-cyborg Bot added area:logging area:providers provider:google Google (including GCP) related issues labels Jun 8, 2026
@boring-cyborg

boring-cyborg Bot commented Jun 8, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide.
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example Dag that shows how users should use it.
  • Consider using Breeze environment for testing locally. It's a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

In Airflow 3 the supervisor process runs REMOTE_TASK_LOG handlers, but
record.task_instance is never set in supervisor context (it is a
task-subprocess concept).  When task_instance is missing the proc()
closure shipped log entries with empty labels, making Cloud Logging
entries unsearchable by dag_id / task_id.

Parse dag_id, task_id, and try_number from the structured AF3 log path
(dag_id=<x>/run_id=<x>/task_id=<x>/attempt=<N>.log) instead.  This
requires zero DB access and works regardless of whether the handler
runs in a task subprocess or the supervisor.

relates to apache#68240
@goingforstudying-ctrl goingforstudying-ctrl force-pushed the fix/stackdriver-labels-from-path branch from 83bc8a2 to 1764fd4 Compare June 10, 2026 06:41