[GSOC 2026] Airflow Contribution & Verification Agent Skills #62500

@jason810496


Background

Apache Airflow’s Breeze environment is the de facto way to reproduce CI, run tests, and verify changes locally. It encapsulates complex tooling (Docker, integrations, static checks, tests, system verification) behind a single, consistent developer interface.

However, modern AI coding tools (e.g. Claude Code, Gemini CLI, GitHub Copilot–style agents) currently treat Airflow’s repo like any generic Python project. They rarely:

  • Understand whether they are running inside or outside Breeze.
  • Choose the correct commands for host vs. container.
  • Follow the same workflows that Airflow contributors actually use (e.g. prek, breeze shell, breeze start-airflow).

We already expose some information through docs (e.g. AGENTS.md), but this mostly inflates the context window rather than giving agents a structured, machine-usable interface to Breeze.

This project aims to bridge that gap by creating an “Airflow Breeze Contribution / Contribution Verification” AI skill (final name TBD) that systematically encodes common contribution workflows and makes them reliably executable and testable by AI agents.

Goal

The overarching goal is to make AI tools Breeze-aware: able to detect whether they are running inside or outside Breeze and act accordingly.

In practice, this means that for a typical contributor PR, an AI agent can:

  • Run the right static checks.
  • Run the right subset of tests in Breeze.
  • Spin up Airflow and verify system behavior for a Dag representing the change (nice-to-have).
  • Do all of the above while respecting host/container boundaries.

Additionally, the solution should be consistency-focused: we want to keep the Breeze CLI as the single source of truth for agent skills. This can be achieved by auto-syncing CLI docstrings and behaviors into the AI skill using existing tooling (e.g. prek), ensuring that the skill definitions always reflect the current state of the Breeze CLI.

Core Tasks

1. Environment Awareness & Detection
  • Design and implement a simple, robust mechanism for the agent skills to detect:
    • “Host” vs “inside Breeze container”.
    • Relevant environment variables, markers, or file paths that indicate context.
  • Encode decision logic for when to run:
    • Host-only commands (e.g. breeze shell, breeze start-airflow, git operations).
    • Container-only commands (e.g. pytest, airflow ...).
  • Provide a clear API/contract that AI tools can call to query current context and get recommended commands.

Note: We may need to add explicit markers or files to the repo, or write a small helper script that determines context reliably; alternatively, existing environment variables or filesystem cues may suffice. This is an open design question to explore.
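To make the open design question above concrete, here is a minimal sketch of what such a helper could look like. The `BREEZE` environment variable, the marker-file path, and the task names are illustrative placeholders, not existing Breeze guarantees — the real skill would standardize on whatever marker Breeze actually provides:

```python
import os
from pathlib import Path


def inside_breeze() -> bool:
    """Best-effort check for running inside the Breeze container.

    Both markers below are hypothetical; the real implementation would
    use whichever env var or file the Breeze image actually guarantees.
    """
    # Hypothetical env var exported by the Breeze container entrypoint.
    if os.environ.get("BREEZE", "").lower() == "true":
        return True
    # Hypothetical marker file baked into the Breeze image.
    if Path("/opt/airflow/.breeze-container").exists():
        return True
    return False


def recommended_command(task: str) -> list[str]:
    """Map a high-level task to a command appropriate for the current context."""
    if task == "unit-tests":
        # pytest runs inside the container; from the host, enter Breeze first.
        return ["pytest"] if inside_breeze() else ["breeze", "shell"]
    if task == "static-checks":
        # prek runs on the host against the working tree.
        return ["prek"]
    raise ValueError(f"unknown task: {task}")
```

The point of the API/contract is the second function: an agent asks "what should I run for task X here?" and gets back a concrete command instead of guessing.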

2. Modeling Core Contributor Workflows as Skills

Based on the three scenarios described, define and implement skills that represent common contribution flows:

Scenario 1: Static checks pass

  • Stage changes (git add ...).
  • Run prek.
  • Collect and surface failures in a structured way so that an agent can fix them.
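The "collect and surface failures in a structured way" step could look like the sketch below. It assumes prek emits pre-commit-style summary lines (`hook name....Status`); the actual output format would need to be confirmed, and the regex adjusted accordingly:

```python
import re
import subprocess


def run_static_checks(paths: list[str]) -> str:
    """Stage the given paths and run prek, returning its combined output."""
    subprocess.run(["git", "add", *paths], check=True)
    result = subprocess.run(["prek"], capture_output=True, text=True)
    return result.stdout + result.stderr


def parse_check_results(output: str) -> dict[str, str]:
    """Parse pre-commit-style 'hook name....Status' lines into {hook: status}.

    The line format here is an assumption modeled on pre-commit's output;
    it is the piece an agent consumes to decide which hooks to fix.
    """
    results: dict[str, str] = {}
    for line in output.splitlines():
        match = re.match(r"^(.*?)\.{3,}.*?(Passed|Failed|Skipped)\s*$", line)
        if match:
            results[match.group(1).strip()] = match.group(2)
    return results
```

A structured `{hook: status}` mapping lets the agent target only the failing hooks rather than re-reading raw terminal output.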

Scenario 2: Unit tests in Breeze

  • Start or attach to a Breeze container with breeze shell or breeze exec.
  • Run pytest with a targeted module/test path (not the whole suite).
  • Then the agent can inspect results and decide on next steps (e.g. fix code, exit Breeze).
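The host/container split in this scenario can be captured in one small decision function. This sketch assumes `breeze shell` forwards a trailing command into the container, which is an approximation of the real CLI surface and would need verifying:

```python
def targeted_test_command(test_path: str, in_breeze: bool) -> list[str]:
    """Build a command to run one test module, not the whole suite.

    `in_breeze` would come from the environment-detection helper from
    Core Task 1; the `breeze shell <command>` form is an assumption.
    """
    if in_breeze:
        # Already inside the container: call pytest directly.
        return ["pytest", test_path, "-q"]
    # On the host: wrap the pytest invocation in a Breeze shell session.
    return ["breeze", "shell", f"pytest {test_path} -q"]
```

Keeping this as a pure function makes the "right subset of tests" logic trivially unit-testable, independent of Docker.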

3. Syncing with Breeze CLI as Source of Truth (via prek)
  • Investigate existing Breeze CLI docstrings and structure.
  • Define a mapping from Breeze commands (and their docstrings) to skill definitions, paths, and parameters.
  • Implement a prek hook that:
    • Generates or updates the agent skills definition files from Breeze CLI docstrings.
    • Fails when drift is detected (e.g. a command changed but the skill spec was not updated).
  • Integrate these checks into existing static check pipelines so the skills stay in sync automatically.
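A minimal sketch of the docstring-to-spec mapping and drift check, assuming skill definitions are serialized as JSON. In the real hook, the command table would be discovered from Breeze's click command tree rather than passed in by hand:

```python
import json
from typing import Callable


def build_skill_spec(commands: dict[str, Callable]) -> dict:
    """Derive a skill definition from command docstrings.

    `commands` maps a Breeze command name to its implementing function;
    discovery from the actual CLI tree is left out of this sketch.
    """
    return {
        name: {"description": (func.__doc__ or "").strip()}
        for name, func in commands.items()
    }


def detect_drift(current_spec: dict, spec_file_contents: str) -> bool:
    """Return True when the committed spec no longer matches the live CLI."""
    return json.loads(spec_file_contents) != current_spec


def demo_command():
    """Run static checks for staged files."""
```

The prek hook would regenerate the spec on every run and fail (non-zero exit) when `detect_drift` returns True, forcing the skill files to be updated alongside the CLI change.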
4. Evaluation & Test Harness
  • Design a testable user scenario or “exam” that simulates a typical contribution workflow (e.g. fixing a simple bug, adding a small feature) to verify that the added skills work as intended.
  • Add unit tests for any additional scripts or helper functions created.
5. Documentation & Developer Guide
  • Add or extend documentation (e.g. AGENTS.md, Breeze docs) to:
    • Describe the new Breeze-aware skills.
    • Show example workflows for human contributors and AI tools.
    • Document how other tools can integrate with the skills (e.g. path to spec file, key commands).

Advanced Tasks (Optional / Stretch Goals)

Scenario: System behavior verification

  • Write a Dag representing the feature/bugfix being contributed (or use an existing one).
  • Run breeze start-airflow (with --integration when needed).
  • Trigger the Dag via CLI (instead of UI) and wait for completion.
  • Inspect logs/status to determine success/failure from the TaskInstance logs.
  • Inspect logs/status from all the component services (scheduler, api-server, triggerer, etc) to determine if there are any underlying issues.
  • The agent can then decide to fix code, fix the Dag, or exit Breeze based on the results.
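The final "decide to fix code, fix the Dag, or exit" step can be expressed as a pure decision function over terminal TaskInstance states (as reported by e.g. `airflow tasks states-for-dag-run`). The action labels here are illustrative, not an established skill API:

```python
def next_action(ti_states: dict[str, str]) -> str:
    """Decide the agent's next step from TaskInstance terminal states.

    `ti_states` maps task_id -> state, using Airflow's state names;
    the returned action labels are placeholders for this sketch.
    """
    if any(state == "failed" for state in ti_states.values()):
        # At least one task failed: pull its logs and attempt a fix.
        return "inspect-logs-and-fix"
    if all(state == "success" for state in ti_states.values()):
        return "verification-passed"
    # upstream_failed, skipped, or still-running tasks need a closer look
    # (possibly at scheduler/triggerer logs rather than task logs).
    return "investigate-dag"
```

Keeping the decision separate from the subprocess plumbing makes it easy to cover in the evaluation harness from Core Task 4.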

Expected Outcome

By the end of the project, we expect:

  • A Breeze-aware AI skill that can:
    • Detect host vs. container context.
    • Choose appropriate commands and environment transitions.
  • AI tools that are "smart enough" to handle the core contribution workflows, including:
    • Static checks with prek.
    • Targeted unit tests in Breeze.
    • Iterating based on results (e.g. fixing code, fixing tests, exiting Breeze).
  • A sync mechanism (likely using prek) that:
    • Keeps Breeze CLI and the skill definitions in sync.
    • Fails CI when they diverge, ensuring Breeze remains the single source of truth.
  • Initial evaluation “exam(s)” and test harnesses that:
    • Verify that an implementation of the skill behaves correctly on at least the core scenarios.
  • Updated documentation explaining how contributors and AI tools can make use of the new capability.

A successful project will make it much easier for future AI tooling (IDEs, CLIs, bots) to interact with Breeze in a reliable and Airflow-native way, increasing contributor productivity and lowering the barrier to entry.

Recommended Skills

  • Programming & Tooling
    • Solid Python skills (CLI tools, packaging, basic testing).
    • Familiarity with Docker and containerized development environments.
    • Experience with writing or using CLIs and handling subprocesses.
  • Dev Workflow & CI
    • Understanding of typical open source contribution workflows (git, PRs, static checks, unit tests, pre-commit).
    • Exposure to CI systems and concepts of reproducible environments.
  • AI/Agents
    • Interest in or experience with AI coding assistants, Agent Skills, tool-calling, or agent frameworks.
    • Comfort reasoning about what “smart enough” means in terms of concrete, testable behaviors.
  • Airflow/Breeze (Nice to Have)
    • Basic knowledge of Apache Airflow concepts (Dags, tasks, operators).
    • Prior use of Breeze for development or testing is a plus, but not strictly required.

Motivation to work at the intersection of developer experience, tooling, and AI is more important than prior deep expertise in all of these areas.

Mentors

Learning Materials

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.


    Labels

    area:dev-env (CI, pre-commit, pylint and other changes that do not change the behavior of the final code)
