Commit 34d4233

ci: add kernel-e2e workflow + KERNEL_REV pin for use_kernel=True coverage (#808)
* ci: add kernel-e2e workflow + KERNEL_REV pin

Wires up CI coverage for use_kernel=True. The kernel is a private repo with no published wheel, so we pin a kernel SHA in KERNEL_REV and build the wheel inline via maturin develop using the existing INTEGRATION_TEST_APP GitHub App (extended to include databricks/databricks-sql-kernel in its repo allowlist).

Gate semantics mirror trigger-integration-tests.yml:
- Plain PR events post a synthetic-success Kernel E2E check so the required check doesn't block PRs that don't touch kernel code.
- The kernel-e2e label triggers a preview run on the PR and is auto-removed on synchronize for the same security reason as the integration-test label.
- merge_group is the real gate: it runs when kernel-relevant files change (src/databricks/sql/backend/kernel/, test_kernel_backend.py, KERNEL_REV, etc.) and auto-passes otherwise.

Unit tests are unchanged: tests/unit/test_kernel_*.py already run in every code-quality-checks.yml matrix combo against a fake databricks_sql_kernel module injected into sys.modules at import time.

Required follow-up before this merges:
1. Extend the INTEGRATION_TEST_APP allowlist to include databricks/databricks-sql-kernel.
2. Create the kernel-e2e label in this repo.
3. Add Kernel E2E as a required check on main once a green run lands.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* ci(kernel-e2e): add id-token: write for JFrog OIDC exchange

setup-poetry runs setup-jfrog, which exchanges a GitHub OIDC token for a JFrog access token to reach the internal PyPI mirror. That needs id-token: write on the job, which was missing; the labelled preview run failed at setup-poetry with "ACTIONS_ID_TOKEN_REQUEST_TOKEN: unbound variable".

Declared at both workflow scope and on run-kernel-e2e directly: a job-level permissions block fully overrides workflow scope, so the redundancy is intentional.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* ci(kernel-e2e): build kernel wheel into connector venv, not a new one

`poetry run maturin develop` from inside databricks-sql-kernel/pyo3/ makes poetry create a fresh, empty .venv next to the kernel source (it discovers pyo3/pyproject.toml first and treats it as the project root). That venv has no maturin → "Command not found: maturin".

Resolve the connector venv's python path explicitly before changing working directory, then call maturin from that python via `-m maturin`. `--interpreter <path>` pins the produced wheel to the connector venv so the resulting extension is installed where pytest will look for it.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* ci(kernel-e2e): drop --interpreter from maturin develop (not a valid flag)

maturin develop installs into whichever python invoked it; the flag exists on `maturin build` only. The previous commit's extra `--interpreter $CONNECTOR_VENV_PY` was redundant: we're already calling maturin via `$CONNECTOR_VENV_PY -m maturin`, so the venv python is the one doing the build and install.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* ci(kernel-e2e): route cargo through JFrog + audit cleanups

databricks-protected-runner-group blocks direct egress to index.crates.io, so the maturin build was failing with SSL EOF on the cargo metadata step. Extend setup-jfrog with an opt-in `configure-cargo` input that writes ~/.cargo/config.toml + credentials.toml against the JFrog db-cargo-remote proxy (recipe borrowed verbatim from databricks-odbc's setup-jfrog action) and forward it through setup-poetry so the kernel-e2e workflow can enable it without bypassing the wrapper.

Bundled cleanups from a workflow audit:
- Drop the redundant `Set up Python 3.10` step; setup-poetry runs actions/setup-python internally at the matching version.
- Smoke-check now uses `$CONNECTOR_VENV_PY` (same interpreter we built the wheel with), so a wheel installed into the wrong venv would surface here rather than be masked by `poetry run python` re-resolving.
- Post `Kernel E2E` check on the labelled-PR path as well as the merge-queue path; previously the PR would still show the synthetic-success check forever even after a real labelled run failed.
- Add a comment to fetch-depth: 0 explaining why we keep it.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* ci(kernel-e2e): bump KERNEL_REV to current kernel main

The original pin (aed2efb) predates kernel PR #36 which added `complex_types_as_json` to Session.__new__. Connector main already passes that kwarg (added in PR #795), so every e2e test was failing with:

TypeError: Session.__new__() got an unexpected keyword argument 'complex_types_as_json'

Bump to current kernel main (3aa25b21) which has the kwarg plus the rest of the comparator-parity changes the connector code already expects.

This is a good demonstration of why the bisectable KERNEL_REV pin matters: the connector and kernel evolved in lockstep on `main` before this CI existed, so the very first thing the workflow does once it can actually build the wheel is catch that we'd been shipping a stale pin.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* ci(kernel-e2e): disable bundled rust-cache in setup-rust-toolchain

actions-rust-lang/setup-rust-toolchain invokes Swatinem/rust-cache internally, which runs `cargo metadata` from the workflow's working directory. Our job's CWD is the connector repo root (no Cargo.toml there; the kernel checkout is in a subdir), so the bundled cache attempt fails with exit 101 and dumps a Node stack trace into the log.

It's cosmetic (the action handles its own errors) but reads as a failure at first glance, and the bundled cache races with the explicit rust-cache step we already configure with the correct `workspaces: databricks-sql-kernel` path. Disabling the bundled cache leaves a single, correctly-keyed rust-cache invocation and cleans up the log.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

---------

Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
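The message notes that the unit tests run against a fake databricks_sql_kernel module placed in sys.modules, which is why they need no compiled wheel. A minimal sketch of that general technique follows. The helper name `install_fake_kernel` and the shape of the stand-in `Session` class are assumptions made for illustration (only the `complex_types_as_json` kwarg is taken from the message); the repo's actual test fixtures may look different.

```python
import sys
import types


def install_fake_kernel(module_name: str = "databricks_sql_kernel") -> types.ModuleType:
    """Register a stand-in module so that importing `module_name` succeeds
    without the compiled extension being installed."""
    fake = types.ModuleType(module_name)

    class Session:
        # Hypothetical stand-in: it only mirrors the kwarg named in the
        # commit message, not the real kernel Session API.
        def __init__(self, complex_types_as_json: bool = False):
            self.complex_types_as_json = complex_types_as_json

    fake.Session = Session
    # The import system consults sys.modules before searching the filesystem,
    # so any later `import databricks_sql_kernel` returns this object.
    sys.modules[module_name] = fake
    return fake


fake_kernel = install_fake_kernel()

import databricks_sql_kernel  # resolved from sys.modules, not from disk

session = databricks_sql_kernel.Session(complex_types_as_json=True)
```

A fake like this only checks the connector's side of the call. It cannot notice that a real kernel build lacks a kwarg, which is the gap the pinned e2e run is meant to close.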
1 parent 085cb56 commit 34d4233

4 files changed

Lines changed: 451 additions & 1 deletion

.github/actions/setup-jfrog/action.yml

Lines changed: 42 additions & 1 deletion
@@ -1,5 +1,15 @@
 name: Setup JFrog OIDC
-description: Obtain a JFrog access token via GitHub OIDC and configure pip to use JFrog PyPI proxy
+description: Obtain a JFrog access token via GitHub OIDC and configure pip / cargo to use JFrog package proxies
+
+inputs:
+  configure-cargo:
+    description: |
+      Write ~/.cargo/config.toml + credentials.toml pointing at the
+      Databricks JFrog Cargo proxy. Required for any job that runs
+      `cargo` on `databricks-protected-runner-group`, where direct
+      access to index.crates.io is blocked. Off by default because
+      most jobs in this repo are Python-only.
+    default: "false"
 
 runs:
   using: composite
@@ -30,3 +40,34 @@ runs:
         set -euo pipefail
         echo "PIP_INDEX_URL=https://gha-service-account:${JFROG_ACCESS_TOKEN}@databricks.jfrog.io/artifactory/api/pypi/db-pypi/simple" >> "$GITHUB_ENV"
         echo "pip configured to use JFrog registry"
+
+    - name: Configure Cargo
+      if: inputs.configure-cargo == 'true'
+      shell: bash
+      # databricks-protected-runner-group blocks direct egress to
+      # index.crates.io, so cargo must route through JFrog's
+      # db-cargo-remote proxy. Mirrors the recipe used in
+      # databricks-odbc's setup-jfrog action.
+      #
+      # Note: JFrog's Cargo proxy quarantines crates released within
+      # the last 7 days. If a fresh dependency version isn't yet
+      # mirrored, the build will fail until JFrog ingests it; bump
+      # Cargo.lock to an older version or wait it out.
+      run: |
+        set -euo pipefail
+        mkdir -p ~/.cargo
+        cat > ~/.cargo/config.toml << 'EOF'
+        [source.crates-io]
+        replace-with = "jfrog"
+        [source.jfrog]
+        registry = "sparse+https://databricks.jfrog.io/artifactory/api/cargo/db-cargo-remote/index/"
+        [registries.jfrog]
+        index = "sparse+https://databricks.jfrog.io/artifactory/api/cargo/db-cargo-remote/index/"
+        credential-provider = ["cargo:token"]
+        EOF
+        cat > ~/.cargo/credentials.toml << EOF
+        [registries.jfrog]
+        token = "Bearer ${JFROG_ACCESS_TOKEN}"
+        EOF
+        echo "CARGO_REGISTRIES_JFROG_TOKEN=Bearer ${JFROG_ACCESS_TOKEN}" >> "$GITHUB_ENV"
+        echo "Cargo configured to use JFrog registry"
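To inspect what the Configure Cargo step produces without touching a real ~/.cargo, the two files can be rendered into a scratch directory. The sketch below is a Python restatement of the shell step, not part of the action. The function name `write_cargo_jfrog_config` and the dummy token are assumptions for illustration; the TOML keys and the index URL are copied from the diff. It does not export `CARGO_REGISTRIES_JFROG_TOKEN`, which the real step also writes to `$GITHUB_ENV`.

```python
import tempfile
from pathlib import Path

# Index URL copied from the diff above.
JFROG_INDEX = (
    "sparse+https://databricks.jfrog.io/artifactory/api/cargo/db-cargo-remote/index/"
)


def write_cargo_jfrog_config(cargo_home: Path, token: str) -> None:
    """Write config.toml and credentials.toml with the same contents the
    Configure Cargo step emits, into an arbitrary directory."""
    cargo_home.mkdir(parents=True, exist_ok=True)

    # Source replacement: every crates-io lookup is redirected to "jfrog".
    config_lines = [
        "[source.crates-io]",
        'replace-with = "jfrog"',
        "[source.jfrog]",
        f'registry = "{JFROG_INDEX}"',
        "[registries.jfrog]",
        f'index = "{JFROG_INDEX}"',
        'credential-provider = ["cargo:token"]',
        "",
    ]
    (cargo_home / "config.toml").write_text("\n".join(config_lines))

    # Only this file embeds the secret, matching the unquoted heredoc
    # in the shell step that expands ${JFROG_ACCESS_TOKEN}.
    credentials = f'[registries.jfrog]\ntoken = "Bearer {token}"\n'
    (cargo_home / "credentials.toml").write_text(credentials)


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / ".cargo"
    write_cargo_jfrog_config(home, token="dummy-token")  # placeholder, not a real token
    config_text = (home / "config.toml").read_text()
    credentials_text = (home / "credentials.toml").read_text()
```

Keeping the token out of config.toml mirrors the shell step, where the first heredoc is quoted (`'EOF'`, no expansion) and only the credentials heredoc expands the variable.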

.github/actions/setup-poetry/action.yml

Lines changed: 9 additions & 0 deletions
@@ -17,12 +17,21 @@ inputs:
     description: Extra suffix for the cache key to avoid collisions across job variants
     required: false
     default: ""
+  configure-cargo:
+    description: |
+      Forwarded to setup-jfrog. Set to "true" for jobs that also need
+      Cargo configured against the JFrog crates proxy (e.g. anything
+      that builds a Rust extension via maturin).
+    required: false
+    default: "false"
 
 runs:
   using: composite
   steps:
     - name: Setup JFrog
       uses: ./.github/actions/setup-jfrog
+      with:
+        configure-cargo: ${{ inputs.configure-cargo }}
 
     - name: Set up python ${{ inputs.python-version }}
       id: setup-python
