feat: add resource-scoped jobs plugin for Databricks Lakeflow Jobs #223

Closed
keugenek wants to merge 24 commits into databricks:main from keugenek:ekniazev/jobs-plugin-core

Conversation


@keugenek keugenek commented Mar 31, 2026

Summary

Resource-scoped jobs plugin following the files plugin pattern. Jobs are configured as named resources discovered from environment variables at startup.

Design

  • Resource-scoped: Only configured jobs are accessible — this is not an open SDK wrapper
  • Env-var discovery: Jobs are discovered from DATABRICKS_JOB_<KEY> env vars (e.g. DATABRICKS_JOB_ETL=123)
  • Single-job shorthand: DATABRICKS_JOB_ID maps to the "default" key
  • Manifest declares job resources with CAN_MANAGE_RUN permission
  • Works with databricks apps init --features jobs
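
The discovery step above can be sketched as follows (a minimal illustration; `discoverJobs` and its exact shape here are assumptions, not the plugin's actual implementation):

```typescript
// Sketch of env-var discovery, following the design above.
// Keys like DATABRICKS_JOB_ETL=123 become { etl: 123 };
// DATABRICKS_JOB_ID maps to the "default" key.
function discoverJobs(env: Record<string, string | undefined>): Record<string, number> {
  const jobs: Record<string, number> = {};
  for (const [name, value] of Object.entries(env)) {
    if (!value) continue;
    if (name === "DATABRICKS_JOB_ID") {
      // Single-job shorthand.
      jobs["default"] = Number(value);
    } else if (name.startsWith("DATABRICKS_JOB_")) {
      const key = name.slice("DATABRICKS_JOB_".length).toLowerCase();
      jobs[key] = Number(value);
    }
  }
  return jobs;
}
```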

API

// Trigger a configured job
const { run_id } = await appkit.jobs("etl").runNow();

// Trigger and wait for completion
const run = await appkit.jobs("etl").runNowAndWait();

// OBO access
await appkit.jobs("etl").asUser(req).runNow();

// List recent runs
const runs = await appkit.jobs("etl").listRuns({ limit: 10 });

// Single-job shorthand
await appkit.jobs("default").runNow();

Files changed

  • plugins/jobs/manifest.json — declares job resource with CAN_MANAGE_RUN permission
  • plugins/jobs/types.ts — JobAPI, JobHandle, JobsExport, IJobsConfig types
  • plugins/jobs/plugin.ts — JobsPlugin with discoverJobs(), getResourceRequirements(), resource-scoped createJobAPI()
  • plugins/jobs/index.ts — barrel exports
  • connectors/jobs/client.ts — listRuns now respects limit parameter
  • plugins/jobs/tests/plugin.test.ts — 32 tests covering discovery, resource requirements, exports, OBO, multi-job, and auto-fill

Documentation safety checklist

  • Examples use least-privilege permissions
  • Sensitive values are obfuscated
  • No insecure patterns introduced


@atilafassina atilafassina left a comment


Also found 2 potential type mismatches with protobuf:

  1. @bufbuild/protobuf toJson() defaults to camelCase. The useProtoFieldName: true option is needed for snake_case as documented.

  2. Partial<MessageShape<T>> allows setting properties to undefined explicitly, whereas PartialMessage<T> from protobuf properly handles nested partial messages and enum defaults.
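
For illustration, the field-name behavior the first point describes can be shown with a standalone converter (this mimics the default camelCase mapping; it is not the @bufbuild/protobuf API itself):

```typescript
// @bufbuild/protobuf's toJson() emits camelCase keys by default;
// passing { useProtoFieldName: true } keeps the proto's snake_case
// names. This standalone helper mimics the default conversion.
function protoFieldToJsonName(field: string): string {
  return field.replace(/_([a-z])/g, (_, c: string) => c.toUpperCase());
}
```

So a proto field declared as `run_id` serializes as `runId` unless `useProtoFieldName: true` is set.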

@keugenek keugenek force-pushed the ekniazev/jobs-plugin-core branch from 9715283 to 11f7d46 Compare April 2, 2026 10:38
pkosiec
pkosiec previously requested changes Apr 2, 2026

@pkosiec pkosiec left a comment


@keugenek Took a quick high-level look and I think we should take a step back and think about the user flow for this plugin. Right now it is a light wrapper over the JS SDK client that effectively re-exports the API. I don't see any significant benefit to using the plugin over the SDK 🤔

For all the current plugins, a user can run databricks apps init, select a plugin, and interactively configure all the required resources (or pass them with a flag); we then put the selection in databricks.yml and .env for seamless local development and zero-config deployment.

I'd assume it should work similarly with the jobs plugin, which means it would need a job-oriented API (multiple jobs can be supported too, but in the interactive selection we'd support one, similarly to the model serving API).

So I think we need to agree on the API first, then make the template changes, and test the existing apps init flow.

Let me know if that makes sense! Feel free to dismiss my review while I'm OOO, but I'd really love to rework the plugin to be more user-oriented. Thank you!

Jobs are configured as named resources (DATABRICKS_JOB_<KEY> env vars)
and discovered at startup, following the files plugin pattern.

API is scoped to configured jobs:
  appkit.jobs('etl').runNow()
  appkit.jobs('etl').runNowAndWait()
  appkit.jobs('etl').lastRun()
  appkit.jobs('etl').listRuns()
  appkit.jobs('etl').asUser(req).runNow()

Single-job shorthand via DATABRICKS_JOB_ID env var.
Supports OBO access via asUser(req).

Co-authored-by: Isaac
Signed-off-by: Evgenii Kniazev <evgenii.kniazev@databricks.com>
@keugenek keugenek force-pushed the ekniazev/jobs-plugin-core branch from 11f7d46 to b0e3be3 Compare April 2, 2026 16:04
@keugenek keugenek changed the title feat: add jobs connector and plugin for Databricks Jobs API feat: add resource-scoped jobs plugin for Databricks Lakeflow Jobs Apr 2, 2026
@pkosiec pkosiec dismissed their stale review April 2, 2026 16:35

Dismissing after the last commit


pkosiec commented Apr 2, 2026

Thanks very much @keugenek! I didn't check the PR details but the high-level changes are exactly what I was looking for. Dismissed my review before going offline to avoid blocking merge 👍

Thanks again for making those changes!

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…nfig

Phase 1 of jobs plugin rework: establishes config foundation with
TaskType union, three interceptor tiers (read/write/stream), and
pure parameter mapping for all 6 task types.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…reaming

Wave 2 of jobs plugin rework:
- Fix connector protobuf types (Waiter access pattern)
- Remove waitForRun from connector, add runAndWait async generator in plugin
- Wrap reads in execute() with JOBS_READ_DEFAULTS, writes with JOBS_WRITE_DEFAULTS
- Add Zod param validation and mapParams integration to runNow
- Enrich clientConfig() with JSON Schema per job
- Update JobAPI types and adapt existing tests

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Phase 5 of jobs plugin rework: adds injectRoutes() with 4 endpoints:
- POST /:jobKey/run (with ?stream=true SSE support)
- GET /:jobKey/runs (paginated)
- GET /:jobKey/runs/:runId (detail)
- GET /:jobKey/status (latest run)
All routes use OBO via asUser(req) and validate job keys.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
- Fix ERR_HTTP_HEADERS_SENT crash when streaming errors occur after
  headers are flushed (check res.headersSent before setting status)
- Fix spark_jar/sql param mapping to use correct SDK fields
  (jar_params, sql_params instead of parameters)
- Forward abort signal from execute() interceptors to all connector
  calls for proper timeout/cancellation support
- Use Zod result.data after validation to preserve schema
  transforms and defaults
- Validate params even when omitted if a schema is configured
- Fix lastRun return type from Run to BaseRun (matches listRuns)
- Fix getJobId error message for default job key

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
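
The headersSent guard described in the first bullet can be sketched in isolation (`sendStreamError` is an illustrative name, not the plugin's actual helper):

```typescript
import type { ServerResponse } from "node:http";

// Sketch of the ERR_HTTP_HEADERS_SENT guard: once headers are flushed
// (e.g. mid-SSE-stream), status and headers can no longer be set, so
// the error must be reported in-band as an SSE event instead.
function sendStreamError(res: ServerResponse, err: Error): void {
  if (res.headersSent) {
    // Headers already flushed: emit an SSE error event and close.
    res.write(`event: error\ndata: ${JSON.stringify({ message: err.message })}\n\n`);
    res.end();
  } else {
    // Headers not yet sent: a normal JSON error response is still possible.
    res.statusCode = 500;
    res.setHeader("content-type", "application/json");
    res.end(JSON.stringify({ error: err.message }));
  }
}
```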
The asUser() proxy wraps createJobAPI() in user context, but the
returned JobAPI closures called this.client after the ALS scope
exited, falling back to service principal. Fix by capturing the
client and userId eagerly at creation time and passing them
explicitly to execute() and connector calls.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
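
The ALS pitfall this commit fixes can be reproduced in isolation with Node's AsyncLocalStorage (the names here are illustrative, not the plugin's actual code):

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

const als = new AsyncLocalStorage<{ client: string }>();
const serviceClient = { client: "service-principal" };

// Buggy shape: reads the ALS store lazily, when the closure is called,
// which may be after the scope has already exited.
function makeLazyApi() {
  return { whoAmI: () => (als.getStore() ?? serviceClient).client };
}

// Fixed shape: captures the client eagerly at creation time, inside
// the user-context scope, and closes over the captured value.
function makeEagerApi() {
  const { client } = als.getStore() ?? serviceClient;
  return { whoAmI: () => client };
}

const lazy = als.run({ client: "user" }, makeLazyApi);
const eager = als.run({ client: "user" }, makeEagerApi);
// After als.run() returns, the lazy closure falls back to the service
// principal, while the eager one keeps the user client.
```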
Migrate streaming to executeStream(), route polling through execute(),
use ValidationError, standardize eager captures, tighten types, expose
taskType in clientConfig, remove .job() alias, update manifest docs.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…ms before streaming

- Cap listRuns to [1, 100] in both connector and HTTP route to prevent
  unbounded memory materialization
- Reject negative runId (<=0) in GET/DELETE run handlers
- Validate Zod params before entering SSE stream branch so bad params
  return 400 JSON instead of a generic SSE error event

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
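
The bounds described above amount to the following (a minimal sketch; function names are illustrative):

```typescript
// Cap listRuns limits to [1, 100], truncating fractional input,
// to prevent unbounded memory materialization.
function clampLimit(limit: number): number {
  return Math.min(100, Math.max(1, Math.trunc(limit)));
}

// Run IDs must be positive integers; reject zero, negatives,
// and non-integers before hitting the API.
function isValidRunId(runId: number): boolean {
  return Number.isInteger(runId) && runId > 0;
}
```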
@atilafassina atilafassina force-pushed the ekniazev/jobs-plugin-core branch from bf4061a to ad2bfae Compare April 10, 2026 14:15
@atilafassina atilafassina changed the base branch from feat/proto-plugin to main April 10, 2026 14:29
keugenek and others added 7 commits April 10, 2026 16:33
@atilafassina atilafassina force-pushed the ekniazev/jobs-plugin-core branch from ad2bfae to fbe0dac Compare April 10, 2026 14:34
…s-plugin-core

# Conflicts:
#	packages/appkit/src/index.ts
execute() returns ExecutionResult<T> (a discriminated union), but the
jobs plugin was treating it as plain T. This caused 9 TypeScript errors
and 6 test failures. Each call site now checks result.ok before
accessing result.data, matching the pattern established by the serving
plugin.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Merge remote branch and fix execute() call sites: execute() returns
ExecutionResult<T> (discriminated union), but the jobs plugin treated
results as plain T. Each call site now checks result.ok before accessing
result.data, matching the pattern from the serving plugin.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
@atilafassina

Reopening from upstream branch to fix CI secrets access (fork PRs can't access JFrog token)
