Skip to content

[Rust] Add stream builder API#155

Open
teodordelibasic-db wants to merge 2 commits intomainfrom
stream-builder-api
Open

[Rust] Add stream builder API#155
teodordelibasic-db wants to merge 2 commits intomainfrom
stream-builder-api

Conversation

@teodordelibasic-db
Copy link
Copy Markdown
Contributor

@teodordelibasic-db teodordelibasic-db commented Mar 23, 2026

What changes are proposed in this pull request?

This PR adds a typestate StreamBuilder API to the Rust SDK that replaces the existing create_stream* family of methods with a fluent, compile-time-safe builder.

The builder enforces a strict configuration order: auth -> format -> config -> build. Invalid combinations (building without auth, building without format, calling gRPC-only setters on an Arrow stream) are compile errors, not runtime errors.

let stream = sdk
    .stream_builder("catalog.schema.table")
    .oauth("client-id", "client-secret")
    .json()
    .max_inflight_requests(500_000)
    .build()
    .await?;

Motivation:

  • The current create_stream(TableProperties, client_id, client_secret, Option<StreamConfigurationOptions>) API is a wide parameter list where format is just a field inside StreamConfigurationOptions. Adding or removing fields is a breaking change.
  • The builder gives us a stable public surface where individual setters can be added/removed/deprecated independently.
  • This is the first step toward providing similar builder APIs in the wrapper SDKs (Python, Java, Go, TypeScript), each idiomatic for their language.

Changes:

New files:

  • rust/sdk/src/builder/stream_builder.rs — the StreamBuilder<F, A> typestate builder with format/auth marker generics, all setter methods, three format-specific build() impls, and 12 unit tests
  • rust/sdk/src/builder/stream_format.rs — marker types (NoFormat, Json, CompiledProto, Arrow, NoAuth, HasAuth), sealed traits (StreamFormat, GrpcFormat), and trait impls

Modified:

  • rust/sdk/src/lib.rs — added stream_builder() entry point on ZerobusSdk, deprecated all four create_stream* / create_arrow_stream* methods with since = "1.1.0", changed workspace_id / tls_config / get_or_create_channel_zerobus_client / ZerobusStream::new_stream to pub(crate) so the builder can access SDK internals, refactored recreate_stream and recreate_arrow_stream to call internal constructors directly instead of routing through deprecated methods, fixed crate-level quick-start doc example
  • rust/sdk/src/builder/mod.rs — added module declarations and re-exports for stream builder and format types
  • rust/sdk/src/headers_provider.rs — added NoOpHeadersProvider for local dev / testing / sidecar proxy scenarios (used by stream_builder(...).no_auth())
  • rust/examples/json/single/src/main.rs — migrated to builder API
  • rust/examples/json/batch/src/main.rs — migrated to builder API
  • rust/examples/proto/single/src/main.rs — migrated to builder API
  • rust/examples/proto/batch/src/main.rs — migrated to builder API
  • rust/CHANGELOG.md — added v1.1.0 release notes with new features, deprecations, and API changes
  • rust/README.md — rewrote stream creation section, custom auth example, callbacks example, recovery example, API reference, and architecture diagram to use the new builder API

Design decisions:

  • Auth before format ordering: keeps the typestate linear (one path, not combinatorial). Format-specific setters only appear after format is chosen.
  • build() forces record_type to match the typestate regardless of what .options() set, as a safety belt.
  • .options(StreamConfigurationOptions) is kept as a migration bridge for users transitioning from the old API. It replaces all prior setter values and is documented as such.
  • Sealed traits on StreamFormat / GrpcFormat prevent external implementations, preserving forward compatibility.
  • #[must_use] on StreamBuilder warns if a builder is created but never built.
  • Deprecated methods are kept functional for backward compatibility and for wrapper SDKs that still depend on them.

How is this tested?

12 unit tests in stream_builder.rs:

  • json_oauth_builder_compiles — verifies JSON + OAuth builder chain compiles
  • compiled_proto_headers_provider_compiles — verifies Proto + custom headers provider chain compiles
  • config_setters_chain — verifies all shared and gRPC-specific setters chain correctly
  • options_replaces_config — verifies .options() replaces prior setter values
  • setter_after_options_mutates_replacement — verifies setters after .options() modify the replacement config
  • default_config_without_setters — verifies sensible defaults when no setters are called
  • options_cannot_override_record_type — verifies .options() with wrong record_type is overridden at build time
  • no_auth_shortcut_compiles — verifies no_auth() convenience compiles
  • debug_impl_works — verifies Debug output includes builder state
  • noop_headers_provider_returns_empty — verifies NoOpHeadersProvider returns empty headers
  • resolve_headers_provider_with_custom_provider — verifies custom provider resolution returns expected headers
  • resolve_headers_provider_with_oauth — verifies OAuth provider resolution constructs without panic

Additionally:

  • cargo test -p databricks-zerobus-ingest-sdk passes (81 tests, including all pre-existing tests)
  • All 4 example crates compile with cargo check
  • Typestate correctness is enforced by the compiler: invalid combinations (e.g., calling .json() before auth, calling .build() on NoFormat) fail to compile with missing-method errors

Follow-up items

These are scoped out of this PR and tracked for subsequent work.

  1. Introduce a shared internal stream-open abstraction (e.g., StreamOpenRequest or ResolvedStreamConfig) that all stream creation paths target: the builder, the deprecated methods, the recreate methods, and the FFI/JNI/PyO3/NAPI wrapper bridges. This is the critical architectural step before removing deprecated methods in 2.0, because without it wrapper SDKs will need a second migration.

  2. Add integration tests for builder-based stream creation in rust/tests/src against the mock gRPC server. The current integration suite exercises stream lifecycle through the old constructors but does not yet cover the stream_builder() path.

  3. Add compile-fail tests (via trybuild) to verify that invalid typestate combinations produce compiler errors. Currently correctness is enforced by the type system but not explicitly tested.

  4. Typed stream returns: build() currently returns ZerobusStream for both JSON and Proto, so sending the wrong payload format is still a runtime error. A future version could return ZerobusJsonStream / ZerobusProtoStream (similar to what Java already has) to make format mismatches a compile error. Candidate for 2.0.

  5. Extract shared config fields (recovery, recovery_timeout_ms, recovery_backoff_ms, recovery_retries, server_lack_of_ack_timeout_ms, flush_timeout_ms) into a SharedStreamConfig struct that both StreamConfigurationOptions and ArrowStreamConfigurationOptions compose. Currently each shared setter dual-writes to both config structs.

  6. Add idiomatic stream builder APIs to wrapper SDKs (Python, Java, Go, TypeScript), each targeting the shared internal abstraction from item 1. These should not attempt to mirror the Rust typestate — each language should use its own idiomatic pattern (kwargs in Python, traditional builder in Java, functional options in Go, options object in TypeScript).

  7. Remove .options() escape hatch and deprecated create_stream* methods in 2.0, after wrapper SDKs are migrated.

Signed-off-by: teodordelibasic-db <teodor.delibasic@databricks.com>
Signed-off-by: teodordelibasic-db <teodor.delibasic@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant