docs: add docs for kafka sink auto evolve option #5824
guilleov wants to merge 2 commits into ClickHouse:main
Conversation
@guilleov is attempting to deploy a commit to the ClickHouse Team on Vercel. A member of the Team first needs to authorize it.
| `bufferCount` (since v1.3.6) | Number of records to buffer in memory before flushing to ClickHouse. `0` disables internal buffering. Buffering is not supported with `exactlyOnce=true`. | `"0"` |
| `bufferFlushTime` (since v1.3.6) | Maximum time in milliseconds to buffer records before flushing when `exactlyOnce=false`. `0` (the default) disables time-based flushing. Only effective when `bufferCount > 0`. | `"0"` |
| `reportInsertedOffsets` (since v1.3.6) | Returns only successfully inserted offsets from `preCommit` (instead of `currentOffsets`) when `exactlyOnce=false`. Does not apply when `ignorePartitionsWhenBatching=true`, where `currentOffsets` are still returned. | `"false"` |
| `auto.evolve` (since v1.3.7) | Automatically adds columns to the ClickHouse table when incoming records contain new fields not present in the table. See [Schema Evolution](#schema-evolution). | `"false"` |
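A minimal sink configuration enabling this option might look like the following. This is an illustrative sketch: the topic name and connection values are placeholders, and the exact set of connection properties should be checked against the connector's configuration reference.

```json
{
  "name": "clickhouse-sink",
  "config": {
    "connector.class": "com.clickhouse.kafka.connect.ClickHouseSinkConnector",
    "topics": "events",
    "hostname": "clickhouse.example.com",
    "port": "8443",
    "database": "default",
    "auto.evolve": "true"
  }
}
```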
Please mention "schema" to make it clearer what is evolving. Maybe even use something like `schema.auto.column_creation`, because in the future we will have column alteration as well and will need to configure them separately.
1. For each batch of records, the connector compares the record schema against the table's column list.
2. If new fields are detected, it maps the Kafka Connect types to ClickHouse types and issues DDL.
3. If multiple schema versions appear in a single batch, the batch is split at schema boundaries: each sub-batch is flushed and the table is evolved before continuing.
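The splitting in step 3 can be sketched roughly as follows. This is a simplified model, not the connector's actual Java code; records are represented here as `(schema_version, value)` pairs.

```python
def split_at_schema_boundaries(records):
    """Split a batch into sub-batches so that each sub-batch
    contains records with a single schema version.

    In the real connector, each sub-batch would be flushed and the
    table evolved before the next sub-batch is processed.
    """
    batches = []
    current = []
    for schema_version, value in records:
        if current and current[-1][0] != schema_version:
            batches.append(current)  # schema boundary: close the sub-batch
            current = []
        current.append((schema_version, value))
    if current:
        batches.append(current)
    return batches
```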
This is risky: what if this split produces small batches and we get too many parts as a result?
What blocks us from adding more than one column to the table at once?
|---|---|---|
| `org.apache.kafka.connect.data.Decimal` | `Decimal(38, S)` | Scale from schema parameters |
| `org.apache.kafka.connect.data.Date` | `Date32` | |
| `org.apache.kafka.connect.data.Time` | `Int64` | |
This actually depends on the version: the latest versions of ClickHouse support the `Time` type.
When creating new columns, the connector maps Connect types to ClickHouse types as follows:
| Kafka Connect Type | ClickHouse Type | Notes |
|---|---|---|
Adding non-nullable columns will break backward compatibility, and only records with non-null fields will be inserted. Please make it clear whether `Nullable(...)` is really used.
| `STRING` / `BYTES` | `String` | |
| `ARRAY` | `Array(<element_type>)` | Recursive |
| `MAP` | `Map(<key_type>, <value_type>)` | Recursive |
| `STRUCT` | Not supported | Throws an error |
This is supported by our connector and should not throw an error:
https://github.com/ClickHouse/clickhouse-kafka-connect/blob/main/src/main/java/com/clickhouse/kafka/connect/sink/db/ClickHouseWriter.java#L574
Besides, STRUCT is used for unions, as in the case of union(String, bytes): https://github.com/ClickHouse/clickhouse-kafka-connect/blob/main/src/main/java/com/clickhouse/kafka/connect/sink/db/ClickHouseWriter.java#L251
Optional (nullable) fields are wrapped in `Nullable(...)`, except for `ARRAY` and `MAP` types, which [cannot be Nullable in ClickHouse](/sql-reference/data-types/nullable). Elements and values inside composite types can still be Nullable.
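The Nullable-wrapping rule can be sketched as a small recursive mapping. This is a simplified model for illustration only (a subset of Connect types, not the connector's real code):

```python
# Simplified Connect-to-ClickHouse type mapping illustrating the rule:
# optional scalars become Nullable(...), but Array/Map themselves never do;
# only their elements/values can be Nullable.
SCALAR_TYPES = {
    "STRING": "String",
    "BYTES": "String",
    "INT64": "Int64",
    "FLOAT64": "Float64",
    "BOOLEAN": "Bool",
}

def to_clickhouse_type(connect_type, optional=False,
                       element=None, key=None, value=None):
    if connect_type == "ARRAY":
        # The Array itself cannot be Nullable, but its elements can be.
        return f"Array({to_clickhouse_type(**element)})"
    if connect_type == "MAP":
        # Same for Map: only the value type may be wrapped.
        return (f"Map({to_clickhouse_type(**key)}, "
                f"{to_clickhouse_type(**value)})")
    ch = SCALAR_TYPES[connect_type]
    return f"Nullable({ch})" if optional else ch
```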
An optional field may have a default value, but as stated before, we have to create `Nullable` columns so as not to break inserts of older records.
The connector rejects schema evolution in the following cases with a clear error message:
- **Non-nullable field without a default value** - ClickHouse requires new columns to be either `Nullable` or have a `DEFAULT`.
Please describe how this should be configured:
- the specific Avro/Protobuf schema options
- the sink configuration
- **STRUCT fields** - Mapping Connect STRUCT to ClickHouse is non-trivial (could be Tuple, JSON, or Nested). Not supported for auto-evolution.
In most cases JSON is used, and it is easy to add as an option.
- **Schemaless or string records** - No Connect schema is available to derive ClickHouse types. Evolution is skipped with a warning.
This should throw an error, and the configuration doc should state clearly that evolution is available only with a schema.
Schema evolution is safe to use with multiple connector tasks. `ADD COLUMN IF NOT EXISTS` is idempotent: if two tasks race to add the same column, both succeed silently. DDL statements are executed with [`alter_sync=1`](/sql-reference/statements/alter#synchronicity-of-alter-queries) to wait for the local replica to apply the change. A retry loop on `DESCRIBE TABLE` (5 retries, 200ms backoff) handles propagation to other replicas.
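The propagation handling can be modeled as a simple retry loop. This is an illustrative sketch, not the connector's actual code; the retry count and backoff match the values quoted above, and `describe_table` stands in for a `DESCRIBE TABLE` query returning the current column names.

```python
import time

def describe_with_retry(describe_table, column, retries=5, backoff_s=0.2):
    """Poll the table description until the new column is visible,
    retrying to allow the DDL to propagate to other replicas."""
    for _ in range(retries):
        columns = describe_table()
        if column in columns:
            return columns
        time.sleep(backoff_s)
    raise RuntimeError(
        f"column {column!r} did not appear after {retries} retries")
```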
#### Limitations {#schema-evolution-limitations}
This should be at the top. As you can see, I've already asked questions that are covered here.
@chernser made some changes here also, since some parts are modified in the other PR.
Related to:
Check issue and PR description for a summary