4 changes: 4 additions & 0 deletions docs/integrations/index.mdx
Original file line number Diff line number Diff line change
@@ -22,6 +22,10 @@ ClickHouse integrations are organized by their support level:
| Partner integrations | Built or maintained, and supported by, third-party software vendors |
| Community integrations | Built or maintained and supported by community members. No direct support is available besides the public GitHub repositories and community Slack channels |

:::tip Building a ClickHouse integration?
If you are building a product that connects to ClickHouse, start with the [Integration development](/integrations/integration-development) guides: [building integrations](/integrations/integration-development/building-integrations), [testing](/integrations/integration-development/testing-your-integration), and [documenting your product](/integrations/integration-development/documenting-your-integration). When the [partner portal](https://clickhouse.com/partners) is live, [sign up](https://clickhouse.com/partners) to register your integration.
:::

<IntegrationGrid />

:::note Notice
7 changes: 7 additions & 0 deletions docs/integrations/integration-development/_category_.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
position: 500
label: 'Integration development'
collapsible: true
collapsed: true
link:
type: doc
id: integrations/integration-development/index
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
---
slug: /integrations/integration-development/building-integrations
title: 'Building integrations with ClickHouse'
sidebar_label: 'Building integrations'
sidebar_position: 2
description: 'Orientation on ingestion, consumption, wire protocols, and client conventions for ClickHouse integrations.'
keywords: ['partner', 'integration', 'ingestion', 'consumption', 'ClickPipes', 'language clients', 'user-agent']
doc_type: 'guide'
---

# Building integrations with ClickHouse

This page orients you to the integration surface so you can scope ingestion and consumption work. For validation and publishing, continue with [Testing your integration](/integrations/integration-development/testing-your-integration) and [Documenting your integration](/integrations/integration-development/documenting-your-integration).

## Ingestion {#ingestion}

Two paths bring data into ClickHouse. Choose based on whether your product should own the ingestion plane or delegate it.

### Path A: ClickPipes (managed, ClickHouse Cloud only) {#path-a-clickpipes}

If you prefer not to build and operate ingestion infrastructure, [ClickPipes](/integrations/clickpipes) is the managed service that pulls from your customer's sources into their ClickHouse Cloud service. ClickPipes handles scaling, parallelization, retries, and lag reporting.

Supported sources today include:

- **Streaming:** Apache Kafka (including MSK, Confluent Cloud, Redpanda, Azure Event Hubs, WarpStream), Amazon Kinesis
- **Object storage:** Amazon S3 (and S3-compatible stores), Google Cloud Storage, Azure Blob Storage
- **CDC:** PostgreSQL, MySQL, MongoDB, BigQuery

### Path B: Self-driven ingestion via an official language client {#path-b-language-client}

If you own the pipeline, use one of the [official language clients](/integrations/language-clients). They handle serialization, batching, TLS, compression, and connection pooling. You pass runtime primitives; the client handles the wire format.

- Official clients: Python, Go, Java, JavaScript, Rust, C#, C++
- Wire protocols: HTTP and native TCP (native TCP is available in the Go and C++ clients)
- Auth: username and password over TLS by default; mTLS and SSL client-certificate auth are supported by all major clients
- Data format is usually an implementation detail. Clients convert runtime types to ClickHouse Native or RowBinary format. If you already produce Arrow, Parquet, JSONEachRow, or another format, most clients expose a raw-bytes API for pre-serialized data
- For throughput, batch **10,000 to 100,000 rows** per insert and treat roughly **one insert per second** as an upper bound for synchronous inserts. If client-side batching is impractical, use [asynchronous inserts](/optimize/asynchronous-inserts) to shift batching to the server

See also: [Bulk inserts](/optimize/bulk-inserts).
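
The batching guidance above can be sketched as a small buffer that flushes on a row-count or time threshold. This is a minimal sketch, not part of any official client: `RowBatcher`, `sink`, and the default thresholds are hypothetical names and values chosen for illustration, and `sink` stands in for whatever insert call your language client provides.

```python
import time
from typing import Callable, List, Sequence


class RowBatcher:
    """Buffer rows and hand them to `sink` in large batches.

    `sink` is a hypothetical callable that performs one insert; in a real
    integration it would wrap your language client's insert call.
    """

    def __init__(
        self,
        sink: Callable[[List[Sequence]], None],
        max_rows: int = 50_000,
        max_wait_seconds: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink
        self._max_rows = max_rows
        self._max_wait = max_wait_seconds
        self._clock = clock
        self._rows: List[Sequence] = []
        self._last_flush = clock()

    def add(self, row: Sequence) -> None:
        """Buffer one row; flush if the batch is large enough or old enough."""
        self._rows.append(row)
        too_big = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._last_flush >= self._max_wait
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Send buffered rows (if any) as a single batch and reset the timer."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._sink(batch)
        self._last_flush = self._clock()
```

The time threshold is only checked when a row arrives, so a quiet stream holds its rows until the next `add()` or an explicit `flush()`; call `flush()` on shutdown. Under sustained load the row threshold can trigger more than one flush per second, in which case raising `max_rows` keeps the insert rate down.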

## Consumption {#consumption}

HTTP and native TCP both carry queries. Native is binary and lower overhead. HTTP works through load balancers and proxies. Both are first-class; pick based on infrastructure, not feature gaps.

- **Application code:** use the same [official language clients](/integrations/language-clients) as for ingestion
- **BI and SQL tools:** ClickHouse ships an official [JDBC v2 driver](/integrations/java) (Java) and an [ODBC driver](/interfaces/odbc). Tableau, Looker, Power BI, Metabase, Apache Superset, and Grafana integrate via these drivers or dedicated connectors maintained by ClickHouse and partners
- **Result format:** clients typically own serialization. You can request Arrow, Parquet, or other columnar formats on the wire if your product needs them
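
As one illustration of requesting a columnar format on the wire, the sketch below builds (without sending) a request for the ClickHouse HTTP interface that asks for `ArrowStream` output through the `default_format` URL parameter. The function name, host, credentials, and default user-agent value are placeholders for illustration; an official client normally handles this for you.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(
    base_url: str,
    sql: str,
    user: str,
    password: str,
    result_format: str = "ArrowStream",
    user_agent: str = "my-analytics-app/3.1.2",
) -> Request:
    """Build (but do not send) a query request for the ClickHouse HTTP interface."""
    # default_format applies when the SQL text has no explicit FORMAT clause.
    query_string = urlencode({"default_format": result_format})
    url = f"{base_url.rstrip('/')}/?{query_string}"
    return Request(
        url,
        data=sql.encode("utf-8"),  # the query text travels in the POST body
        method="POST",
        headers={
            "X-ClickHouse-User": user,
            "X-ClickHouse-Key": password,
            "User-Agent": user_agent,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body is then a byte stream in the requested format that your product decodes with its own Arrow reader.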

### Result-set sizing {#result-set-sizing}

Most analytical queries return small result sets (aggregates, summaries, top-N), and the wire is rarely the bottleneck. ClickHouse tables can hold billions of rows, and an unbounded `SELECT *` over a large fact table can move terabytes. **Shape the request in your application:** use `LIMIT`, pagination, streaming reads, and explicit column lists. If you build user-facing analytics, treat unbounded result sets as a UX problem, not a transport problem.
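
Shaping the request can be sketched as a query builder that refuses `SELECT *`, always applies a `LIMIT`, and pages by key rather than by offset. This is a hypothetical helper under simplifying assumptions: the name `bounded_select`, the 10,000-row ceiling, and integer-only paging keys are illustrative choices, and production code should bind values as query parameters instead of formatting them into SQL.

```python
import re
from typing import Optional, Sequence

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def bounded_select(
    table: str,
    columns: Sequence[str],
    order_by: str,
    page_size: int = 1000,
    after: Optional[int] = None,
) -> str:
    """Return a SELECT with explicit columns, a LIMIT, and keyset pagination.

    `after` is the last `order_by` value seen on the previous page.
    """
    for name in (table, order_by, *columns):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if not columns:
        raise ValueError("list the columns you need instead of SELECT *")
    if not 0 < page_size <= 10_000:
        raise ValueError("page_size must be between 1 and 10000")
    where = f" WHERE {order_by} > {int(after)}" if after is not None else ""
    return (
        f"SELECT {', '.join(columns)} FROM {table}{where} "
        f"ORDER BY {order_by} LIMIT {page_size}"
    )
```

Paging on the sort key means each page only reads rows past the previous page's last key, whereas a growing `OFFSET` makes the server skip over all earlier rows on every request.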

ClickHouse has a rich type system: arrays, tuples, maps, JSON, nested, LowCardinality, and more. Official clients map these to idiomatic language types. If your product surfaces ClickHouse data to end users, plan a type-mapping strategy early.
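
A type-mapping strategy can start as a lookup from ClickHouse type names to the handful of display kinds your product supports. The sketch below is illustrative only: the kind names (`structured`, `number`, `temporal`, `text`) and both function names are hypothetical, and the mapping is deliberately incomplete so that unmapped types raise instead of being guessed.

```python
_WRAPPERS = ("Nullable(", "LowCardinality(")


def unwrap(ch_type: str) -> str:
    """Strip Nullable(...) and LowCardinality(...) wrappers from a type name."""
    ch_type = ch_type.strip()
    changed = True
    while changed:
        changed = False
        for prefix in _WRAPPERS:
            if ch_type.startswith(prefix) and ch_type.endswith(")"):
                ch_type = ch_type[len(prefix):-1].strip()
                changed = True
    return ch_type


def display_kind(ch_type: str) -> str:
    """Map a ClickHouse type name to a coarse, product-level display kind."""
    head = unwrap(ch_type).split("(", 1)[0]
    if head in ("Array", "Tuple", "Map", "Nested", "JSON"):
        return "structured"
    if head.startswith(("Int", "UInt", "Float", "Decimal")):
        return "number"
    if head.startswith("Date"):  # Date, Date32, DateTime, DateTime64
        return "temporal"
    if head in ("String", "FixedString", "UUID", "IPv4", "IPv6") or head.startswith("Enum"):
        return "text"
    # Fail loudly so gaps show up in testing, not in a customer dashboard.
    raise ValueError(f"no mapping for ClickHouse type {ch_type!r}")
```

Raising on unknown types is a design choice: it turns a silent misrender into an explicit gap you can list under known limits.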

## Next steps {#next-steps}

Pick a path and prototype against a [ClickHouse Cloud trial](https://clickhouse.com/cloud). When the [partner portal](https://clickhouse.com/partners) is available, register your integration there.

## User-agent string convention {#user-agent-string-convention}

HTTP clients should set a `User-Agent` string that identifies your integration. ClickHouse parses this server-side to track adoption, surface usage telemetry, and inform the roadmap.

Format:

```text
<app_name>/<app_version> <client_name>/<client_version> (<comment>; <key1>: <value1>; <key2>: <value2>)
```

Examples:

- `clickhouse-java/0.8.0`
- `my-analytics-app/3.1.2 clickhouse-js/1.2.0 (env: staging; region: us-east-1; lv: node/20.10)`

Rules:

- No whitespace in client name or version
- If you include a comment, it must come first inside the parentheses, before any key-value pairs
- Standard metadata keys: `lv` (language or framework version), `os`, `arch`
- Native TCP protocol clients report client name and version via protocol fields, not `User-Agent`
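
The format and rules above can be sketched as a small builder. `build_user_agent` is a hypothetical helper, not part of any ClickHouse client, and it assumes the no-whitespace rule applies to the application name and version as well as to the client name and version.

```python
from typing import Mapping, Optional


def build_user_agent(
    app_name: str,
    app_version: str,
    client_name: Optional[str] = None,
    client_version: Optional[str] = None,
    comment: Optional[str] = None,
    metadata: Optional[Mapping[str, str]] = None,
) -> str:
    """Assemble a User-Agent string in the documented layout."""
    products = [(app_name, app_version)]
    if client_name is not None and client_version is not None:
        products.append((client_name, client_version))
    for name, version in products:
        for token in (name, version):
            if not token or any(ch.isspace() for ch in token):
                raise ValueError(f"empty or whitespace-containing token: {token!r}")
    parts = [f"{name}/{version}" for name, version in products]
    # The free-text comment, when present, goes before the key-value pairs.
    details = [comment] if comment else []
    details += [f"{key}: {value}" for key, value in (metadata or {}).items()]
    if details:
        parts.append("(" + "; ".join(details) + ")")
    return " ".join(parts)
```

Called with `"my-analytics-app"`, `"3.1.2"`, `"clickhouse-js"`, `"1.2.0"`, and the metadata `{"env": "staging", "region": "us-east-1", "lv": "node/20.10"}`, it reproduces the second example string listed above.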

If you use JDBC, see [client identification](/integrations/language-clients/java/jdbc#client-identification) for how the driver sets `User-Agent` and related fields.

## Sandbox and trial access {#sandbox-and-trial-access}

[ClickHouse Cloud](https://clickhouse.com/cloud) offers a free trial for development and integration validation. If you are a House Mate partner, you can request additional development credits through the [partner portal](https://clickhouse.com/partners).
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
---
slug: /integrations/integration-development/documenting-your-integration
sidebar_label: 'Documenting your integration'
sidebar_position: 4
title: 'Documenting your ClickHouse integration'
description: 'How to contribute integration pages to clickhouse-docs, including required sections and a copy-paste skeleton.'
keywords: ['partner', 'integration', 'documentation', 'contributing', 'pull request', 'integration docs']
doc_type: 'guide'
---

# Documenting your ClickHouse integration

Integration documentation on this site gives end users one place to scope and troubleshoot setups. This page describes what to include, where files go, and how to open a pull request.

Start with [Building integrations](/integrations/integration-development/building-integrations) and [Testing your integration](/integrations/integration-development/testing-your-integration) if you have not already.

## Where docs live {#where-docs-live}

- **Repository:** [`ClickHouse/clickhouse-docs`](https://github.com/ClickHouse/clickhouse-docs)

Open question -- do we want partner docs in our docs? I initially assumed we'd be pushing docs back to the partner and referencing them from our integration pages and drill downs. @Amehla let us know how you were thinking about it. Either would work, these are OSS. It definitely gives us more CH eyes on them and an opportunity to validate the changes, as @mshustov suggests below.

- **Format:** Markdown, built with Docusaurus
- **Location:** `/docs/integrations/<category>/<your-integration>/`, where `<category>` reflects what your product does (`data-visualization`, `data-ingestion`, `language-clients`, and so on)
- **Process:** open a pull request against `main`. The ClickHouse integrations team reviews. First-time contributors sign the Contributor License Agreement when the bot prompts on the PR

Integration pages in this repository are the primary reference for end users. You can link to supplementary documentation on your site from your integration page for product-specific details.

Good exemplars: [Tableau](https://github.com/ClickHouse/clickhouse-docs/blob/main/docs/integrations/data-visualization/tableau/tableau-and-clickhouse.md) and [Metabase](/integrations/metabase).

## Choosing a category {#choosing-a-category}

Pick the category that best matches what your product does. Browse existing categories under [Integrations](/integrations) before you open a PR. If you are unsure, note your proposed category in the PR description and the integrations team will help place the page.

## Required sections {#required-sections}

Every integration page should cover the following, ideally in this order:

- **Purpose.** What problem the integration solves, in two or three sentences. Avoid marketing copy. Readers are usually engineers scoping a setup
- **Prerequisites and supported version matrix.** What the user needs installed and which versions you support for **both ClickHouse Cloud and self-hosted (open source)**. A small table works well
- **Setup walkthrough.** Step-by-step instructions to a working connection, with **side-by-side coverage of Cloud and self-hosted** where they differ (host, port, TLS)
- **Authentication.** Which auth modes you support (username and password over TLS at minimum, plus mTLS, SSL client certificates, and IP allow-list notes if relevant)
- **End-to-end example.** At least one realistic example from connection through a meaningful result. Use a [ClickHouse example dataset](/getting-started/example-datasets) so readers can reproduce it
- **Known limits and performance characteristics.** Type-system gaps, result-set thresholds, throughput notes, unsupported features. Honesty here saves support cycles
- **Troubleshooting.** Common errors and resolutions. Two or three frequent cases are enough for a first version

## Style notes {#style-notes}

- **Show both Cloud and self-hosted.** Cloud typically uses HTTPS on port `8443` and native TCP on `9440`. Self-hosted defaults to HTTP on `8123` and native TCP on `9000`
- **Use Docusaurus admonitions** (`:::note`, `:::warning`, `:::tip`) for callouts instead of bold paragraphs
- **Link out for depth.** Link to existing docs for data types, formats, JDBC, ClickPipes, and similar topics instead of re-explaining them
- **No marketing.** Integration pages here are technical reference. Promotional content belongs on your site; we can link to it from the partner directory

## Copy-paste skeleton {#copy-paste-skeleton}

Fill in the bracketed sections, save as `/docs/integrations/<category>/<your-integration>/index.md`, and open a PR.

```markdown
# [Your product] and ClickHouse

[One to three sentences: what the integration does and why a
ClickHouse user would want it.]

## Prerequisites

- [Your product, version X.Y or later]
- ClickHouse Cloud, or self-hosted ClickHouse version [X.Y] or later
- [Anything else: driver, plugin, network access requirements]

### Version matrix

| [Your product] | ClickHouse Cloud | ClickHouse open source | Notes |
| -------------- | ---------------- | ---------------------- | -------- |
| X.Y | ✅ | ✅ 24.x+ | [if any] |

## Setup

### Connect to ClickHouse Cloud

1. In the ClickHouse Cloud console, select your service and click **Connect**.
2. Choose **HTTPS**. Copy the host, port (8443), username, and password.
3. In [your product], [steps to configure the connection].

### Connect to self-hosted ClickHouse

1. [How to point at a self-hosted instance — host, port 8123 or 9000, TLS notes.]
2. In [your product], [steps to configure the connection].

## Authentication

[List supported auth modes — username/password over TLS, mTLS, etc. — and how
to configure each.]

## Example: querying the [dataset] dataset

[Walkthrough using one of the ClickHouse example datasets, end-to-end.]

## Known limits

- [Types not yet supported, e.g., deeply nested JSON]
- [Result-set size thresholds or other performance notes]
- [Feature gaps]

## Troubleshooting

### [Common error message]

[Cause and resolution.]

### [Another common error]

[Cause and resolution.]
```

## Review {#review}

The ClickHouse integrations team reviews PRs for technical accuracy, Cloud and self-hosted coverage, and docs style. Iterate in the PR until reviewers approve. That approval is the merge gate.
29 changes: 29 additions & 0 deletions docs/integrations/integration-development/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
---
slug: /integrations/integration-development
title: 'Integration development'
sidebar_label: 'Overview'
sidebar_position: 1
description: 'Guides for building, testing, and documenting ClickHouse integrations.'
keywords: ['integration development', 'build integration', 'partner', 'integration partner']
doc_type: 'landing-page'
---

# Integration development

These guides are for teams building a product that connects to ClickHouse. They cover the integration surface, how to validate your connector, and how to publish documentation on this site.

:::note Partner portal
A dedicated [partner portal](https://clickhouse.com/partners) is launching soon. Until then, use these pages to get started. [Sign up](https://clickhouse.com/partners) when the portal is available to register your integration.
:::

## Guides {#guides}

Read them in this order:

| Guide | What it covers |
| ----- | -------------- |
| [Building integrations](/integrations/integration-development/building-integrations) | Ingestion and consumption paths, wire protocols, clients, and user-agent conventions |
| [Testing your integration](/integrations/integration-development/testing-your-integration) | Deployment modes, datasets, type coverage, and what to report before review |
| [Documenting your integration](/integrations/integration-development/documenting-your-integration) | Required doc sections, style rules, and a copy-paste skeleton for your product page |

After you prototype and test, contribute your integration page under [`/docs/integrations/<category>/<your-integration>/`](/integrations/integration-development/documenting-your-integration) and open a pull request against [`clickhouse-docs`](https://github.com/ClickHouse/clickhouse-docs).
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
---
slug: /integrations/integration-development/testing-your-integration
sidebar_label: 'Testing your integration'
sidebar_position: 3
title: 'Testing your ClickHouse integration'
description: 'Entry-level validation matrix for integrations on ClickHouse Cloud and self-hosted open source.'
keywords: ['partner', 'integration', 'testing', 'validation', 'example datasets', 'ClickHouse Cloud', 'open source']
doc_type: 'guide'
---

# Testing your ClickHouse integration

Before you submit your integration for review, validate it against both ClickHouse deployment modes, using datasets that exercise ClickHouse's type system at meaningful scale. This page defines what "tested" means at the entry level. Formal validation is a separate process for partners progressing to higher partnership tiers.

See [Building integrations](/integrations/integration-development/building-integrations) for ingestion and consumption paths, and [Documenting your integration](/integrations/integration-development/documenting-your-integration) for how to publish your results.

## Test matrix {#test-matrix}

Cover both deployment modes. Most customers run one or the other, and behavior differs in places (auth, networking, available features).

- **ClickHouse Cloud:** sign up for a [free trial](https://clickhouse.com/cloud). No credit card is required for the development tier
- **Self-hosted (open source):** use the latest stable release from [GitHub releases](https://github.com/ClickHouse/ClickHouse/releases). The [install guide](/install) is the fastest path to a local instance with Docker

Test against both, and document any feature gaps in your integration page.

## What to test {#what-to-test}

**Functional correctness.** Exercise every code path your integration exposes: ingestion, querying, schema discovery, error handling, and reconnection. If your product surfaces SQL to end users, confirm that the queries your UI generates round-trip cleanly.

**Type-system coverage.** ClickHouse supports arrays, tuples, maps, JSON, nested, LowCardinality, Decimal, Date and DateTime variants, UUID, IPv4 and IPv6, enums, and aggregate-function types. Integrations often hit issues with nested arrays, deeply nested tuples, and JSON columns. Your client library and UI should handle these gracefully. At minimum, fail with a readable error instead of silently truncating or misrendering.
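
One way to exercise the types listed above is a single test table with one column per type family. The sketch below generates such a `CREATE TABLE` statement; the table name, column names, and specific type parameters are hypothetical choices, and the `JSON` column assumes a server version where the JSON type is available.

```python
# One column per type family named in the paragraph above.
TYPE_COVERAGE_COLUMNS = [
    ("id", "UInt64"),
    ("tags", "Array(String)"),
    ("matrix", "Array(Array(Int32))"),
    ("pair", "Tuple(String, Tuple(UInt8, Float64))"),
    ("attrs", "Map(String, UInt64)"),
    ("payload", "JSON"),  # assumption: requires a release that ships the JSON type
    ("items", "Nested(name String, qty UInt32)"),
    ("country", "LowCardinality(String)"),
    ("price", "Decimal(18, 4)"),
    ("day", "Date"),
    ("day32", "Date32"),
    ("ts", "DateTime"),
    ("ts_ms", "DateTime64(3, 'UTC')"),
    ("uid", "UUID"),
    ("ip4", "IPv4"),
    ("ip6", "IPv6"),
    ("status", "Enum8('new' = 1, 'done' = 2)"),
    ("uniques", "AggregateFunction(uniq, UInt64)"),
]


def create_table_sql(table: str = "integration_type_coverage") -> str:
    """Render a CREATE TABLE statement covering every listed type."""
    columns = ",\n".join(
        f"    {name} {ch_type}" for name, ch_type in TYPE_COVERAGE_COLUMNS
    )
    return f"CREATE TABLE {table}\n(\n{columns}\n)\nENGINE = MergeTree\nORDER BY id"
```

Running your integration's schema discovery and a plain read against this table on both deployment modes gives a quick list of which types render correctly, which fail with a readable error, and which need work.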

**Scale.** Test at result-set sizes and row counts your customers will run. For user-facing BI, that often means tables with hundreds of millions to billions of rows, and result sets from single aggregates to tens of thousands of rows. Unbounded reads (`SELECT *`) should fail predictably or paginate, not hang.
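
Failing predictably on an oversized read can be sketched as a row cap applied to a streaming result. `read_capped` and `ResultTooLarge` are hypothetical names; `rows` stands in for whatever row iterator your client exposes.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")
_END = object()  # sentinel that cannot collide with a real row


class ResultTooLarge(Exception):
    """Raised when a result set exceeds the configured row cap."""


def read_capped(rows: Iterable[T], max_rows: int) -> List[T]:
    """Return at most `max_rows` rows, or raise if the source has more."""
    iterator = iter(rows)
    head = list(islice(iterator, max_rows))
    # Pull one extra row to learn whether the cap was exceeded.
    if next(iterator, _END) is not _END:
        raise ResultTooLarge(
            f"result exceeds {max_rows} rows; add a LIMIT or narrow the filter"
        )
    return head
```

Because the cap reads at most `max_rows + 1` rows from the iterator, it returns promptly even when the underlying query is unbounded, which makes it a convenient assertion target in a scale test.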

**Authentication.** Validate at least one TLS-enabled connection. If you expose auth configuration, test every mode you document (username and password over TLS, mTLS, SSL client certificate).

**Connection lifecycle.** Confirm sane behavior on dropped connections, server restarts, and slow queries. Many escalations trace back to connection handling rather than query semantics.
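
A common pattern for dropped connections is a bounded retry with exponential backoff and jitter. The sketch below is one possible shape, not a prescribed implementation: `with_retries`, its defaults, and the choice of retryable exceptions are assumptions, and `operation` stands in for a call into your client.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying transient failures with capped backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out reconnects from many clients after a restart.
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Retrying reads is generally safe. Before retrying inserts, decide how your integration avoids or tolerates duplicate rows when the first attempt succeeded on the server but the acknowledgment was lost.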

## Recommended example datasets {#recommended-example-datasets}

The full set is at [Example datasets](/getting-started/example-datasets). These four cover most integration testing needs:

- **[GitHub events](/getting-started/example-datasets/github-events):** 3.1 billion rows with nested event payloads. Best for arrays, tuples, and nested types
- **[NYC taxi data](/getting-started/example-datasets/nyc-taxi):** billions of rows with a well-known schema. Good for throughput and read-path testing
- **[Stack Overflow](/getting-started/example-datasets/stackoverflow):** multi-table relational data for JOIN-heavy BI scenarios
- **[Hacker News](/getting-started/example-datasets/hacker-news):** 28 million rows, fast to load, useful for iteration

For extreme-scale validation, use **[WikiStat](/getting-started/example-datasets/wikistat)** (~0.5 trillion records).

## What to capture from your testing {#what-to-capture-from-your-testing}

When you submit your integration for review, share:

- ClickHouse versions tested (Cloud and open source)
- Datasets and approximate scale (rows, on-disk size)
- Types your integration handles and types it does not (this becomes the **Known limits** section of your docs)
- Performance characteristics worth flagging, such as result-set thresholds where behavior changes

A short test report saves review cycles. A paragraph plus a table is enough.
12 changes: 12 additions & 0 deletions sidebars.js
Original file line number Diff line number Diff line change
@@ -1249,6 +1249,18 @@ const sidebars = {
},
],
},
{
type: 'category',
label: 'Integration development',
collapsed: true,
collapsible: true,
link: { type: 'doc', id: 'integrations/integration-development/index' },
items: [
'integrations/integration-development/building-integrations',
'integrations/integration-development/testing-your-integration',
'integrations/integration-development/documenting-your-integration',
],
},
],

managingData: [