partner integration docs v0 #6262
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| position: 500 | ||
| label: 'Integration development' | ||
| collapsible: true | ||
| collapsed: true | ||
| link: | ||
| type: doc | ||
| id: integrations/integration-development/index |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| --- | ||
| slug: /integrations/integration-development/building-integrations | ||
| title: 'Building integrations with ClickHouse' | ||
| sidebar_label: 'Building integrations' | ||
| sidebar_position: 2 | ||
| description: 'Orientation on ingestion, consumption, wire protocols, and client conventions for ClickHouse integrations.' | ||
| keywords: ['partner', 'integration', 'ingestion', 'consumption', 'ClickPipes', 'language clients', 'user-agent'] | ||
| doc_type: 'guide' | ||
| --- | ||
|
|
||
| # Building integrations with ClickHouse | ||
|
|
||
| This page orients you to the integration surface so you can scope ingestion and consumption work. For validation and publishing, continue with [Testing your integration](/integrations/integration-development/testing-your-integration) and [Documenting your integration](/integrations/integration-development/documenting-your-integration). | ||
|
|
||
| ## Ingestion {#ingestion} | ||
|
|
||
| Two paths bring data into ClickHouse. Choose based on whether your product should own the ingestion plane or delegate it. | ||
|
|
||
| ### Path A: ClickPipes (managed, ClickHouse Cloud only) {#path-a-clickpipes} | ||
|
|
||
| If you prefer not to build and operate ingestion infrastructure, [ClickPipes](/integrations/clickpipes) is the managed service that pulls from your customer's sources into their ClickHouse Cloud service. ClickPipes handles scaling, parallelization, retries, and lag reporting. | ||
|
|
||
| Supported sources today include: | ||
|
|
||
| - **Streaming:** Apache Kafka (including MSK, Confluent Cloud, Redpanda, Azure Event Hubs, WarpStream), Amazon Kinesis | ||
| - **Object storage:** Amazon S3 (and S3-compatible stores), Google Cloud Storage, Azure Blob Storage | ||
| - **CDC:** PostgreSQL, MySQL, MongoDB, BigQuery | ||
|
|
||
| ### Path B: Self-driven ingestion via an official language client {#path-b-language-client} | ||
|
|
||
| If you own the pipeline, use one of the [official language clients](/integrations/language-clients). They handle serialization, batching, TLS, compression, and connection pooling. You pass runtime primitives; the client handles the wire format. | ||
|
|
||
| - Official clients: Python, Go, Java, JavaScript, Rust, C#, C++ | ||
| - Wire protocols: HTTP and native TCP (native TCP is available in the Go and C++ clients) | ||
| - Auth: username and password over TLS by default; mTLS and SSL client-certificate auth are supported by all major clients | ||
| - Data format is usually an implementation detail. Clients convert runtime types to ClickHouse Native or RowBinary format. If you already produce Arrow, Parquet, JSONEachRow, or another format, most clients expose a raw-bytes API for pre-serialized data | ||
| - For throughput, batch **10K–100K rows** and aim for roughly **one insert per second** as an upper bound for synchronous inserts. If client-side batching is impractical, use [asynchronous inserts](/optimize/asynchronous-inserts) to shift batching to the server | ||
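The batching guidance above can be sketched as a small client-side buffer. This is a minimal illustration, not the API of any official client: `RowBatcher` and its `send` callable are hypothetical names, and `send` stands in for whatever insert method your chosen client exposes. The defaults assume the 10K-100K row and one-insert-per-second figures quoted above.

```python
import time
from typing import Any, Callable, List, Sequence

Row = Sequence[Any]


class RowBatcher:
    """Buffers rows and hands them to `send` as one batch per flush.

    `send` is a hypothetical callable; replace it with the insert call
    of the client you actually use.
    """

    def __init__(
        self,
        send: Callable[[List[Row]], None],
        max_rows: int = 50_000,        # inside the 10K-100K guidance
        max_interval_s: float = 1.0,   # roughly one insert per second
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._max_rows = max_rows
        self._max_interval_s = max_interval_s
        self._clock = clock
        self._rows: List[Row] = []
        self._last_flush = clock()

    def add(self, row: Row) -> None:
        """Buffer one row; flush when the batch is full or old enough."""
        self._rows.append(row)
        too_big = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._last_flush >= self._max_interval_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Send buffered rows as a single batch, if there are any."""
        if self._rows:
            # Rows stay buffered if send() raises, so a retry can resend them.
            self._send(self._rows)
            self._rows = []
        self._last_flush = self._clock()
```

Call `flush()` on shutdown so a partial batch is not lost. If this kind of buffering does not fit your product, the asynchronous-insert option mentioned above moves the batching to the server instead.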
|
|
||
| See also: [Bulk inserts](/optimize/bulk-inserts). | ||
|
|
||
| ## Consumption {#consumption} | ||
|
|
||
| HTTP and native TCP both carry queries. Native is binary and lower overhead. HTTP works through load balancers and proxies. Both are first-class; pick based on infrastructure, not feature gaps. | ||
|
|
||
| - **Application code:** use the same [official language clients](/integrations/language-clients) as for ingestion | ||
| - **BI and SQL tools:** ClickHouse ships an official [JDBC v2 driver](/integrations/java) (Java) and an [ODBC driver](/interfaces/odbc). Tableau, Looker, Power BI, Metabase, Apache Superset, and Grafana integrate via these drivers or dedicated connectors maintained by ClickHouse and partners | ||
| - **Result format:** clients typically own serialization. You can request Arrow, Parquet, or other columnar formats on the wire if your product needs them | ||
|
|
||
| ### Result-set sizing {#result-set-sizing} | ||
|
|
||
| Most analytical queries return small result sets (aggregates, summaries, top-N), and the wire is rarely the bottleneck. ClickHouse tables can hold billions of rows, and an unbounded `SELECT *` over a large fact table can move terabytes. **Shape the request in your application:** use `LIMIT`, pagination, streaming reads, and explicit column lists. If you build user-facing analytics, treat unbounded result sets as a UX problem, not a transport problem. | ||
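One way to shape the request in the application is to refuse to build an unbounded query in the first place. The sketch below is an assumption-laden illustration: `build_page_query` is a hypothetical helper, the table and column names are invented, and it only accepts plain identifiers instead of attempting to quote arbitrary ones.

```python
import re
from typing import Sequence

# Accept only simple identifiers; anything else is rejected, not quoted.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_ident(name: str) -> str:
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def build_page_query(
    table: str,
    columns: Sequence[str],
    order_by: Sequence[str],
    page: int,
    page_size: int = 1000,
    max_page_size: int = 10_000,
) -> str:
    """Build a bounded SELECT with explicit columns, ORDER BY, LIMIT, OFFSET."""
    if not columns:
        raise ValueError("list columns explicitly instead of using SELECT *")
    if not order_by:
        raise ValueError("pagination needs a deterministic ORDER BY")
    if page < 0 or not 0 < page_size <= max_page_size:
        raise ValueError("page must be >= 0 and page_size within the cap")
    cols = ", ".join(_check_ident(c) for c in columns)
    order = ", ".join(_check_ident(c) for c in order_by)
    offset = page * page_size
    return (
        f"SELECT {cols} FROM {_check_ident(table)} "
        f"ORDER BY {order} LIMIT {page_size} OFFSET {offset}"
    )
```

Offset pagination is the simplest form to show; for deep pages over very large tables, keyset pagination or a streaming read may be a better fit.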
|
|
||
| ClickHouse has a rich type system: arrays, tuples, maps, JSON, nested, LowCardinality, and more. Official clients map these to idiomatic language types. If your product surfaces ClickHouse data to end users, plan a type-mapping strategy early. | ||
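A type-mapping strategy can start as a small lookup keyed on the ClickHouse type name. The sketch below is only an illustration: the display categories (`text`, `list`, and so on) and the function names are hypothetical, and the table covers a handful of types, not the full type system. It unwraps `Nullable(...)` and `LowCardinality(...)` and raises a readable error for anything it has no mapping for.

```python
import re
from typing import Tuple

_WRAPPERS = ("Nullable", "LowCardinality")

# Hypothetical display categories for a UI; extend to suit your product.
_CATEGORIES = {
    "String": "text", "FixedString": "text", "UUID": "text",
    "Bool": "boolean",
    "Date": "date", "Date32": "date",
    "DateTime": "timestamp", "DateTime64": "timestamp",
    "Array": "list", "Nested": "list",
    "Tuple": "struct", "Map": "map", "JSON": "json",
}
_INTEGER = re.compile(r"^U?Int\d+$")       # Int8 ... UInt256
_FLOAT = re.compile(r"^Float\d+$")         # Float32, Float64
_DECIMAL = re.compile(r"^Decimal(\d+)?$")  # Decimal, Decimal32, ...


def split_type(type_name: str) -> Tuple[str, str]:
    """Split 'Array(String)' into ('Array', 'String'); args may be empty."""
    head, sep, rest = type_name.strip().partition("(")
    if not sep:
        return head, ""
    if not rest.endswith(")"):
        raise ValueError(f"malformed ClickHouse type: {type_name!r}")
    return head, rest[:-1]


def display_category(type_name: str) -> Tuple[str, bool]:
    """Return (category, nullable) for a ClickHouse type name."""
    nullable = False
    base, args = split_type(type_name)
    while base in _WRAPPERS:
        nullable = nullable or base == "Nullable"
        base, args = split_type(args)
    if base in _CATEGORIES:
        return _CATEGORIES[base], nullable
    if _INTEGER.match(base):
        return "integer", nullable
    if _FLOAT.match(base):
        return "float", nullable
    if _DECIMAL.match(base):
        return "decimal", nullable
    raise ValueError(f"no display mapping for ClickHouse type {type_name!r}")
```

Raising on unknown types is a deliberate choice in this sketch: an explicit error is easier to support than a column that renders incorrectly.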
|
|
||
| ## Next steps {#next-steps} | ||
|
|
||
| Pick a path and prototype against a [ClickHouse Cloud trial](https://clickhouse.com/cloud). When the partner portal is available, register your integration there. | ||
|
|
||
| ## User-agent string convention {#user-agent-string-convention} | ||
|
|
||
| HTTP clients should set a `User-Agent` string that identifies your integration. ClickHouse parses this server-side to track adoption, surface usage telemetry, and inform the roadmap. | ||
|
|
||
| Format: | ||
|
|
||
| ```text | ||
| <app_name>/<app_version> <client_name>/<client_version> (<comment>; <key1>: <value1>; <key2>: <value2>) | ||
| ``` | ||
|
|
||
| Examples: | ||
|
|
||
| - `clickhouse-java/0.8.0` | ||
| - `my-analytics-app/3.1.2 clickhouse-js/1.2.0 (env: staging; region: us-east-1; lv: node/20.10)` | ||
|
|
||
| Rules: | ||
|
|
||
| - No whitespace in client name or version | ||
| - If you include a comment, it must come first inside the parentheses, before any key-value pairs | ||
| - Standard metadata keys: `lv` (language or framework version), `os`, `arch` | ||
| - TCP and native protocol clients report client name and version via protocol fields, not `User-Agent` | ||
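The format and rules above can be sketched as a small builder. `build_user_agent` is a hypothetical helper, not part of any official client, and it assumes the no-whitespace rule applies to the application name and version as well as to the client name and version.

```python
from typing import Mapping, Optional


def _product(name: str, version: str) -> str:
    """Return 'name/version', rejecting empty parts and whitespace."""
    for part in (name, version):
        if not part or any(ch.isspace() for ch in part):
            raise ValueError(f"empty or whitespace in product token: {part!r}")
    return f"{name}/{version}"


def build_user_agent(
    app_name: str,
    app_version: str,
    client_name: str,
    client_version: str,
    comment: Optional[str] = None,
    metadata: Optional[Mapping[str, str]] = None,
) -> str:
    """Build '<app>/<ver> <client>/<ver> (<comment>; <key>: <value>; ...)'."""
    products = (
        f"{_product(app_name, app_version)} "
        f"{_product(client_name, client_version)}"
    )
    details = []
    if comment:
        details.append(comment)  # the comment comes before key-value pairs
    details.extend(f"{key}: {value}" for key, value in (metadata or {}).items())
    if not details:
        return products
    joined = "; ".join(details)
    return f"{products} ({joined})"
```

Called with the values from the second example above (`env`, `region`, and `lv` passed as `metadata`), it reproduces that string. Set the result as the `User-Agent` header of your HTTP requests, or pass it through whatever option your client exposes for that purpose.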
|
|
||
| If you use JDBC, see [client identification](/integrations/language-clients/java/jdbc#client-identification) for how the driver sets `User-Agent` and related fields. | ||
|
|
||
| ## Sandbox and trial access {#sandbox-and-trial-access} | ||
|
|
||
| [ClickHouse Cloud](https://clickhouse.com/cloud) offers a free trial for development and integration validation. If you are a House Mate partner, you can request additional development credits through the [partner portal](https://clickhouse.com/partners). | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,114 @@ | ||
| --- | ||
| slug: /integrations/integration-development/documenting-your-integration | ||
| sidebar_label: 'Documenting your integration' | ||
| sidebar_position: 4 | ||
| title: 'Documenting your ClickHouse integration' | ||
| description: 'How to contribute integration pages to clickhouse-docs, including required sections and a copy-paste skeleton.' | ||
| keywords: ['partner', 'integration', 'documentation', 'contributing', 'pull request', 'integration docs'] | ||
| doc_type: 'guide' | ||
| --- | ||
|
|
||
| # Documenting your ClickHouse integration | ||
|
|
||
| Integration documentation on this site gives end users one place to scope and troubleshoot setups. This page describes what to include, where files go, and how to open a pull request. | ||
|
|
||
| Start with [Building integrations](/integrations/integration-development/building-integrations) and [Testing your integration](/integrations/integration-development/testing-your-integration) if you have not already. | ||
|
|
||
| ## Where docs live {#where-docs-live} | ||
|
|
||
| - **Repository:** [`ClickHouse/clickhouse-docs`](https://github.com/ClickHouse/clickhouse-docs) | ||
|
Contributor
Open question: do we want partner docs in our docs? I initially assumed we'd be pushing docs back to the partner and referencing them from our integration pages and drill-downs. @Amehla, let us know how you were thinking about it. Either would work, since these are OSS. Hosting them here definitely gives us more CH eyes on them and an opportunity to validate the changes, as @mshustov suggests below.
||
| - **Format:** Markdown, built with Docusaurus | ||
| - **Location:** `/docs/integrations/<category>/<your-integration>/`, where `<category>` reflects what your product does (`data-visualization`, `data-ingestion`, `language-clients`, and so on) | ||
| - **Process:** open a pull request against `main`. The ClickHouse integrations team reviews. First-time contributors sign the Contributor License Agreement when the bot prompts on the PR | ||
|
|
||
| Integration pages in this repository are the primary reference for end users. You can link to supplementary documentation on your site from your integration page for product-specific details. | ||
|
|
||
| Good exemplars: [Tableau](https://github.com/ClickHouse/clickhouse-docs/blob/main/docs/integrations/data-visualization/tableau/tableau-and-clickhouse.md) and [Metabase](/integrations/metabase). | ||
|
|
||
| ## Choosing a category {#choosing-a-category} | ||
|
|
||
| Pick the category that best matches what your product does. Browse existing categories under [Integrations](/integrations) before you open a PR. If you are unsure, note your proposed category in the PR description and the integrations team will help place the page. | ||
|
|
||
| ## Required sections {#required-sections} | ||
|
|
||
| Every integration page should cover the following, ideally in this order: | ||
|
|
||
| - **Purpose.** What problem the integration solves, in two or three sentences. Avoid marketing copy. Readers are usually engineers scoping a setup | ||
| - **Prerequisites and supported version matrix.** What the user needs installed and which versions you support for **both ClickHouse Cloud and self-hosted (open source)**. A small table works well | ||
| - **Setup walkthrough.** Step-by-step instructions to a working connection, with **side-by-side coverage of Cloud and self-hosted** where they differ (host, port, TLS) | ||
| - **Authentication.** Which auth modes you support (username and password over TLS at minimum, plus mTLS, SSL client cert, IP allow-list notes if relevant) | ||
| - **End-to-end example.** At least one realistic example from connection through a meaningful result. Use a [ClickHouse example dataset](/getting-started/example-datasets) so readers can reproduce it | ||
| - **Known limits and performance characteristics.** Type-system gaps, result-set thresholds, throughput notes, unsupported features. Honesty here saves support cycles | ||
| - **Troubleshooting.** Common errors and resolutions. Two or three frequent cases are enough for a first version | ||
|
|
||
| ## Style notes {#style-notes} | ||
|
|
||
| - **Show both Cloud and self-hosted.** Cloud typically uses HTTPS on port `8443` and native TCP on `9440`. Self-hosted defaults to `8123` for HTTP and `9000` for native TCP | ||
| - **Use Docusaurus admonitions** (`:::note`, `:::warning`, `:::tip`) for callouts instead of bold paragraphs | ||
| - **Link out for depth.** Link to existing docs for data types, formats, JDBC, ClickPipes, and similar topics instead of re-explaining them | ||
| - **No marketing.** Integration pages here are technical reference. Promotional content belongs on your site; we can link to it from the partner directory | ||
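When an integration page shows both deployment modes, it can help to keep the default ports in one place. The sketch below only encodes the defaults listed in the style notes; `default_port` and `http_url` are hypothetical helpers, and it assumes Cloud is always HTTPS and self-hosted is plain HTTP, which will not hold for a self-hosted instance with TLS enabled.

```python
from typing import Dict, Tuple

# Defaults from the style notes; a given deployment may be configured differently.
DEFAULT_PORTS: Dict[Tuple[str, str], int] = {
    ("cloud", "http"): 8443,          # HTTPS
    ("cloud", "native"): 9440,        # native TCP with TLS
    ("self-hosted", "http"): 8123,    # HTTP
    ("self-hosted", "native"): 9000,  # native TCP
}


def default_port(deployment: str, protocol: str) -> int:
    """Look up the default port for a deployment mode and wire protocol."""
    try:
        return DEFAULT_PORTS[(deployment, protocol)]
    except KeyError:
        raise ValueError(
            f"unknown deployment/protocol: {deployment!r}, {protocol!r}"
        ) from None


def http_url(host: str, deployment: str) -> str:
    """Build a base URL (assumes HTTPS for Cloud, plain HTTP for self-hosted)."""
    scheme = "https" if deployment == "cloud" else "http"
    return f"{scheme}://{host}:{default_port(deployment, 'http')}"
```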
|
|
||
| ## Copy-paste skeleton {#copy-paste-skeleton} | ||
|
|
||
| Fill in the bracketed sections, save as `/docs/integrations/<category>/<your-integration>/index.md`, and open a PR. | ||
|
|
||
| ```markdown | ||
| # [Your product] and ClickHouse | ||
|
|
||
| [One to three sentences: what the integration does and why a | ||
| ClickHouse user would want it.] | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - [Your product, version X.Y or later] | ||
| - ClickHouse Cloud, or self-hosted ClickHouse version [X.Y] or later | ||
| - [Anything else: driver, plugin, network access requirements] | ||
|
|
||
| ### Version matrix | ||
|
|
||
| | [Your product] | ClickHouse Cloud | ClickHouse open source | Notes | | ||
| | -------------- | ---------------- | ---------------------- | -------- | | ||
| | X.Y | ✅ | ✅ 24.x+ | [if any] | | ||
|
|
||
| ## Setup | ||
|
|
||
| ### Connect to ClickHouse Cloud | ||
|
|
||
| 1. In the ClickHouse Cloud console, select your service and click **Connect**. | ||
| 2. Choose **HTTPS**. Copy the host, port (8443), username, and password. | ||
| 3. In [your product], [steps to configure the connection]. | ||
|
|
||
| ### Connect to self-hosted ClickHouse | ||
|
|
||
| 1. [How to point at a self-hosted instance — host, port 8123 or 9000, TLS notes.] | ||
| 2. In [your product], [steps to configure the connection]. | ||
|
|
||
| ## Authentication | ||
|
|
||
| [List supported auth modes — username/password over TLS, mTLS, etc. — and how | ||
| to configure each.] | ||
|
|
||
| ## Example: querying the [dataset] dataset | ||
|
|
||
| [Walkthrough using one of the ClickHouse example datasets, end-to-end.] | ||
|
|
||
| ## Known limits | ||
|
|
||
| - [Types not yet supported, e.g., deeply nested JSON] | ||
| - [Result-set size thresholds or other performance notes] | ||
| - [Feature gaps] | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### [Common error message] | ||
|
|
||
| [Cause and resolution.] | ||
|
|
||
| ### [Another common error] | ||
|
|
||
| [Cause and resolution.] | ||
| ``` | ||
|
|
||
| ## Review {#review} | ||
|
|
||
| The ClickHouse integrations team reviews PRs for technical accuracy, Cloud and self-hosted coverage, and docs style. Iterate in the PR until reviewers approve. That approval is the merge gate. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| --- | ||
| slug: /integrations/integration-development | ||
| title: 'Integration development' | ||
| sidebar_label: 'Overview' | ||
| sidebar_position: 1 | ||
| description: 'Guides for building, testing, and documenting ClickHouse integrations.' | ||
| keywords: ['integration development', 'build integration', 'partner', 'integration partner'] | ||
| doc_type: 'landing-page' | ||
| --- | ||
|
|
||
| # Integration development | ||
|
|
||
| These guides orient you if you build a product that connects to ClickHouse. They cover the integration surface, how to validate your connector, and how to publish documentation on this site. | ||
|
|
||
| :::note Partner portal | ||
| A dedicated [partner portal](https://clickhouse.com/partners) is launching soon. Until then, use these pages to get started. When the portal is available, [sign up](https://clickhouse.com/partners) to register your integration. | ||
| ::: | ||
|
|
||
| ## Guides {#guides} | ||
|
|
||
| Read them in this order: | ||
|
|
||
| | Guide | What it covers | | ||
| | ----- | -------------- | | ||
| | [Building integrations](/integrations/integration-development/building-integrations) | Ingestion and consumption paths, wire protocols, clients, and user-agent conventions | | ||
| | [Testing your integration](/integrations/integration-development/testing-your-integration) | Deployment modes, datasets, type coverage, and what to report before review | | ||
| | [Documenting your integration](/integrations/integration-development/documenting-your-integration) | Required doc sections, style rules, and a PR skeleton for your product page | | ||
|
|
||
| After you prototype and test, contribute your integration page under [`/docs/integrations/<category>/<your-integration>/`](/integrations/integration-development/documenting-your-integration) and open a pull request against [`clickhouse-docs`](https://github.com/ClickHouse/clickhouse-docs). |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| --- | ||
| slug: /integrations/integration-development/testing-your-integration | ||
| sidebar_label: 'Testing your integration' | ||
| sidebar_position: 3 | ||
| title: 'Testing your ClickHouse integration' | ||
| description: 'Entry-level validation matrix for integrations on ClickHouse Cloud and self-hosted open source.' | ||
| keywords: ['partner', 'integration', 'testing', 'validation', 'example datasets', 'ClickHouse Cloud', 'open source'] | ||
| doc_type: 'guide' | ||
| --- | ||
|
|
||
| # Testing your ClickHouse integration | ||
|
|
||
| Before you submit your integration for review, validate it against both ClickHouse deployment modes and against datasets that exercise ClickHouse's type system at meaningful scale. This page defines what "tested" means at the entry level. Formal validation is a separate process for partners progressing to higher partnership tiers. | ||
|
|
||
| See [Building integrations](/integrations/integration-development/building-integrations) for ingestion and consumption paths, and [Documenting your integration](/integrations/integration-development/documenting-your-integration) for how to publish your results. | ||
|
|
||
| ## Test matrix {#test-matrix} | ||
|
|
||
| Cover both deployment modes. Most customers run one or the other, and behavior differs in places (auth, networking, available features). | ||
|
|
||
| - **ClickHouse Cloud:** sign up for a [free trial](https://clickhouse.com/cloud). No credit card is required for the development tier | ||
| - **Self-hosted (open source):** use the latest stable release from [GitHub releases](https://github.com/ClickHouse/ClickHouse/releases). The [install guide](/install) is the fastest path to a local instance with Docker | ||
|
|
||
| Test against both, and document any feature gaps in your integration page. | ||
|
|
||
| ## What to test {#what-to-test} | ||
|
|
||
| **Functional correctness.** Exercise every code path your integration exposes: ingestion, querying, schema discovery, error handling, and reconnection. If your product surfaces SQL to end users, confirm that the queries your UI generates round-trip cleanly. | ||
|
|
||
| **Type-system coverage.** ClickHouse supports arrays, tuples, maps, JSON, nested, LowCardinality, Decimal, Date and DateTime variants, UUID, IPv4 and IPv6, enums, and aggregate-function types. Integrations often hit issues with nested arrays, deeply nested tuples, and JSON columns. Your client library and UI should handle these gracefully. At minimum, fail with a readable error instead of silently truncating or misrendering. | ||
|
|
||
| **Scale.** Test at result-set sizes and row counts your customers will run. For user-facing BI, that often means tables with hundreds of millions to billions of rows, and result sets from single aggregates to tens of thousands of rows. Unbounded reads (`SELECT *`) should fail predictably or paginate, not hang. | ||
|
|
||
| **Authentication.** Validate at least one TLS-enabled connection. If you expose auth configuration, test every mode you document (username and password over TLS, mTLS, SSL client certificate). | ||
|
|
||
| **Connection lifecycle.** Confirm sane behavior on dropped connections, server restarts, and slow queries. Many escalations trace back to connection handling rather than query semantics. | ||
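One common way to get sane behavior on dropped connections is a bounded retry with exponential backoff and jitter. The sketch below is generic and makes assumptions: `call_with_retry` is a hypothetical helper, `operation` stands in for a query or reconnect call from your client, and the exception types to retry on depend on what that client actually raises.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_attempts: int = 5,
    base_delay_s: float = 0.5,
    max_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on the given errors with capped backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Double the delay each attempt, cap it, then apply jitter.
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

Only wrap operations that are safe to repeat. A retried read is usually harmless, while a retried insert may need deduplication on your side.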
|
|
||
| ## Recommended example datasets {#recommended-example-datasets} | ||
|
|
||
| The full set is at [Example datasets](/getting-started/example-datasets). These four cover most integration testing needs: | ||
|
|
||
| - **[GitHub events](/getting-started/example-datasets/github-events):** 3.1B rows with nested event payloads. Best for arrays, tuples, and nested types | ||
| - **[NYC taxi data](/getting-started/example-datasets/nyc-taxi):** billions of rows with a well-known schema. Good for throughput and read-path testing | ||
| - **[Stack Overflow](/getting-started/example-datasets/stackoverflow):** multi-table relational data for JOIN-heavy BI scenarios | ||
| - **[Hacker News](/getting-started/example-datasets/hacker-news):** 28M rows, fast to load, useful for iteration | ||
|
|
||
| For extreme-scale validation, use **[WikiStat](/getting-started/example-datasets/wikistat)** (~0.5 trillion records). | ||
|
|
||
| ## What to capture from your testing {#what-to-capture-from-your-testing} | ||
|
|
||
| When you submit your integration for review, share: | ||
|
|
||
| - ClickHouse versions tested (Cloud and open source) | ||
| - Datasets and approximate scale (rows, on-disk size) | ||
| - Types your integration handles and types it does not (this becomes the **Known limits** section of your docs) | ||
| - Performance characteristics worth flagging, such as result-set thresholds where behavior changes | ||
|
|
||
| A short test report saves review cycles. A paragraph plus a table is enough. | ||