diff --git a/docs/alert.md b/docs/alert.md index c6541753b..01dc94c0f 100644 --- a/docs/alert.md +++ b/docs/alert.md @@ -2,7 +2,13 @@ Timeplus provides out-of-box charts and dashboards. You can also create [sinks](/destination) to send downsampled data to Kafka or other message buses, or notify others via email/slack. You can even send new messages to Kafka, then consume such messages timely in the downstream system. This could be a solution for alerting and automation.
-Since it's a common use case to define and manage alerts, Timeplus started supporting alerts out-of-box.
+Since it's a common use case to define and manage alerts, Timeplus supports alerting out of the box.
+
+:::warning
+Starting from Timeplus Enterprise v2.9, the alerting feature will be provided by the core SQL engine, with increased performance and stability, as well as SQL-based manageability.
+
+The previous alerting feature will be deprecated in future releases.
+:::
## Create New Alert Rule
diff --git a/docs/append-stream.md b/docs/append-stream.md new file mode 100644 index 000000000..04588e6f3 --- /dev/null +++ b/docs/append-stream.md @@ -0,0 +1,7 @@
+# Append Stream
+
+By default, the streams in Timeplus are Append Streams:
+* They are designed to handle a continuous flow of data, where new events are added to the end of the stream.
+* The data is saved in columnar storage, optimized for high-throughput, low-latency reads and writes.
+* Older data can be purged automatically by setting a retention policy, which helps manage storage costs and keeps the stream size manageable.
+* They have limited capabilities to update or delete existing data, as the primary focus is on appending new data.
diff --git a/docs/proton-architecture.md b/docs/architecture.md similarity index 90% rename from docs/proton-architecture.md rename to docs/architecture.md index d19809991..fb2cf6928 100644 --- a/docs/proton-architecture.md +++ b/docs/architecture.md @@ -2,15 +2,15 @@ ## High Level Architecture
-The following diagram depicts the high level architecture of Proton.
+The following diagram depicts the high-level architecture of the Timeplus SQL engine, starting with a single-node deployment.
-![Proton Architecture](/img/proton-high-level-arch.gif)
+![Architecture](/img/proton-high-level-arch.gif)
All of the components / functionalities are built into one single binary.
## Data Storage
-Users can create a stream by using `CREATE STREAM ...` [DDL SQL](/proton-create-stream). Every stream has 2 parts at storage layer by default:
+Users can create a stream by using `CREATE STREAM ...` [DDL SQL](/sql-create-stream). Every stream has 2 parts at storage layer by default:
1. the real-time streaming data part, backed by Timeplus NativeLog
2. the historical data part, backed by ClickHouse historical data store.
diff --git a/docs/singlenode_install.md b/docs/bare-metal-install.md similarity index 61% rename from docs/singlenode_install.md rename to docs/bare-metal-install.md index 30c40978f..1ffc262ee 100644 --- a/docs/singlenode_install.md +++ b/docs/bare-metal-install.md @@ -1,12 +1,15 @@ -# Single Node Install +# Deploy on Bare Metal
-Timeplus Enterprise can be easily installed on a single node, with or without Docker.
+Timeplus Enterprise can be easily installed on bare metal Linux or MacOS, as a single node or a multi-node cluster.
-## Bare Metal Install{#bare_metal}
+## Single Node Install
### Install Script
-If your server or computer is running Linux or MacOS, you can run the following command to download the package and start Timeplus Enterprise without any other dependencies. For Windows users, please follow [our guide](#docker) for running Timeplus Enterprise with Docker.
+If your server or computer is running Linux or MacOS, you can run the following command to download the package and start Timeplus Enterprise without any other dependencies.
+:::info
+For Windows users, please follow [our guide](#docker) for running Timeplus Enterprise with Docker.
+:::
```shell
curl https://install.timeplus.com | sh
```
@@ -17,8 +20,9 @@ This script will download the latest release (based on your operating system and
If you'd like to download the package for a certain feature release, you can run the following command:
```shell
-curl https://install.timeplus.com/2.7 | sh
+curl https://install.timeplus.com/2.8 | sh
```
+Replace `2.8` with the desired version number.
### Manual Install
You can also download packages manually with the following links:
@@ -50,6 +54,12 @@ It is also possible to only start/stop single process by running `timeplus start
For more information, please check the [CLI Reference](/cli-reference).
+### Upgrade {#upgrade}
+To upgrade Timeplus Enterprise, run `timeplus stop` to stop all the services. Then replace all the binaries with those from the newer Timeplus Enterprise release and run `timeplus start`.
+
+### Uninstall {#uninstall}
+Timeplus Enterprise has no external dependencies. Just run `timeplus stop`, then delete the timeplus folder.
+
## Docker Install{#docker}
Alternatively, run the following command to start Timeplus Enterprise with [Docker](https://www.docker.com/get-started/):
@@ -79,13 +89,46 @@ This stack demonstrates how to run streaming ETL, getting data from Kafka API, a
* [Tutorial – Streaming ETL: Kafka to Kafka](/tutorial-sql-etl)
* [Tutorial – Streaming ETL: Kafka to ClickHouse](/tutorial-sql-etl-kafka-to-ch)
+## Cluster Install
+Timeplus Enterprise can be installed in multi-node cluster mode for high availability and horizontal scalability. This page shares how to install a Timeplus cluster on bare metal. Please refer to [this guide](/k8s-helm) to deploy Timeplus Enterprise on Kubernetes.
+
+### Multi-Node Cluster
+
+There are multiple ways to set up a cluster without Kubernetes. One easy solution is to run all components on one node, and run only timeplusd on the rest of the nodes. For other deployment options, please contact [support](mailto:support@timeplus.com) or message us in our [Slack Community](https://timeplus.com/slack).
+
+Choose one node as the lead node, say its hostname is `timeplus-server1`. Stop all services via the `timeplus stop` command. Then configure environment variables.
+```bash
+export ADVERTISED_HOST=timeplus-server1
+export METADATA_NODE_QUORUM=timeplus-server1:8464,timeplus-server2:8464,timeplus-server3:8464
+export TIMEPLUSD_REPLICAS=3
+```
+Then run `timeplus start` to start all services, including timeplusd, timeplus_web, timeplus_appserver and timeplus_connector.
+
+On the second node, first make sure all services are stopped via `timeplus stop`.
+Then configure environment variables.
+```bash
+export ADVERTISED_HOST=timeplus-server2
+export METADATA_NODE_QUORUM=timeplus-server1:8464,timeplus-server2:8464,timeplus-server3:8464
+```
+Then run `timeplus start -s timeplusd` to start only the timeplusd service.
+ +Similarly on the third node, set `export ADVERTISED_HOST=timeplus-server3` and the same `METADATA_NODE_QUORUM` and only start timeplusd. + +### Single-Host Cluster {#single-host-cluster} + +Starting from [Timeplus Enterprise v2.7](/enterprise-v2.7), you can also easily setup multiple timeplusd processes on the same host by running the `timeplusd server` with `node-index` option. This is useful for testing multi-node cluster. +```bash +./timeplusd server --node-index=1 +./timeplusd server --node-index=2 +./timeplusd server --node-index=3 +``` + +Timeplusd will automatically bind to different ports for each node. You can run `timeplusd client` to connect to one node and check the status of the cluster via: +```sql +SELECT * FROM system.cluster +``` + ## License Management{#license} When you start Timeplus Enterprise and access the web console for the first time, the 30-day free trial starts. When it ends, the software stops working. Please check [the guide](/server_config#license) to update licenses. - -## Upgrade {#upgrade} -To upgrade Timeplus Enterprise, run `timeplus stop` to stop all the services. Then replace all the binaries to the higher version of Timeplus Enterprise release and then run `timeplus start`. - -## Uninstall {#uninstall} -Timeplus Enterprise has no external dependencies. Just run `timeplus stop` then delete the timeplus folder. diff --git a/docs/changelog-stream.md b/docs/changelog-stream.md index 66654efe9..c42416baa 100644 --- a/docs/changelog-stream.md +++ b/docs/changelog-stream.md @@ -1,4 +1,4 @@ -# Changelog Stream +# Changelog Key Value Stream When you create a stream with the mode `changelog_kv`, the data in the stream is no longer append-only. When you query the stream directly, only the latest version for the same primary key(s) will be shown. Data can be updated or deleted. You can use Changelog Stream in JOIN either on the left or on the right. Timeplus will automatically choose the latest version. @@ -17,7 +17,7 @@ In this example, you create a stream `dim_products` in `changelog_kv` mode with :::info -The rest of this page assumes you are using Timeplus Console. If you are using Proton, you can create the stream with DDL. [Learn more](/proton-create-stream#changelog-stream) +The rest of this page assumes you are using Timeplus Console. If you are using Proton, you can create the stream with DDL. ::: @@ -403,7 +403,7 @@ Debezium also read all existing rows and generate messages like this ### Load data to Timeplus -You can follow this [guide](/kafka-source) to add 2 data sources to load data from Kafka or Redpanda. For example: +You can follow this [guide](/proton-kafka) to add 2 external streams to load data from Kafka or Redpanda. For example: * Data source name `s1` to load data from topic `doc.public.dim_products` and put in a new stream `rawcdc_dim_products` * Data source name `s2` to load data from topic `doc.public.orders` and put in a new stream `rawcdc_orders` diff --git a/docs/cluster_install.md b/docs/cluster_install.md deleted file mode 100644 index 22b28ebe5..000000000 --- a/docs/cluster_install.md +++ /dev/null @@ -1,61 +0,0 @@ -# Cluster Install -Timeplus Enterprise can be installed in multi-node cluster mode for high availability and horizontal scalability. - -Both bare metal and Kubernetes are supported. - -## Bare Metal Install - -### Single Node -Follow the guide in [Single Node Install](/singlenode_install) to grab the bare metal package and install on each node. 
- -### Multi-Node Cluster - -There are multiple ways to setup a cluster without Kubernetes. One easy solution is to run all components in one node, and the rest of nodes running the timeplusd only. For other deployment options, please contact [support](mailto:support@timeplus.com) or message us in our [Slack Community](https://timeplus.com/slack). - -Choose one node as the lead node, say its hostname is `timeplus-server1`. Stop all services via `timeplus stop` command. Then configure environment variables. -```bash -export ADVERTISED_HOST=timeplus-server1 -export METADATA_NODE_QUORUM=timeplus-server1:8464,timeplus-server2:8464,timeplus-server3:8464 -export TIMEPLUSD_REPLICAS=3 -``` -Then run `timeplus start` to start all services, including timeplusd, timeplus_web, timeplus_appserver and timeplus_connector. - -On the second node, first make sure all services are stopped via `timeplus stop`. -Then configure environment variables. -```bash -export ADVERTISED_HOST=timeplus-server2 -export METADATA_NODE_QUORUM=timeplus-server1:8464,timeplus-server2:8464,timeplus-server3:8464 -``` -Then run `timeplus start -s timeplusd` to only start timeplusd services. - -Similarly on the third node, set `export ADVERTISED_HOST=timeplus-server3` and the same `METADATA_NODE_QUORUM` and only start timeplusd. - -### Single-Host Cluster {#single-host-cluster} - -Starting from [Timeplus Enterprise v2.7](/enterprise-v2.7), you can also easily setup multiple timeplusd processes on the same host by running the `timeplusd server` with `node-index` option. This is useful for testing multi-node cluster. -```bash -./timeplusd server --node-index=1 -./timeplusd server --node-index=2 -./timeplusd server --node-index=3 -``` - -Timeplusd will automatically bind to different ports for each node. You can run `timeplusd client` to connect to one node and check the status of the cluster via: -```sql -SELECT * FROM system.cluster -``` - -## Kubernetes Install {#k8s} - -You can also deploy Timeplus Enterprise on a Kubernetes cluster with [Helm](https://helm.sh/). - -### Prerequisites -* Ensure you have Helm 3.12 + installed in your environment. For details about how to install Helm, see the [Helm documentation](https://helm.sh/docs/intro/install/) -* Ensure you have [Kubernetes](https://kubernetes.io/) 1.25 or higher installed in your environment -* Ensure you have allocated enough resources for the deployment - -### Deploy Timeplus Enterprise with Helm - -Follow the [guide](/k8s-helm) to deploy Timeplus Enterprise on Kubernetes with Helm. - -## License Management -To activate or add new a license, please follow [our guide](/server_config#license). diff --git a/docs/compare.md b/docs/compare.md new file mode 100644 index 000000000..df8de5d95 --- /dev/null +++ b/docs/compare.md @@ -0,0 +1,15 @@ +# Timeplus Proton vs. Timeplus Enterprise + +Timeplus Proton powers unified streaming and data processing on a single database node. Its commercial counterpart, Timeplus Enterprise, supports advanced deployment strategy and includes enterprise-ready features. 
+ +| Features | **Timeplus Proton** | **Timeplus Enterprise** | +| ----------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Deployment** | | | +| **Data Processing** | | | +| **External Systems** | | | +| **Web Console** | | | +| **Support** | | | + +These details are subject to change, but we'll do our best to make sure they accurately represent the latest roadmaps for Timeplus Proton and Timeplus Enterprise. + +[Contact us](mailto:info@timeplus.com) for more details or schedule a demo. diff --git a/docs/confluent-cloud-source.md b/docs/confluent-cloud-source.md deleted file mode 100644 index e4258c0cd..000000000 --- a/docs/confluent-cloud-source.md +++ /dev/null @@ -1,21 +0,0 @@ -# Load streaming data from Confluent Cloud - -We are pleased to partner with [Confluent Cloud](https://www.confluent.io/confluent-cloud/?utm_campaign=tm.pmm_cd.2023_partner_cwc_timeplus_generic&utm_source=timeplus&utm_medium=partnerref), allowing you to easily connect your streaming data via [external streams](/external-stream) without moving data. - -## Video Tutorial - -We recorded a video to explain the details. - - - -## Detailed Steps - -1. From the left side navigation menu, click **Data Collection**. Here, you’ll see ways to connect a source or external stream. Click **Confluent Cloud** (external stream). -2. Enter the bootstrap UR for your Confluent Cloud cluster, and set the Kafka API key and secret. Click **Next**. -3. Enter the name of the Kafka topic, and specify the ‘read as’ data format. We currently support JSON, Avro, Protobuf, Text and other formats. - 1. If the data in the Kafka topic is in JSON format, but the schema may change over time, we recommend you choose Text. This way, the entire JSON document will be saved as a string, and you can apply JSON related functions to extract value, even if the schema changes. - 2. If you choose Avro, there is an option for 'Auto Extraction'. By default, this is toggled off, meaning the entire message will be saved as a string. If you toggle it on, then the top level attribute in the AVRO message will be put into different columns. This would be more convenient for you to query, but won't support schema evolution. When Avro is selected, you also need to specify the address, API key, and secret key for the schema registry. - 3. If you choose Protobuf, please paste the entire Protobuf definition in, and specify the root message name. -4. In the next “Preview” step, we will show you at least one event from your specified Confluent Cloud source. -5. By default, your new source will create a new stream in Timeplus. Give this new stream a name and verify the columns information (column name and data type). You can also set a column as the event time column. If you don’t, we will use the ingestion time as the event time. Alternatively, you can select an existing stream from the dropdown. -6. After previewing your data, you can give the source a name and an optional description, and review the configuration. Once you click Finish, your streaming data will be available in the specified stream immediately. 
diff --git a/docs/enterprise-releases.md b/docs/enterprise-releases.md deleted file mode 100644 index 8beb8b127..000000000 --- a/docs/enterprise-releases.md +++ /dev/null @@ -1,3 +0,0 @@ -# Timeplus Enterprise Releases - -The page is moved to [/release-notes](/release-notes). diff --git a/docs/enterprise-v2.5.md b/docs/enterprise-v2.5.md index 499dd0a0a..266d059c9 100644 --- a/docs/enterprise-v2.5.md +++ b/docs/enterprise-v2.5.md @@ -15,9 +15,9 @@ Key highlights of this release: * Connecting to various input or output systems via Redpanda Connect. [Learn more](/redpanda-connect). * Creating and managing users in the Web Console. You can change the password and assign the user either Administrator or Read-only role. * New [migrate](/cli-migrate) subcommand in [timeplus CLI](/cli-reference) for data migration and backup/restore. -* Materialized views auto-rebalancing in the cluster mode. [Learn more](/proton-create-view#auto-balancing). +* Materialized views auto-rebalancing in the cluster mode. [Learn more](/view#auto-balancing). * Approximately 30% faster data ingestion and replication in the cluster mode. -* Performance improvement for [ASOF JOIN](/joins) and [EMIT ON UPDATE](/query-syntax#emit_on_update). +* Performance improvement for [ASOF JOIN](/joins) and [EMIT ON UPDATE](/streaming-aggregations#emit_on_update). ## Supported OS {#os} |Deployment Type| OS | @@ -125,9 +125,9 @@ Compared to the [2.4.23](/enterprise-v2.4#2_4_23) release: * new type of [External Streams for Apache Pulsar](/pulsar-external-stream). * for bare metal installation, previously you can login with the username `default` with empty password. To improve the security, this user has been removed. * enhancement for nullable data types in streaming and historical queries. - * Materialized views auto-rebalancing in the cluster mode.[Learn more](/proton-create-view#auto-balancing). + * Materialized views auto-rebalancing in the cluster mode.[Learn more](/view#auto-balancing). * Approximately 30% faster data ingestion and replication in the cluster mode. - * Performance improvement for [ASOF JOIN](/joins) and [EMIT ON UPDATE](/query-syntax#emit_on_update). + * Performance improvement for [ASOF JOIN](/joins) and [EMIT ON UPDATE](/streaming-aggregations#emit_on_update). * timeplus_web 1.4.33 -> 2.0.6 * UI to add/remove user or change role and password. This works for both single node and cluster. * UI for inputs/outputs from Redpanda Connect. diff --git a/docs/enterprise-v2.6.md b/docs/enterprise-v2.6.md index 0f9183bfb..b6993a136 100644 --- a/docs/enterprise-v2.6.md +++ b/docs/enterprise-v2.6.md @@ -245,7 +245,7 @@ Compared to the [2.5.12](/enterprise-v2.5#2_5_12) release: * timeplusd 2.4.27 -> 2.5.10 * Performance Enhancements: * Introduced hybrid hash table technology for streaming SQL with JOINs or aggregations. Configure via `SETTINGS default_hash_table='hybrid'` to optimize memory usage for large data streams. - * Improved performance for [EMIT ON UPDATE](/query-syntax#emit_on_update) queries. Memory optimization available through `SETTINGS optimize_aggregation_emit_on_updates=false`. + * Improved performance for [EMIT ON UPDATE](/streaming-aggregations#emit_on_update) queries. Memory optimization available through `SETTINGS optimize_aggregation_emit_on_updates=false`. * Enhanced read/write performance for ClickHouse external tables with configurable `pooled_connections` setting (default: 3000). 
* Monitoring and Management: * Added [system.stream_state_log](/system-stream-state-log) and [system.stream_metric_log](/system-stream-metric-log) system streams for comprehensive resource monitoring. @@ -253,7 +253,7 @@ Compared to the [2.5.12](/enterprise-v2.5#2_5_12) release: * A `_tp_sn` column is added to each stream (except external streams or random streams), as the sequence number in the unified streaming and historical storage. This column is used for data replication among the cluster. By default, it is hidden in the query results. You can show it by setting `SETTINGS asterisk_include_tp_sn_column=true`. This setting is required when you use `INSERT..SELECT` SQL to copy data between streams: `INSERT INTO stream2 SELECT * FROM stream1 SETTINGS asterisk_include_tp_sn_column=true`. * New Features: * Support for continuous data writing to remote Timeplus deployments via setting a [Timeplus external stream](/timeplus-external-stream) as the target in a materialized view. - * New [EMIT PERIODIC .. REPEAT](/query-syntax#emit_periodic_repeat) syntax for emitting the last aggregation result even when there is no new event. + * New [EMIT PERIODIC .. REPEAT](/streaming-aggregations#emit_periodic_repeat) syntax for emitting the last aggregation result even when there is no new event. * Able to create or drop databases via SQL in a cluster. The web console will be enhanced to support different databases in the next release. * Historical data of a stream can be removed by `TRUNCATE STREAM stream_name`. * Able to add new columns to a stream via `ALTER STREAM stream_name ADD COLUMN column_name data_type`, in both a single node or multi-node cluster. diff --git a/docs/enterprise-v2.7.md b/docs/enterprise-v2.7.md index 638ebd1ea..f80825fbb 100644 --- a/docs/enterprise-v2.7.md +++ b/docs/enterprise-v2.7.md @@ -124,7 +124,7 @@ Component versions: Compared to the [2.7.5](#2_7_5) release: * timeplusd 2.7.37 -> 2.7.45 * added new setting [mv_preferred_exec_node](/sql-create-materialized-view#mv_preferred_exec_node) while creating materialized view - * added new EMIT policy `EMIT ON UPDATE WITH DELAY`. The SQL syntax for EMIT has been refactored. [Learn more](/query-syntax#emit) + * added new EMIT policy `EMIT ON UPDATE WITH DELAY`. The SQL syntax for EMIT has been refactored. [Learn more](/streaming-aggregations#emit) * fixed global aggregation with `EMIT ON UPDATE` in multi-shard environments * fixed concurrency issues in hybrid aggregation * support incremental checkpoints for hybrid hash join @@ -300,7 +300,7 @@ Compared to the [2.6.0](/enterprise-v2.6#2_6_0) release: * **Delete data:** You can now delete data from streams with the [DELETE](/sql-delete) SQL command. This is optimized for mutable streams with primary keys in the condition. * `SYSTEM UNPAUSE MATERIALIZED VIEW` command is renamed to [SYSTEM RESUME MATERIALIZED VIEW](/sql-system-resume). * Able to configure `license_key_path` and `license_file_path` in the `server/config.yaml` file to specify the license key without web console interaction. - * Introduced a simple way to setup multiple timeplusd processes on the same host by running the `timeplusd server --node-index=1` command. [Learn more](/cluster_install#single-host-cluster) + * Introduced a simple way to setup multiple timeplusd processes on the same host by running the `timeplusd server --node-index=1` command. 
[Learn more](/bare-metal-install#single-host-cluster) * To improve performance, we have optimized the schema for [system.stream_metric_log](/system-stream-metric-log) and [system.stream_state_log](/system-stream-state-log). * Security Enhancements: * **Support IAM authentication for accessing Amazon MSK:** Avoid storing static credentials in Kafka external streams by setting `sasl_mechanism` to `AWS_MSK_IAM`. diff --git a/docs/enterprise-v2.8.md b/docs/enterprise-v2.8.md index 556558d70..67a2412fd 100644 --- a/docs/enterprise-v2.8.md +++ b/docs/enterprise-v2.8.md @@ -11,7 +11,7 @@ Each component maintains its own version numbers. The version number for each Ti ## Key Highlights Key highlights of this release: -* New Compute Node server role to [run materialized views elastically](/proton-create-view#autoscaling_mv) with checkpoints on S3 storage. +* New Compute Node server role to [run materialized views elastically](/view#autoscaling_mv) with checkpoints on S3 storage. * Timeplus can read or write data in Apache Iceberg tables. [Learn more](/iceberg) * Timeplus can read or write PostgreSQL tables directly via [PostgreSQL External Table](/pg-external-table) or look up data via [dictionaries](/sql-create-dictionary#source_pg). * Use S3 as the [tiered storage](/tiered-storage) for streams. @@ -62,9 +62,9 @@ Compared to the [2.8.0 (Preview)](#2_8_0) release: * When using `CREATE OR REPLACE FORMAT SCHEMA` to update an existing schema, and using `DROP FORMAT SCHEMA` to delete a schema, Timeplus will clean up the Protobuf schema cache to avoid misleading errors. * Support writing Kafka message timestamp via [_tp_time](/proton-kafka#_tp_time) * Enable IPv6 support for KeyValueService - * Simplified the [EMIT syntax](/query-syntax#emit) to make it easier to read and use. - * Support [EMIT ON UPDATE WITH DELAY](/query-syntax#emit_on_update_with_delay) - * Support [EMIT ON UPDATE](/query-syntax#emit_on_update) for multiple shards + * Simplified the [EMIT syntax](/streaming-aggregations#emit) to make it easier to read and use. + * Support [EMIT ON UPDATE WITH DELAY](/streaming-aggregations#emit_on_update_with_delay) + * Support [EMIT ON UPDATE](/streaming-aggregations#emit_on_update) for multiple shards * Transfer leadership to preferred node after election * Pin materialized view execution node [Learn more](/sql-create-materialized-view#mv_preferred_exec_node) * Improve async checkpointing diff --git a/docs/eventtime.md b/docs/eventtime.md deleted file mode 100644 index 3d188e279..000000000 --- a/docs/eventtime.md +++ /dev/null @@ -1,54 +0,0 @@ -# _tp_time (Event time) - -## All data with an event time - -Streams are where data live and each data contains a `_tp_time` column as the event time. Timeplus takes this attribute as one important identity of an event. - -Event time is used to identify when the event is generated, like a birthday to a human being. It can be the exact timestamp when the order is placed, when the user logins a system, when an error occurs, or when an IoT device reports its status. If there is no suitable timestamp attribute in the event, Timeplus will generate the event time based on the data ingestion time. - -By default, the `_tp_time` column is in `datetime64(3, 'UTC')` type with millisecond precision. You can also create it in `datetime` type with second precision. - -When you are about to create a new stream, please choose the right column as the event time. 
If no column is specified, then Timeplus will use the current timestamp as the value of `_tp_time` It's not recommended to rename a column as \_tp_time at the query time, since it will lead to unexpected behaviour, specially for [Time Travel](/usecases#s-time-travel). - -## Why event time is treated differently - -Event time is used almost everywhere in Timeplus data processing and analysis workflow: - -- while doing time window based aggregations, such as [tumble](/functions_for_streaming#tumble) or [hop](/functions_for_streaming#hop) to get the downsampled data or outlier in each time window, Timeplus will use the event time to decide whether certain event belongs to a specific window -- in such time sensitive analytics, event time is also used to identify out of order events or late events, and drop them in order to get timely streaming insights. -- when one data stream is joined with the other, event time is the key to collate the data, without expecting two events to happen in exactly the same millisecond. -- event time also plays an important role to device how long the data will be kept in the stream - -## How to specify the event time - -### Specify during data ingestion - -When you [ingest data](/ingestion) into Timeplus, you can specify an attribute in the data which best represents the event time. Even if the attribute is in `String` type, Timeplus will automatically convert it to a timestamp for further processing. - -If you don't choose an attribute in the wizard, then Timeplus will use the ingestion time to present the event time, i.e. when Timeplus receives the data. This may work well for most static or dimensional data, such as city names with zip codes. - -### Specify during query - -The [tumble](/functions_for_streaming#tumble) or [hop](/functions_for_streaming#hop) window functions take an optional parameter as the event time column. By default, we will use the event time in each data. However you can also specify a different column as the event time. - -Taking an example for taxi passengers. The data stream can be - -| car_id | user_id | trip_start | trip_end | fee | -| ------ | ------- | ------------------- | ------------------- | --- | -| c001 | u001 | 2022-03-01 10:00:00 | 2022-03-01 10:30:00 | 45 | - -The data may come from a Kafka topic. When it's configured, we may set `trip_end` as the (default) event time. So that if we want to figure out how many passengers in each hour, we can run query like this - -```sql -select count(*) from tumble(taxi_data,1h) group by window_end -``` - -This query uses `trip_end` , the default event time, to run the aggregation. If the passenger ends the trip on 00:01 at midnight, it will be included in the 00:00-00:59 time window. - -In some cases, you as the analyst, may want to focus on how many passengers get in the taxi, instead of leaving the taxi, in each hour, then you can set `trip_start` as the event time for the query via `tumble(taxi_data,trip_start,1h)` - -Full query: - -```sql -select count(*) from tumble(taxi_data,trip_start,1h) group by window_end -``` diff --git a/docs/external-stream.md b/docs/external-stream.md index a2c681e51..9a59abd43 100644 --- a/docs/external-stream.md +++ b/docs/external-stream.md @@ -1,12 +1,13 @@ - - # External Stream -You can also create **External Streams** in Timeplus to query data in the external systems without loading the data into Timeplus. The main benefit for doing so is to keep a single source of truth in the external systems (e.g. Apache Kafka), without duplicating them. 
In many cases, this can also achieve even lower latency to process Kafka or Pulsar data, because the data is read directly by Timeplus core engine, without other components, such as Redpanda Connect or [Airbyte](https://airbyte.com/connectors/timeplus).
+You can create **External Streams** in Timeplus to query data in the external systems without loading the data into Timeplus. The main benefit of doing so is to keep a single source of truth in the external systems (e.g. Apache Kafka), without duplicating them. In many cases, this can also achieve even lower latency to process Kafka or Pulsar data, because the data is read directly by the Timeplus core engine, without other components, such as Redpanda Connect or [Airbyte](https://airbyte.com/connectors/timeplus).
You can run streaming analytics with the external streams in the similar way as other streams, with [some limitations](/proton-kafka#limitations).
-Timeplus supports 3 types of external streams:
+Timeplus supports 4 types of external streams:
* [Kafka External Stream](/proton-kafka)
+* [Pulsar External Stream](/pulsar-external-stream)
* [Timeplus External Stream](/timeplus-external-stream), only available in Timeplus Enterprise
-* Log External Stream (experimental)
+* [Log External Stream](/log-stream) (experimental)
+
+Besides external streams, Timeplus also provides external tables to query data in ClickHouse, MySQL, Postgres or S3/Iceberg. The difference between external tables and external streams is that external tables are not real-time, and they are not designed for streaming analytics. You can use external tables to query data in the external systems, but you cannot run streaming SQL on them. [Learn more about external tables](/proton-clickhouse-external-table).
diff --git a/docs/proton-howto.md b/docs/faq.md similarity index 81% rename from docs/proton-howto.md rename to docs/faq.md index e7915c963..be0df032c 100644 --- a/docs/proton-howto.md +++ b/docs/faq.md @@ -1,37 +1,7 @@
-# Timeplus Proton
+# Timeplus Enterprise FAQ
-## How to install Proton {#install}
+This document provides answers to frequently asked questions about Timeplus Enterprise, including its features, usage, and troubleshooting.
-Proton can be installed as a single binary on Linux or Mac, via:
-
-```shell
-curl https://install.timeplus.com/oss | sh
-```
-
-Once the `proton` binary is available, you can run Timeplus Proton in different modes:
-
-- **Local Mode.** You run `proton local` to start it for fast processing on local and remote files using SQL without having to install a full server
-- **Config-less Mode.** You run `proton server` to start the server and put the config/logs/data in the current folder `proton-data`. Then use `proton client` in the other terminal to start the SQL client.
-- **Server Mode.** You run `sudo proton install` to install the server in predefined path and a default configuration file. Then you can run `sudo proton server -C /etc/proton-server/config.yaml` to start the server and use `proton client` in the other terminal to start the SQL client.
-
-For Mac users, you can also use [Homebrew](https://brew.sh/) to manage the install/upgrade/uninstall:
-
-```shell
-brew tap timeplus-io/timeplus
-brew install proton
-```
-
-You can also install Proton in Docker, Docker Compose or Kubernetes. 
- -```bash -docker run -d --pull always -p 8123:8123 -p 8463:8463 --name proton d.timeplus.com/timeplus-io/proton:latest -``` - -Please check [Server Ports](/proton-ports) to determine which ports to expose, so that other tools can connect to Timeplus, such as DBeaver. - -The [Docker Compose stack](https://github.com/timeplus-io/proton/tree/develop/examples/ecommerce) demonstrates how to read/write data in Kafka/Redpanda with external streams. - -Running the single node Proton via Kubernetes is possible. We recommend you [contact us](mailto:support@timeplus.com) to deploy Timeplus Enterprise for on-prem deployment. ## How to read/write Kafka or Redpanda {#kafka} @@ -88,7 +58,7 @@ CREATE STREAM stream SETTINGS event_time_column = 'timestamp'; ``` -Please note there will be the 4th column in the stream, which is \_tp_time as the [Event Time](/eventtime). +Please note there will be the 4th column in the stream, which is \_tp_time as the [Event Time](/glossary#event_time). To import CSV content, use the [file](https://clickhouse.com/docs/en/sql-reference/table-functions/file) table function to set the file path and header and data types. @@ -149,3 +119,24 @@ The following drivers are available: - https://github.com/timeplus-io/proton-java-driver JDBC and other Java clients - https://github.com/timeplus-io/proton-go-driver for Golang - https://github.com/timeplus-io/proton-python-driver for Python + + +## Get the number of failed materialized views via prometheus {#failed_mv} + +Follow the [Prometheus Integration](/prometheus) to access the metrics endpoint of timeplusd. + +You can use `TimeplusdMaterializedView_QueryStatus` metrics to check the status code of the materialized views. +``` +Initializing = 0, +CheckingDependencies = 1, +BuildingPipeline = 2, +ExecutingPipeline = 3, + +Error = 4, +Suspended = 5, +Paused = 6, + +AutoRecovering = 10, +Resuming = 11, +Recovering = 12, +``` diff --git a/docs/functions_for_agg.md b/docs/functions_for_agg.md index 0bb0e7c4b..1347cfc02 100644 --- a/docs/functions_for_agg.md +++ b/docs/functions_for_agg.md @@ -1,4 +1,4 @@ -# Aggregation +# Aggregation Functions ### count diff --git a/docs/functions_for_streaming.md b/docs/functions_for_streaming.md index 94bfbda36..a06b4297e 100644 --- a/docs/functions_for_streaming.md +++ b/docs/functions_for_streaming.md @@ -1,4 +1,4 @@ -# Streaming Processing +# Streaming Processing Functions The following functions are supported in streaming query, but not all of them support historical query. Please check the tag like this. diff --git a/docs/getting-started.md b/docs/getting-started.md deleted file mode 100644 index 2a0eb3f2f..000000000 --- a/docs/getting-started.md +++ /dev/null @@ -1,34 +0,0 @@ -# Getting started - -This tutorial guides you how to load data into Timeplus and run analytics queries over the data. To perform this tutorial, you need an active Timeplus account. - -## Add Data - -To help you quickly get started, we setup each workspace with the demo dataset. Please check the schema and common queries on [Demo Scenario](/usecases) page. You can explore and query the streams right away. - -Of course, you can load your own data, such as - -* [Upload a CSV file](/ingestion#streamgen) -* [Create a Kafka source](/ingestion#kafka) to load JSON documents from Confluent Cloud or Apache Kafka cluster. - -## Explore Data - -Open the **QUERY** page. You will see a list of data streams. Clicking on any of it will generate the `select * from ..` in the query editor. 
You can click the **RUN QUERY** button to run a [streaming tail](/query-syntax#streaming-tailing) for the incoming data. The recent 10 rows of results will be shown. More importantly, you can see the top values for each column and overall trend. This live view will give you a good understanding of incoming data structure and sample value. - -To add some filter conditions or change other parts of the query, you can either click the **CANCEL QUERY** button or use the **+** button on the top to open a new query tab. - -## Query Data - -SQL is the most common tool for data analysts. Timeplus supports powerful yet easy-to-use [query syntax](/query-syntax) and [functions](/functions). You can also follow the samples in [Demo Scenario](/usecases) to query data. - -## Visualize Data - -You can click the **VISUALIZATION** tab to turn a streaming query to a streaming chart with high FPS (frame-per-second). Choose the time column as X-axis and choose the numeric column with an aggregation method. You can add the chart to your home page. Out of box streaming dashboards will be added to Timeplus soon. - -In the meanwhile, it's possible to leverage other tools to visualize insights from Timeplus. Please contact us if you want to learn the details. - -## Send Data Out - -Streaming insights or downsampled data can be set to another Kafka topic, or notify certain users via email or slack. Run a streaming query, then click the arrow icon. You can choose one of the four destinations: Slack, Email, Kafka, Webhook. - -Sending data to other systems, such as Snowflake, is possible. Please contact us if you want to learn the details. diff --git a/docs/glossary.md b/docs/glossary.md index 83452e577..51b293d80 100644 --- a/docs/glossary.md +++ b/docs/glossary.md @@ -1,14 +1,14 @@ -# Key Concepts +# Key Terms and Concepts -This page lists key terms and concepts in Timeplus. Please check the sub-pages for more details. +This page lists key terms and concepts in Timeplus, from A to Z. ## bookmark {#bookmark} -Query bookmarks, only available in Timeplus Enterprise, not in Timeplus Proton. +Query bookmarks or SQL scripts, only available in Timeplus Enterprise, not in Timeplus Proton. You can save the common SQL statements as bookmarks. They can be run quickly in the web console by a single click. You can create, list, edit, remove bookmarks in the query page. -Both bookmarks and [views](/glossary#view) can help you easily re-run a query. However views are defined in the streaming database and you can query the view directly via `select .. from ..` But bookmarks are just UI shortcuts. When you click the bookmark, the original SQL statement will be pre-filled in the query console. You cannot run `select .. from my_bookmark` +Both bookmarks and [views](/glossary#view) can help you easily re-run a query. However views are defined in the Timeplus SQL engine and you can query the view directly via `select .. from ..` But bookmarks are just UI shortcuts. When you click the bookmark, the original SQL statement will be pre-filled in the query console. You cannot run `select .. from my_bookmark` @@ -22,23 +22,77 @@ CTEs can be thought of as alternatives to derived tables ([subquery](https://en. Only available in Timeplus Enterprise, not in Timeplus Proton. -You can create multiple dashboards in a workspace, and add multiple charts to a dashboard. You can also add [filters](/viz#filter) or Markdown (experimental). +You can create multiple dashboards, and add multiple charts to a dashboard. 
You can also add [filters](/viz#filter) or Markdown (experimental).
-## event time
+## event time {#event_time}
-Event time is used to identify when the event is generated, like a birthday to a human being. It can be the exact timestamp when the order is placed, when the user logins a system, when an error occurs, or when an IoT device reports its status. If no suitable timestamp attribute in the event, Timeplus will generate the event time based on the data ingestion time.
+Each row in Timeplus streams contains a `_tp_time` column as the event time. Timeplus treats this attribute as an important identity of an event.
-Learn more: [Event time](/eventtime)
+Event time is used to identify when the event is generated, like a birthday to a human being. It can be the exact timestamp when the order is placed, when the user logs into a system, when an error occurs, or when an IoT device reports its status. If there is no suitable timestamp attribute in the event, Timeplus will generate the event time based on the data ingestion time.
-## generator {#generator}
+By default, the `_tp_time` column is in `datetime64(3, 'UTC')` type with millisecond precision. You can also create it in `datetime` type with second precision.
+
+When you are about to create a new stream, please choose the right column as the event time. If no column is specified, then Timeplus will use the current timestamp as the value of `_tp_time`. It's not recommended to rename a column as \_tp_time at query time, since it will lead to unexpected behaviour, especially for [Time Travel](/usecases#s-time-travel).
+
+Event time is used almost everywhere in the Timeplus data processing and analysis workflow:
+
+- while doing time window based aggregations, such as [tumble](/functions_for_streaming#tumble) or [hop](/functions_for_streaming#hop) to get the downsampled data or outliers in each time window, Timeplus will use the event time to decide whether a certain event belongs to a specific window
+- in such time-sensitive analytics, event time is also used to identify out-of-order or late events, and drop them in order to get timely streaming insights.
+- when one data stream is joined with another, event time is the key to collate the data, without expecting two events to happen in exactly the same millisecond.
+- event time also plays an important role in deciding how long the data will be kept in the stream
+
+### How to specify the event time
+
+#### Specify during data ingestion
+
+When you [ingest data](/ingestion) into Timeplus, you can specify an attribute in the data which best represents the event time. Even if the attribute is in `String` type, Timeplus will automatically convert it to a timestamp for further processing.
+
+If you don't choose an attribute in the wizard, then Timeplus will use the ingestion time to represent the event time, i.e. when Timeplus receives the data. This may work well for most static or dimensional data, such as city names with zip codes.
+
+#### Specify during query
+
+The [tumble](/functions_for_streaming#tumble) or [hop](/functions_for_streaming#hop) window functions take an optional parameter as the event time column. By default, we will use the event time of each event. However, you can also specify a different column as the event time.
+
+Take taxi passengers as an example. 
The data stream can be + +| car_id | user_id | trip_start | trip_end | fee | +| ------ | ------- | ------------------- | ------------------- | --- | +| c001 | u001 | 2022-03-01 10:00:00 | 2022-03-01 10:30:00 | 45 | + +The data may come from a Kafka topic. When it's configured, we may set `trip_end` as the (default) event time. So that if we want to figure out how many passengers in each hour, we can run query like this + +```sql +select count(*) from tumble(taxi_data,1h) group by window_end +``` + +This query uses `trip_end` , the default event time, to run the aggregation. If the passenger ends the trip on 00:01 at midnight, it will be included in the 00:00-00:59 time window. + +In some cases, you as the analyst, may want to focus on how many passengers get in the taxi, instead of leaving the taxi, in each hour, then you can set `trip_start` as the event time for the query via `tumble(taxi_data,trip_start,1h)` + +Full query: + +```sql +select count(*) from tumble(taxi_data,trip_start,1h) group by window_end +``` -Only available in Timeplus Enterprise, not in Timeplus Proton. -Learn more [Streaming Generator](/stream-generator) +## external stream {#external_stream} + +You can create external streams to read or write data from/to external systems in the streaming way. Timeplus supports external streams to Apache Kafka, Apache Pulsar, and other streaming data platforms. + +Learn more: [External Stream](/external-stream) + +## external table {#external_table} +You can create external tables to read or write data from/to external systems in the non-streaming way. Timeplus supports external tables to ClickHouse, PostgreSQL, MySQL, etc. + ## materialized view {#mview} -A special view that is kept running in the background and persistent the query results in an internal stream. +Real-time data pipelines are built via materialized views in Timeplus. + +The difference between a materialized view and a regular view is that the materialized view is running in background after creation and the resulting stream is physically written to internal storage (hence it's called materialized). + +Once the materialized view is created, Timeplus will run the query in the background continuously and incrementally emit the calculated results according to the semantics of its underlying streaming select. ## query {#query} @@ -50,13 +104,13 @@ Learn more: [Streaming Query](/stream-query) and [Non-Streaming Query](/history) a.k.a. destination. Only available in Timeplus Enterprise, not in Timeplus Proton. -Timeplus enables you to send real-time insights to other systems, either to notify individuals or power up downstream applications. +Timeplus enables you to send real-time insights or transformed data to other systems, either to notify individuals or power up downstream applications. Learn more: [Destination](/destination). ## source {#source} -A source is a background job in Timeplus Enterprise to load data into a [stream](#stream). For Kafka API compatible streaming data platform, you need to create external streams. +A source is a background job in Timeplus Enterprise to load data into a [stream](#stream). For Kafka API compatible streaming data platform, you need to create [Kafka external streams](/proton-kafka). Learn more: [Data Collection](/ingestion) @@ -66,26 +120,12 @@ Timeplus is a streaming analytics platform and data lives in streams. 
Timeplus ` Learn more: [Stream](/working-with-streams) -## external stream {#external_stream} - -You can create external streams to read data from Kafka API compatible streaming data platform. - -Learn more: [External Stream](/external-stream) - ## timestamp column -When you create a source and preview the data, you can choose a column as the timestamp column. Timeplus will use this column as the [event time](/glossary#event-time) and track the lifecycle of the event and process it for all time related computation/aggregation. +When you create a source and preview the data, you can choose a column as the timestamp column. Timeplus will use this column as the [event time](/glossary#event_time) and track the lifecycle of the event and process it for all time related computation/aggregation. ## view {#view} You can define reusable SQL statements as views, so that you can query them as if they are streams `select .. from view1 ..` By default, views don't take any extra computing or storage resources. They are expanded to the SQL definition when they are queried. You can also create materialized views to 'materialize' them (keeping running them in the background and saving the results to the disk). Learn more: [View](/view) and [Materialized View](/view#m_view) - -## workspace {#workspace} - -Only available in Timeplus Enterprise, not in Timeplus Proton. - -A workspace is the isolated storage and computing unit for you to run streaming data collection and analysis. Every user can create up to 1 free workspace and join many workspaces. Usually a group of users in the same organization join the same workspace, to build one or more streaming analytics solutions. - -By default, each workspace can save up to 20GB data and with a limit for concurrent queries. If you need more resources, please contact support@timeplus.com to increase the limit. diff --git a/docs/history.md b/docs/history.md index 9cef27f54..3b30218f7 100644 --- a/docs/history.md +++ b/docs/history.md @@ -1,4 +1,4 @@ -# OLAP Query +# Historical Data Processing In addition to stream processing, Timeplus also store and serve for historical data, like many OLAP databases. By default, data are saved in Timeplus' columnar storage, with optional secondary indexes. For [mutable streams](/mutable-stream), historical data are saved in row-based storage, for fast update and range queries. diff --git a/docs/ingestion.md b/docs/ingestion.md index 2b1becfcb..c13dafa5d 100644 --- a/docs/ingestion.md +++ b/docs/ingestion.md @@ -8,7 +8,7 @@ Timeplus supports multiple ways to load data into the system, or access the exte - On Timeplus web console, you can also [upload CSV files](#csv) and import them into streams. - For Timeplus Enterprise, [REST API](/ingest-api) and SDKs are provided to push data to Timeplus programmatically. - On top of the REST API and SDKs, Timeplus Enterprise adds integrations with [Kafka Connect](/kafka-connect), [AirByte](https://airbyte.com/connectors/timeplus), [Sling](/sling), and seatunnel. -- Last but not the least, if you are not ready to load your real data into Timeplus, or just want to play with the system, you can use the web console to [create sample streaming data](#streamgen), or [use SQL to create random streams](/proton-create-stream#create-random-stream). +- Last but not the least, if you are not ready to load your real data into Timeplus, or just want to play with the system, you can use the web console to [create sample streaming data](#streamgen), or use SQL to create random streams. 
## Add new data via web console @@ -35,7 +35,7 @@ Choose "Data Collection" from the navigation menu to setup data access to other As of today, Kafka is the primary data integration for Timeplus. With our strong partnership with Confluent, you can load your real-time data from Confluent Cloud, Confluent Platform, or Apache Kafka into the Timeplus streaming engine. You can also create [external streams](/external-stream) to analyze data in Confluent/Kafka/Redpanda without moving data. -[Learn more.](/kafka-source) +[Learn more.](/proton-kafka) ### Load streaming data from Apache Pulsar {#pulsar} diff --git a/docs/install.md b/docs/install.md deleted file mode 100644 index d63c9be29..000000000 --- a/docs/install.md +++ /dev/null @@ -1,9 +0,0 @@ -# Install - -## Timeplus Enterprise self-hosted{#self-hosted} - -Install Timeplus Enterprise with high availability and scalability in your own data center or cloud account, using the [bare metal installer](/singlenode_install#bare_metal) or the official Timeplus [Kubernetes Helm Chart](/k8s-helm). - -## Timeplus Proton{#proton} - -The open source core engine can be installed locally. Please check [the Quickstart guide](proton-howto). diff --git a/docs/integration-metabase.md b/docs/integration-metabase.md deleted file mode 100644 index d5be33b04..000000000 --- a/docs/integration-metabase.md +++ /dev/null @@ -1,3 +0,0 @@ -# Integration with Metabase - -For self-hosted Timeplus Enterprise or Timeplus Proton, you can use [the Timeplus plugin for Metabase](https://github.com/timeplus-io/metabase-proton-driver) to visualize the SQL results with JDBC driver. diff --git a/docs/issues.md b/docs/issues.md deleted file mode 100644 index 2bfcbfee0..000000000 --- a/docs/issues.md +++ /dev/null @@ -1,13 +0,0 @@ -# Known Issues and Limitations - -We continuously improve the product. Please be aware of the following known issues and limitations. - -## UI - -* You can use a mobile browser to access Timeplus Enterprise. But only Google Chrome desktop browser is supported to use the Timeplus Console. -* Users in the same workspace will see all activity history and definitions, such as views, sinks, dashboards. Fine grained access control for user/group/role will be provided in the future. - -## Backend - -* When you define a stream, a materialized view, or a sink, you should avoid using `window_start` and `window_end` as the column names. This will conflict with the dynamic column names `window_start` and `window_end` for `tumble`, `hop`, `session` windows. You can create alias while creating materialized views, e.g. `select window_start as windowStart, window_end as windowEnd, count(*) as cnt from tumble(stream,10s) group by window_start, window_end` -* You can save JSON documents either in `string` or `json` column types. `string` type accepts any JSON schema or even invalid JSON. It also works well with dynamic schema, e.g. `{"type":"a","payload":{"id":1,"attr1":0.1}}` and `{"type":"b","payload":{"id":"2","attr2":true}}` While the `json` column type works best with fixed JSON schema, with better query performance and more efficient storage. For the above example, it will try to change the data type to support existing data. `payload.id` will be in `int` since the first event is `1`. Then it will change to `string`, to support both `1` and `"2"`. If you define the column as `json`, running non-streaming query for the dataset will get a `string` type for `payload.id` However data type cannot be changed during the execution of streaming queries. 
If you run `SELECT col.payload.id as id FROM json_stream` and insert `{.."payload":{"id":1..}` the first column in the streaming query result will be in `int`. Then if the future event is `{.."payload":{"id":"2"..}`, we cannot dynamically change this column from `int` to `string`, thus the streaming query will fail. In short, both `string` and `json` work great for non-streaming query. If the JSON documents with dynamic schema, it's recommended to define the column in `string ` type. diff --git a/docs/joins.md b/docs/joins.md index cf20f3028..c50a12877 100644 --- a/docs/joins.md +++ b/docs/joins.md @@ -1,4 +1,4 @@ -# Multi-JOINs and ASOF JOINs +# Streaming Joins JOIN is a key feature in Timeplus to combine data from different sources and freshness into a new stream. diff --git a/docs/k8s-helm.md b/docs/k8s-helm.md index 7c9f09a7d..69671e457 100644 --- a/docs/k8s-helm.md +++ b/docs/k8s-helm.md @@ -40,7 +40,7 @@ timeplus/timeplus-enterprise v6.0.4 2.7.1 Helm chart for deploying timeplus/timeplus-enterprise v6.0.3 2.7.0 Helm chart for deploying a cluster of Timeplus ... ``` -Staring from v3.0.0 chart version, the `APP VERSION` is the same version as [Timeplus Enterprise](/enterprise-releases). Prior to v3.0.0 chart version, the `APP VERSION` is the same version as the timeplusd component. +Staring from v3.0.0 chart version, the `APP VERSION` is the same version as [Timeplus Enterprise](/release-notes). Prior to v3.0.0 chart version, the `APP VERSION` is the same version as the timeplusd component. ### Create Namespace @@ -233,7 +233,7 @@ This helm chart follows [Semantic Versioning](https://semver.org/). It is always #### Check if there is an incompatible breaking change needing manual actions Each major chart version contains a new major Timeplus Enterprise version. If you are not going to upgrade the major version, you can just go ahead to run the helm upgrade command. Otherwise, please check: -1. The [release notes](/release-notes) of Timeplus Enterprise to confirm the target version can be upgraded in-place, by reusing the current data and configuration. For example [2.3](/enterprise-v2.3) and [2.4](/enterprise-releases) are incompatible and you have to use migration tools. +1. The [release notes](/release-notes) of Timeplus Enterprise to confirm the target version can be upgraded in-place, by reusing the current data and configuration. For example [2.3](/enterprise-v2.3) and [2.4](/enterprise-v2.4) are incompatible and you have to use migration tools. 2. The [upgrade guide](#upgrade-guide) of helm chart. You may need to modify your `values.yaml` according to the guide before upgrade. #### Run helm upgrade diff --git a/docs/kafka-source.md b/docs/kafka-source.md deleted file mode 100644 index 6d25fd552..000000000 --- a/docs/kafka-source.md +++ /dev/null @@ -1,27 +0,0 @@ -# Load streaming data from Apache Kafka - -Apache Kafka is the primary data source (and sink) for Timeplus. You can also create [external streams](/external-stream) to analyze data in Confluent/Kafka/Redpanda without moving data. - -## Apache Kafka Source - -1. From the left side navigation menu, click **Data Collection**. Here, you’ll see ways to connect a source or external stream. Click **Apache Kafka** (external stream). -2. Enter the broker URL. You can also enable TLS or authentication, if needed. -3. Enter the name of the Kafka topic, and specify the ‘read as’ data format. We currently support JSON, AVRO and Text formats. - 1. 
If the data in the Kafka topic is in JSON format, but the schema may change over time, we recommend you choose Text. This way, the entire JSON document will be saved as a string, and you can apply JSON related functions to extract value, even if the schema changes. - 2. If you choose AVRO, there is an option for 'Auto Extraction'. By default, this is toggled off, meaning the entire message will be saved as a string. If you toggle it on, then the top level attribute in the AVRO message will be put into different columns. This would be more convenient for you to query, but won't support schema evolution. When AVRO is selected, you also need to specify the address, API key, and secret key for the schema registry. -4. In the next “Preview” step, we will show you at least one event from your specified Apache Kafka source. -5. By default, your new source will create a new stream in Timeplus. Give this new stream a name and verify the columns information (column name and data type). You can also set a column as the event time column. If you don’t, we will use the ingestion time as the event time. Alternatively, you can select an existing stream from the dropdown. -6. After previewing your data, you can give the source a name and an optional description, and review the configuration. Once you click Finish, your streaming data will be available in the specified stream immediately. - -## Custom Kafka Deployment - -Similar steps as above. Please make sure Timeplus can reach out to your Kafka broker(s). You can use tools like [ngrok](https://ngrok.com) to securely expose your local Kafka broker(s) to the internet, so that Timeplus can connect to it. Check out [this blog](https://www.timeplus.com/post/timeplus-cloud-with-ngrok) for more details. - -## Notes for Kafka source - -Please note: - -1. Currently we support JSON and AVRO formats for the messages in Kafka topics -2. The topic level JSON attributes will be converted to stream columns. For nested attributes, the element will be saved as a `String` column and later you can query them with one of the [JSON functions](/functions_for_json). -3. Values in number or boolean types in the JSON message will be converted to corresponding types in the stream. -4. Datetime or timestamp will be saved as a String column. You can convert them back to DateTime via [to_time function](/functions_for_type#to_time). diff --git a/docs/proton-log.md b/docs/log-stream.md similarity index 63% rename from docs/proton-log.md rename to docs/log-stream.md index 12370c1a6..44ea74bc4 100644 --- a/docs/proton-log.md +++ b/docs/log-stream.md @@ -1,12 +1,6 @@ -# Read Log Files +# Log Files -You can use Proton as a lightweight and high-performance tool for log analysis. Please check [the blog](https://www.timeplus.com/post/log-stream-analysis) for more details. - -:::info - -Please note this feature is in Technical Preview. More settings to be added/tuned. - -::: +You can use Timeplus as a lightweight and high-performance tool for log analysis. Please check [the blog](https://www.timeplus.com/post/log-stream-analysis) for more details. ## Syntax @@ -30,6 +24,3 @@ The required settings: * log_dir * timestamp_regex * row_delimiter. Only 1 capturing group is expected in the regex. - - - diff --git a/docs/mutable-stream.md b/docs/mutable-stream.md index b35803c63..f6c51b6a1 100644 --- a/docs/mutable-stream.md +++ b/docs/mutable-stream.md @@ -189,7 +189,7 @@ Mutable stream can also be used in [JOINs](/joins). 
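For illustration, here is a minimal sketch of such a lookup join. The stream names `device_readings` (an append stream) and `device_profiles` (a mutable stream keyed by `device_id`) are assumptions for this example, not objects defined elsewhere in these docs:

```sql
-- enrich each incoming reading with the latest value stored in the mutable stream
SELECT r.device_id, r.temperature, p.location
FROM device_readings AS r
JOIN device_profiles AS p ON r.device_id = p.device_id;
```

Because a mutable stream keeps only the latest row per primary key, it behaves like a continuously updated lookup table for the append stream in such joins.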
### Retention Policy for Historical Storage{#ttl_seconds} Like normal streams in Timeplus, mutable streams use both streaming storage and historical storage. New data are added to the streaming storage first, then continuously write to the historical data with deduplication/merging process. -Starting from Timeplus Enterprise 2.9 (also backported to 2.8.2), you can set `ttl_seconds` on mutable streams. If the data is older than this value, it is scheduled to be pruned in the next key compaction cycle. Default value is -1. Any value less than 0 means this feature is disabled. +Starting from Timeplus Enterprise 2.9 (also backported to 2.8.2), you can set `ttl_seconds` on mutable streams. If the data's age (based on when the data is inserted, not _tp_time or particular columns) is older than this value, it is scheduled to be pruned in the next key compaction cycle. Default value is -1. Any value less than 0 means this feature is disabled. ```sql CREATE MUTABLE STREAM .. @@ -286,7 +286,7 @@ SETTINGS shards=3 ``` ### Coalesced and Versioned Mutable Stream {#coalesced} -For a mutable stream with many columns, there are some cases that only some columns are updated over time. Create a mutable stream with `coalesced=true` setting to enable the partial merge. For example, given a mutable stream: +For a mutable stream with many columns, there are some cases that only some columns are updated over time. Create a mutable stream with [Column Family](#column_family) and `coalesced=true` setting to enable the partial merge. For example, given a mutable stream: ```sql create mutable stream kv_99061_1 ( p string, m1 int, m2 int, m3 int, v uint64, diff --git a/docs/private-beta-2.md b/docs/private-beta-2.md index 495a13800..84cf30b53 100644 --- a/docs/private-beta-2.md +++ b/docs/private-beta-2.md @@ -12,7 +12,7 @@ We will update the beta version from time to time and list key enhancements in t * Enhanced `dedup` function to only cache the unique keys for a given time period. This is useful to suppress the same alerts in the short time period. * Support sub-stream, e.g. `select cid,speed_kmh, lag(speed_kmh) OVER (PARTITION BY cid) as last_spd from car_live_data` * Source, sink, API and SDK - * Updated Python SDK https://pypi.org/project/timeplus/ to auto-delete the query history, refine error handling, Please note there is a breaking change, `Env().tenant(id)` is changed to `Env().workspace(id)` to be align with our [terminology](/glossary#workspace) + * Updated Python SDK https://pypi.org/project/timeplus/ to auto-delete the query history, refine error handling, Please note there is a breaking change, `Env().tenant(id)` is changed to `Env().workspace(id)` to be align with our terminology. * Updated the REST API to show the optional description for source/sink, and replace "tenant" with "workspace-id" in the documentation. * The Kafka sink no longer auto-create the topics diff --git a/docs/proton-create-stream.md b/docs/proton-create-stream.md deleted file mode 100644 index 9f2ffc979..000000000 --- a/docs/proton-create-stream.md +++ /dev/null @@ -1,148 +0,0 @@ -# Stream - -## CREATE STREAM - -[Stream](/working-with-streams) is a key [concept](/glossary) in Timeplus. All data lives in streams, no matter static data or data in motion. We don't recommend you to create or manage `TABLE` in Timeplus. - -### Append-only Stream - -By default, the streams are append-only and immutable. You can create a stream, then use `INSERT INTO` to add data. - -Syntax: - -```sql -CREATE STREAM [IF NOT EXISTS] [db.] 
-( - [DEFAULT ] [compression_codec_1], - [DEFAULT ] [compression_codec_2] -) -SETTINGS ='', =, =, ... -``` - -:::info - -Stream creation is an async process. - -::: - -If you omit the database name, `default` will be used. Stream name can be any utf-8 characters and needs backtick quoted if there are spaces in between. Column name can be any utf-8 characters and needs backtick quoted if there are spaces in between. - -#### Data types - -Timeplus Proton supports the following column types: - -1. int8/16/32/64/128/256 -2. uint8/16/32/64/128/256 -3. bool -4. decimal(precision, scale) : valid range for precision is [1: 76], valid range for scale is [0: precision] -5. float32/64 -6. date -7. datetime -8. datetime64(precision, [time_zone]) -9. string -10. fixed_string(N) -11. array(T) -12. uuid -13. ipv4/ipv6 - -For more details, please check [Data Types](/datatypes). - -#### Event Time - -In Timeplus, each stream with a `_tp_time` as [Event Time](/eventtime). If you don't create the `_tp_time` column when you create the stream, the system will create such a column for you, with `now64()` as the default value. You can also choose a column as the event time, using - -```sql -SETTINGS event_time_column='my_datetime_col' -``` - - It can be any sql expression which results in datetime64 type. - -#### Retention Policies - -Proton supports retention policies to automatically remove out-of-date data from the streams. - -##### For Historical Storage - -Proton leverages ClickHouse TTL expression for the retention policy of historical data. When you create the stream, you can add `TTL to_datetime(_tp_time) + INTERVAL 12 HOUR` to remove older events based a specific datetime column and retention period. - -##### For Streaming Storage - -You can set the retention policies for streaming storage when you create the stream or update the setting after creation. - -```sql -CREATE STREAM .. SETTINGS logstore_retention_bytes=.., logstore_retention_ms=..; - -ALTER STREAM .. MODIFY SETTINGS logstore_retention_bytes=.., logstore_retention_ms=..; -``` - -### Versioned Stream - -[Versioned Stream](/versioned-stream) allows you to specify the primary key(s) and focus on the latest value. For example: - -```sql -CREATE STREAM versioned_kv(i int, k string, k1 string) -PRIMARY KEY (k, k1) -SETTINGS mode='versioned_kv', version_column='i'; -``` - -The default `version_column` is `_tp_time`. For the data with same primary key(s), Proton will use the ones with maximum value of `version_column`. So by default, it tracks the most recent data for same primary key(s). If there are late events, you can use specify other column to determine the end state for your live data. - -### Changelog Stream - -[Changelog Stream](/changelog-stream) allows you to specify the primary key(s) and track the add/delete/update of the data. For example: - -```sql -CREATE STREAM changelog_kv(i int, k string, k1 string) -PRIMARY KEY (k, k1) -SETTINGS mode='changelog_kv', version_column='i'; -``` - -The default `version_column` is `_tp_time`. For the data with same primary key(s), Proton will use the ones with maximum value of `version_column`. So by default, it tracks the most recent data for same primary key(s). If there are late events, you can use specify other column to determine the end state for your live data. - -## CREATE RANDOM STREAM - -You may use this special stream to generate random data for tests. 
For example: - -```sql -CREATE RANDOM STREAM devices( - device string default 'device'||to_string(rand()%4), - location string default 'city'||to_string(rand()%10), - temperature float default rand()%1000/10); -``` - -The following functions are available to use: - -1. [rand](/functions_for_random#rand) to generate a number in uint32 -2. [rand64](/functions_for_random#rand64) to generate a number in uint64 -3. [random_printable_ascii](/functions_for_random#random_printable_ascii) to generate printable characters -4. [random_string](/functions_for_random#random_string) to generate a string -5. [random_fixed_string](/functions_for_random#random_fixed_string) to generate string in fixed length -7. [random_in_type](/functions_for_random#random_in_type) to generate value with max value and custom logic - -When you run a Timeplus SQL query with a random stream, the data will be generated and analyzed by the query engine. Depending on the query, all generated data or the aggregated states can be kept in memory during the query time. If you are not querying the random stream, there is no data generated or kept in memory. - -By default, Proton tries to generate as many data as possible. If you want to (roughly) control how frequent the data is generated, you can use the `eps` setting. For example, the following SQL generates 10 events every second: - -```sql -CREATE RANDOM STREAM rand_stream(i int default rand()%5) SETTINGS eps=10 -``` - -You can further customize the rate of data generation via the `interval_time` setting. For example, you want to generate 1000 events each second, but don't want all 1000 events are generated at once, you can use the following sample SQL to generate events every 200 ms. The default interval is 5ms (in Proton 1.3.27 or the earlier versions, the default value is 100ms) - -```sql -CREATE RANDOM STREAM rand_stream(i int default rand()%5) SETTINGS eps=1000, interval_time=200 -``` - -Please note, the data generation rate is not accurate, to balance the performance and flow control. - -:::info - -New in Proton v1.4.2, you can set eps less than 1. Such as `eps=0.5` will generate 1 event every 2 seconds. `eps` less than 0.00001 will be treated as 0. - -::: - -For testing or demonstration purpose, you can create a random stream with multiple columns and use the [table](/functions_for_streaming#table) function to generate random data at once. The number of rows generated by this way is predefined and subject to change. The current value is 65409. - -## CREATE EXTERNAL STREAM - -Please check [Read/Write Kafka with External Stream](/proton-kafka). diff --git a/docs/proton-create-view.md b/docs/proton-create-view.md deleted file mode 100644 index ead3a8d49..000000000 --- a/docs/proton-create-view.md +++ /dev/null @@ -1,154 +0,0 @@ -# View & Materialized View -Real-time data pipelines are built via [Materialized Views](#m_view) in Timeplus. - -There are two types of views in Timeplus: logical view (or just view) and materialized view. -## CREATE VIEW -Syntax: -```sql -CREATE VIEW [IF NOT EXISTS] AS -[SETTINGS ...] -``` - -It's required to create a materialized view using a streaming select query. Once the materialized view is created, Timeplus will run the query in the background continuously and incrementally emit the calculated results according to the semantics of its underlying streaming select. 
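As a minimal sketch of that behavior (the stream name `device_readings` and its columns are hypothetical), the following creates a materialized view that maintains a continuously updated count per device:

```sql
-- the SELECT below is a streaming query; Timeplus keeps running it in the background
-- and incrementally updates the per-device counts as new events arrive
CREATE MATERIALIZED VIEW device_counts AS
SELECT device_id, count() AS total
FROM device_readings
GROUP BY device_id;
```

Querying `device_counts` then returns the incrementally maintained results rather than re-scanning the raw stream.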
- -### Without Target Stream {#mv_internal_storage} -By default, when you create a materialized view without `INTO ..`, an internal stream will be created automatically as the data storage. Querying on the materialized view will result in querying the underlying internal stream. - -:::warning -While this approach is easy to use, it's not recommended for production data processing. The internal stream will be created with default settings, lack of optimization for sharding, retention, etc. -::: - -Querying on the materialized view will result in querying the underlying internal stream. Different ways to use the materialized views: - -1. Streaming mode: `SELECT * FROM materialized_view` Get the result for future data. This works in the same way as views. -2. Historical mode: `SELECT * FROM table(materialized_view)` Get all past results for the materialized view. -3. Historical + streaming mode: `SELECT * FROM materialized_view WHERE _tp_time>='1970-01-01'` Get all past results and as well as the future data. -4. Pre-aggregation mode: `SELECT * FROM table(materialized_view) where _tp_time in (SELECT max(_tp_time) as m from table(materialized_view))` This immediately returns the most recent query result. If `_tp_time` is not available in the materialized view, or the latest aggregation can produce events with different `_tp_time`, you can add the `emit_version()` to the materialized view to assign a unique ID for each emit and pick up the events with largest `emit_version()`. - -For example: - -```sql -CREATE MATERIALIZED VIEW mv AS -SELECT emit_version() AS version, window_start AS time, count() AS n, max(speed_kmh) AS h FROM tumble(car_live_data,10s) -GROUP BY window_start, window_end; - -SELECT * FROM table(mv) WHERE version IN (SELECT max(version) FROM table(mv)); -``` - -### Target Stream - -It's recommended to specify a target stream when creating a materialized view, no matter a stream in Timeplus, an external stream to Apache Kafka, Apache Pulsar, or external tables to ClickHouse, S3, Iceberg, etc. - -Use cases for specifying a target stream: - -1. In some cases, you may want to build multiple materialized views to write data to the same stream. In this case, each materialized view serves as a real-time data pipeline. -2. Or you may need to use [Changelog Stream](/proton-create-stream#changelog-stream), [Versioned Stream](/proton-create-stream#versioned-stream) or [Mutable Stream](/mutable-stream) to build lookups. -3. Or you may want to set the retention policy for the materialized view. -4. You can also use materialized views to write data to downstream systems, such as ClickHouse, Kafka, or Iceberg. - -To create a materialized view with the target stream: - -```sql -CREATE MATERIALIZED VIEW [IF NOT EXISTS] -INTO AS ``` -To delete a vanilla view +To drop a vanilla view: ```sql DROP VIEW [IF EXISTS] ``` +If the view is created based on a streaming query, then you can consider the view as a virtual stream. For example: +```sql +CREATE VIEW view1 AS SELECT * FROM my_stream WHERE c1 = 'a' +``` +This will create a virtual stream to filter all events with c1 = 'a'. You can use this view as if it's another stream, e.g. +```sql +SELECT count(*) FROM tumble(view1,1m) GROUP BY window_start +``` + +A view could be a bounded stream if the view is created with a bounded query using [table](/functions_for_streaming#table) function, e.g. 
+```sql +CREATE VIEW view2 AS SELECT * FROM table(my_stream) +``` +Then each time you run `SELECT count(*) FROM view2` will return the current row number of the my_stream immediately without waiting for the future events. + ### Parameterized Views Starting from Timeplus Enterprise 2.9, you can create views with parameters. For example: ```sql @@ -40,23 +53,130 @@ select * from github_param_view(limit=2); The difference between a materialized view and a regular view is that the materialized view is running in background after creation and the resulting stream is physically written to internal storage (hence it's called materialized). -To create a materialized view, click the 'Create View' button in the VIEWS page, and turn on the 'Materialized view?' toggle button, and specify the view name and SQL. - Once the materialized view is created, Timeplus will run the query in the background continuously and incrementally emit the calculated results according to the semantics of its underlying streaming select. -Different ways to use the materialized views: +### Create a Materialized View + +```sql +CREATE MATERIALIZED VIEW [IF NOT EXISTS] +[INTO ] +AS +``` + +### Use Materialized Views + +There are different ways to use the materialized views in Timeplus: 1. Streaming mode: `SELECT * FROM materialized_view` Get the result for future data. This works in the same way as views. 2. Historical mode: `SELECT * FROM table(materialized_view)` Get all past results for the materialized view. 3. Historical + streaming mode: `SELECT * FROM materialized_view WHERE _tp_time>='1970-01-01'` Get all past results and as well as the future data. 4. Pre-aggregation mode: `SELECT * FROM table(materialized_view) where _tp_time in (SELECT max(_tp_time) as m from table(materialized_view))` This immediately returns the most recent query result. If `_tp_time` is not available in the materialized view, or the latest aggregation can produce events with different `_tp_time`, you can add the `emit_version()` to the materialized view to assign a unique ID for each emit and pick up the events with largest `emit_version()`. For example: - ```sql - create materialized view mv as - select emit_version() as version, window_start as time, count() as n, max(speed_kmh) as h from tumble(car_live_data,10s) - group by window_start, window_end; +```sql +create materialized view mv as +select emit_version() as version, window_start as time, count() as n, max(speed_kmh) as h from tumble(car_live_data,10s) +group by window_start, window_end; + +select * from table(mv) where version in (select max(version) from table(mv)); +``` + +You build data pipelines in Timeplus using materialized views. + + +### Load Balancing + +It's common to define many materialized views in Timeplus for various computation and analysis. Some materialized views can be memory-consuming or cpu-consuming. + +In Timeplus Enterprise cluster mode, you can schedule the materialized views in a proper way to ensure each node gets similar workload. + +#### Manual Load Balancing {#memory_weight} + +Starting from [Timeplus Enterprise v2.3](/enterprise-v2.3), when you create a materialized view with DDL SQL, you can add an optional `memory_weight` setting for those memory-consuming materialized views, e.g. +```sql +CREATE MATERIALIZED VIEW my_mv +SETTINGS memory_weight = 10 +AS SELECT .. +``` + +When `memory_weight` is not set, by default the value is 0. 
When Timeplus Enterprise server starts, the system will list all materialized views, ordered by the memory weight and view names, and schedule them in the proper node. + +For example, in a 3-node cluster, you define 10 materialized views with names: mv1, mv2, .., mv9, mv10. If you create the first 6 materialized views with `SETTINGS memory_weight = 10`, then node1 will run mv1 and mv4; node2 will run mv2 and mv5; node3 will run mv3 and mv6; Other materialized views(mv7 to mv10) will be randomly scheduled on any nodes. + +It's recommended that each node in the Timeplus Enterprise cluster shares the same hardware specifications. For those resource-consuming materialized views, it's recommended to set the same `memory_weight`, such as 10, to get the expected behaviors to be dispatched to the proper nodes for load-balancing. + +#### Auto Load Balancing {#auto-balancing} + +Starting from [Timeplus Enterprise v2.5](/enterprise-v2.5), you can also apply auto-load-balancing for memory-consuming materialized views in Timeplus Enterprise cluster. By default, this feature is enabled and there are 3 settings at the cluster level: + +```yaml +workload_rebalance_check_interval: 30s +workload_rebalance_overloaded_memory_util_threshold: 50% +workload_rebalance_heavy_mv_memory_util_threshold: 10% +``` + +As the administrator, you no longer need to determine which materialized views need to set a `memory_weight` setting. In a cluster, Timeplus will monitor the memory consumption for each materialized view. Every 30 seconds, configurable via `workload_rebalance_check_interval`, the system will check whether there are any node with memory over 50% full. If so, check whether there is any materialized view in such node consuming 10% or more memory. When those conditions are all met, rescheduling those materialized views to less busy nodes. During the rescheduling, the materialized view on the previous node will be paused and its checkpoint will be transferred to the target node, then the materialized view on target node will resume the streaming SQL based on the checkpoint. + +### Auto-Scaling Materialized Views {#autoscaling_mv} +Starting from [Timeplus Enterprise v2.8](/enterprise-v2.8), materialized views can be configured to run on elastic compute nodes. This can reduce TCO (Total Cost of Ownership), by enabling high concurrent materialized views scheduling, auto scale-out and scale-in according to workload. + +To enable this feature, you need to +1. create a S3 disk in the `s3_plain` type. +2. create a materialized view by setting the checkpoint storage to `s3` and use the s3 disk. +3. enable compute nodes in the cluster, with optional autoscaling based on your cloud or on-prem infrastructure. + +For example: +```sql +--S3 based checkpoint +CREATE DISK ckpt_s3_disk disk( + type = 's3_plain', + endpoint = 'https://mat-view-ckpt.s3.us-west-2.amazonaws.com/matv_ckpt/', + access_key_id = '...', + secret_access_key = '...'); + +CREATE MATERIALIZED VIEW mat_v_scale INTO clickhouse_table +AS SELECT … +SETTINGS +checkpoint_settings=’storage_type=s3;disk_name=ckpt_s3_disk;async=true;interval=5’; +``` + +### Drop Materialized Views + +Run the following SQL to drop a view or a materialized view. + +```sql +DROP VIEW [IF EXISTS] db.; +``` + +Like [CREATE STREAM](/sql-create-stream), stream deletion is an async process. - select * from table(mv) where version in (select max(version) from table(mv)); - ``` +### Best Practices - We are considering providing new syntax to simplify this. 
+* It's recommended to specify [a target stream](#target-stream) when creating a materialized view, whether it is a stream in Timeplus, an external stream to Apache Kafka or Apache Pulsar, or an external table to ClickHouse, S3, Iceberg, etc. diff --git a/docs/why-timeplus.md b/docs/why-timeplus.md index 9ceb28d15..6fd34b8f1 100644 --- a/docs/why-timeplus.md +++ b/docs/why-timeplus.md @@ -11,11 +11,11 @@ Timeplus streams offer high performance, resiliency, and seamless querying by us This architecture transparently serves data to users based on query type from both, often eliminating the need for Apache Kafka as a commit log or a separate downstream database, streamlining your data infrastructure. -## Append-Only and Mutable Streams {#streams} +## Append and Mutable Streams {#streams} Configure types of streams to optimize performance. -* [Append-only streams:](/proton-create-stream#append-only-stream) +* [Append streams:](/append-stream) Excel at complex aggregations, storing data in a columnar format for faster access and processing. * [Mutable streams:](/mutable-stream) Support UPSERTs and DELETEs, ideal for applications like Materialized Caches or GDPR compliance, using a row-based store optimized for fast data retrieval and query consistency. @@ -31,7 +31,7 @@ Stream processing involves combining multiple data sources, and [MULTI-JOINs](/j In many cases, Business Intelligence and analytical queries can be executed directly in Timeplus, eliminating the need for a separate data warehouse. [ASOF JOINs](/joins) enable approximate time-based lookups for comparing recent versus historical data. ## Python and JavaScript UDF {#udf} -We understand that SQL may not be able to express all business logic for streaming or querying. [JavaScript](/js-udf) and Python User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs) can be used to extend Timeplus to encapsulate custom logic for both stateless and stateful queries. +We understand that SQL may not be able to express all business logic for streaming or querying. [JavaScript](/js-udf) and [Python](/py-udf) User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs) can be used to extend Timeplus to encapsulate custom logic for both stateless and stateful queries. With Python UDFs, this opens up the possibility to bring in pre-existing and popular libraries, including data science and machine learning libraries!
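As a rough sketch of how such a function is registered (the function name `add_five` is made up, and the exact registration syntax may differ between versions, so treat this as an illustration rather than the authoritative form), a JavaScript UDF is defined in SQL with its body written in JavaScript:

```sql
CREATE OR REPLACE FUNCTION add_five(value float32)
RETURNS float32
LANGUAGE JAVASCRIPT AS $$
  // Timeplus passes column values to the UDF in batches (JavaScript arrays),
  // so the function returns an array with one result per input value
  function add_five(value) {
    for (let i = 0; i < value.length; i++) {
      value[i] = value[i] + 5;
    }
    return value;
  }
$$;
```

Once registered, the function can be used in streaming or historical queries, e.g. `SELECT add_five(speed_kmh) FROM car_live_data`.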
diff --git a/docs/working-with-streams.md b/docs/working-with-streams.md index 41fb988d8..3f6b58d57 100644 --- a/docs/working-with-streams.md +++ b/docs/working-with-streams.md @@ -1,4 +1,4 @@ -# Stream +# Streams ## All data live in streams diff --git a/docusaurus.config.js b/docusaurus.config.js index 2cf6a8eef..3558da408 100644 --- a/docusaurus.config.js +++ b/docusaurus.config.js @@ -7,6 +7,9 @@ const darkCodeTheme = themes.dracula; /** @type {import('@docusaurus/types').Config} */ const config = { + future: { + v4: true, + }, title: "Timeplus", tagline: "Simple, powerful, cost-efficient stream processing", url: "https://docs.timeplus.com/", @@ -137,12 +140,6 @@ const config = { docId: "destination", label: "Send Data Out", }, - { - type: "doc", - position: "left", - docId: "functions", - label: "SQL Reference", - }, { href: "https://www.timeplus.com", position: "left", diff --git a/list-pages.ts b/list-pages.ts new file mode 100644 index 000000000..0d5fcc957 --- /dev/null +++ b/list-pages.ts @@ -0,0 +1,159 @@ +import path from 'path'; + +// --- CONFIGURATION --- +const SIDEBARS_FILE_PATH = './sidebars.js'; +const DOCS_FOLDER_PATH = './docs'; +// ------------------- + +// Define types for better code analysis and safety. +// These types represent the structure of Docusaurus sidebar items. +type SidebarItemDoc = { + type: 'doc'; + id: string; + label?: string; +}; + +type SidebarItemLink = { + type: 'link'; + label: string; + href: string; +} + +type SidebarItemCategory = { + type: 'category'; + label: string; + items: SidebarItem[]; + link?: { type: 'doc' | 'generated-index', id?: string }; +}; + +// A sidebar item can be one of the above types, or a string shorthand for a doc. +type SidebarItem = SidebarItemDoc | SidebarItemCategory | SidebarItemLink | string; + +// The sidebars file exports an object where keys are sidebar names. +type SidebarsConfig = { + [sidebarName: string]: SidebarItem[]; +}; + +/** + * Scans a markdown/mdx file to find the first H1 header (e.g., "# Title"). + * This is more robust than checking only the first line, as MDX files can have + * imports or frontmatter before the title. + * @param docId The ID of the doc, which corresponds to the filename (without extension). + * @returns The extracted header title, or the original docId if not found. + */ +async function getTitleFromMarkdown(docId: string): Promise { + const mdxPath = path.resolve(DOCS_FOLDER_PATH, `${docId}.mdx`); + const mdPath = path.resolve(DOCS_FOLDER_PATH, `${docId}.md`); + + let filePath: string | null = null; + + if (await Bun.file(mdxPath).exists()) { + filePath = mdxPath; + } else if (await Bun.file(mdPath).exists()) { + filePath = mdPath; + } + + if (!filePath) { + return docId; // File not found, return ID as fallback. + } + + try { + const file = Bun.file(filePath); + const content = await file.text(); + + // Use a multiline regex to find the first H1 header anywhere in the file. + // The 'm' flag makes '^' match the beginning of a line, not just the string. + const match = content.match(/^#\s+(.*)/m); + + if (match && match[1]) { + // match[1] is the captured group (the text after '# '). + return match[1].trim(); + } + + } catch (error) { + console.warn(`Could not read file for docId: ${docId}`, error); + } + + // Fallback to the id if no H1 header is found in the file. + return docId; +} + +/** + * Determines the display name for a sidebar item. + * It prioritizes the explicit 'label'. If not present, it parses the doc file. + * @param item The sidebar item object. 
+ * @returns The resolved display name. + */ +async function getItemName(item: SidebarItem): Promise { + // Handle string shorthand, e.g., 'my-doc-id' + if (typeof item === 'string') { + return getTitleFromMarkdown(item); + } + + // For object types, the label is always the primary source of truth. + if ('label' in item && item.label) { + // For links, add an indicator to show it's an external URL. + if (item.type === 'link') { + return `${item.label} ↗`; + } + return item.label; + } + + // If it's a doc without a label, parse the markdown file. + if (item.type === 'doc' && 'id' in item) { + return getTitleFromMarkdown(item.id); + } + + // Fallback for categories without a label (unlikely but possible). + return 'Untitled Category'; +} + +/** + * Recursively processes and prints the sidebar items in a tree-like format. + * @param items The array of sidebar items to process. + * @param prefix The string prefix for drawing tree lines (e.g., "│ "). + */ +async function printTree(items: SidebarItem[], prefix: string): Promise { + for (let i = 0; i < items.length; i++) { + const item = items[i]; + const isLast = i === items.length - 1; + + const connector = isLast ? '└──' : '├──'; + const name = await getItemName(item); + + console.log(`${prefix}${connector} ${name}`); + + if (typeof item === 'object' && item.type === 'category' && item.items) { + const childPrefix = prefix + (isLast ? ' ' : '│ '); + await printTree(item.items, childPrefix); + } + } +} + +/** + * Main function to load the sidebars and start the parsing process. + */ +async function main() { + console.log(`Parsing documentation structure from ${SIDEBARS_FILE_PATH}...\n`); + + try { + const sidebars: SidebarsConfig = require(path.resolve(SIDEBARS_FILE_PATH)); + + for (const [sidebarName, items] of Object.entries(sidebars)) { + console.log(sidebarName); + await printTree(items, ''); + console.log(''); + } + } catch (error) { + if (error instanceof Error && 'code' in error && error.code === 'MODULE_NOT_FOUND') { + console.error(`Error: Could not find the sidebars file at '${SIDEBARS_FILE_PATH}'.`); + console.error('Please make sure the path is correct and you are running the script from your project root.'); + } else { + console.error('An unexpected error occurred:', error); + } + process.exit(1); + } +} + +// Run the script +main(); \ No newline at end of file diff --git a/missing.js b/missing.js new file mode 100644 index 000000000..92c99de04 --- /dev/null +++ b/missing.js @@ -0,0 +1,146 @@ +// A script to find markdown files in the 'docs' folder that are not referenced in 'sidebars.js'. +// +// USAGE: +// 1. Place this file in the root of your Docusaurus project. +// 2. Run it from your terminal using `node ./mising.js` or `bun ./mising.js` +// +// It will print the list of unreferenced files to the console. + +const fs = require("fs/promises"); +const path = require("path"); + +// --- Configuration --- +// Adjust these paths if your project structure is different. +const SIDEBARS_PATH = path.join(process.cwd(), "sidebars.js"); +const DOCS_PATH = path.join(process.cwd(), "docs"); +// --- End Configuration --- + +/** + * Recursively traverses the sidebar object/array to find all doc IDs. + * @param {any} item - The current item in the sidebar structure. + * @param {Set} ids - The set to store the found IDs. 
+ */ +function findReferencedIds(item, ids) { + if (!item) { + return; + } + + // Case 1: Item is an array (e.g., 'items' array or a whole sidebar) + if (Array.isArray(item)) { + item.forEach((subItem) => findReferencedIds(subItem, ids)); + return; + } + + // Case 2: Item is a string shorthand for a doc + // e.g., 'my-doc-id' + if (typeof item === "string") { + ids.add(item); + return; + } + + // Case 3: Item is an object + if (typeof item === "object") { + // Check for a doc type: { type: 'doc', id: '...' } + if (item.type === "doc" && item.id) { + ids.add(item.id); + } + // Check for a category's own link: { type: 'category', link: { type: 'doc', id: '...' } } + if (item.link && item.link.type === "doc" && item.link.id) { + ids.add(item.link.id); + } + // Recurse into 'items' array if it exists + if (item.items) { + findReferencedIds(item.items, ids); + } + } +} + +/** + * Main function to run the script. + */ +async function main() { + console.log("🔍 Starting analysis..."); + + // --- Step 1: Get all referenced doc IDs from sidebars.js --- + const referencedIds = new Set(); + try { + // Using require() is a simple way to load and execute the JS config file. + const sidebarsConfig = require(SIDEBARS_PATH); + console.log(`✅ Successfully loaded ${SIDEBARS_PATH}`); + + // Iterate over each sidebar defined in the config (e.g., 'docSidebar', 'tutorialSidebar') + for (const sidebarName in sidebarsConfig) { + findReferencedIds(sidebarsConfig[sidebarName], referencedIds); + } + console.log( + `Found ${referencedIds.size} unique doc IDs referenced in sidebars.`, + ); + } catch (error) { + console.error( + `❌ Error reading or parsing ${SIDEBARS_PATH}:`, + error.message, + ); + process.exit(1); + } + + // --- Step 2: Get all file IDs from the docs folder --- + const diskFileIds = new Set(); + try { + const files = await fs.readdir(DOCS_PATH); + console.log(`✅ Successfully scanned ${DOCS_PATH}`); + + files.forEach((file) => { + // We only care about .md and .mdx files + if (file.endsWith(".md") || file.endsWith(".mdx")) { + // The ID is the filename without the extension + const id = path.basename(file, path.extname(file)); + diskFileIds.add(id); + } + }); + console.log(`Found ${diskFileIds.size} markdown files in the docs folder.`); + } catch (error) { + console.error( + `❌ Error reading docs directory at ${DOCS_PATH}:`, + error.message, + ); + process.exit(1); + } + + // --- Step 3: Compare the two sets and find the difference --- + const unreferencedFiles = []; + for (const fileId of diskFileIds) { + if (!referencedIds.has(fileId)) { + unreferencedFiles.push(fileId); + } + } + + // --- Step 4: Report the results --- + console.log("\n--- Analysis Complete ---"); + if (unreferencedFiles.length === 0) { + console.log( + "🎉 Excellent! 
All markdown files in the docs folder are referenced in sidebars.js.", + ); + } else { + console.log( + `⚠️ Found ${unreferencedFiles.length} unreferenced markdown file(s):`, + ); + unreferencedFiles.sort().forEach((file) => { + // Try to find the original extension for a more accurate filename + const ext = + [".md", ".mdx"].find((ext) => { + try { + fs.access(path.join(DOCS_PATH, file + ext)); + return true; + } catch { + return false; + } + }) || ".md"; // Default to .md if check fails + console.log(` - ${file + ext}`); + }); + console.log("\nThese files can either be removed or added to sidebars.js."); + } + console.log("-------------------------"); +} + +// Run the main function +main(); diff --git a/sidebars.js b/sidebars.js index 4b67c7530..60e016f91 100644 --- a/sidebars.js +++ b/sidebars.js @@ -29,7 +29,7 @@ const sidebars = { docSidebar: [ { type: "category", - label: "Introduction", + label: "Overview", //collapsed: false, link: { type: "doc", @@ -40,273 +40,253 @@ const sidebars = { type: "doc", id: "why-timeplus", }, - "showcases", + { + type: "doc", + id: "glossary", + }, + { + type: "doc", + id: "architecture", + }, + { + type: "doc", + id: "showcases", + }, ], }, { - type: "category", + type: "doc", label: "Quickstart", - items: ["quickstart", "proton-howto"], + id: "quickstart", }, { type: "category", - label: "Key Features", - customProps: { tag: "Popular" }, + label: "Guides & Tutorials", items: [ - "stream-query", - "history", - "joins", - "proton-create-view", + "understanding-watermark", + "tutorial-sql-kafka", + "tutorial-github", + "marimo", + "tutorial-sql-connect-kafka", + "tutorial-sql-connect-ch", + "tutorial-cdc-rpcn-pg-to-ch", { - type: "doc", - id: "mutable-stream", - customProps: { tag: "Enterprise" }, + type: "category", + label: "Streaming ETL", + items: [ + "tutorial-sql-etl", + "tutorial-sql-etl-kafka-to-ch", + "tutorial-sql-etl-mysql-to-ch", + ], }, + "tutorial-sql-join", + "tutorial-python-udf", + "sql-pattern-topn", + "usecases", + "tutorial-kv", + "tutorial-sql-read-avro", + "tutorial-testcontainers-java", + ], + }, + { + type: "category", + label: "Core Features", + // customProps: { tag: "Popular" }, + items: [ { type: "category", - label: "External Streams", - collapsed: false, + label: "Streams", link: { type: "doc", - id: "external-stream", + id: "working-with-streams", }, items: [ + "append-stream", + "versioned-stream", + "changelog-stream", { type: "doc", - id: "proton-kafka", - customProps: { tag: "Popular" }, + id: "mutable-stream", + customProps: { tag: "Enterprise" }, }, { - type: "doc", - id: "timeplus-external-stream", - }, - { - type: "doc", - id: "pulsar-external-stream", - }, - { - type: "doc", - id: "http-external", - customProps: { tag: "New" }, + label: "Random Stream", + type: "link", + href: "https://docs.timeplus.com/sql-create-random-stream", }, ], }, { type: "category", - label: "External Tables", - items: [ - "proton-clickhouse-external-table", - { - type: "doc", - id: "mysql-external-table", - customProps: { tag: "New" }, - }, - { - type: "doc", - id: "pg-external-table", - customProps: { tag: "New" }, - }, - { - type: "doc", - id: "s3-external", - customProps: { tag: "New" }, - }, - ], - }, - { - type: "doc", - id: "iceberg", - customProps: { tag: "New" }, - }, - "proton-schema-registry", - "proton-format-schema", - { - type: "doc", - id: "redpanda-connect", - customProps: { tag: "Enterprise" }, - }, - { - label: "Dictionary", - type: "link", - href: "https://docs.timeplus.com/sql-create-dictionary", - customProps: { tag: 
"Enterprise" }, + label: "Materialized Views", + link: { + type: "doc", + id: "view", + }, + items: ["checkpoint-settings"], }, { type: "category", - label: "User Defined Functions", - collapsed: false, + label: "Data Ingestion", link: { type: "doc", - id: "udf", + id: "ingestion", }, items: [ - "sql-udf", - "remote-udf", - "js-udf", { type: "doc", - id: "py-udf", - customProps: { tag: "New" }, + id: "idempotent", + customProps: { tag: "Enterprise" }, }, ], }, + "destination", { type: "category", - label: "Web Console", - customProps: { tag: "Enterprise" }, + label: "External Streams & Tables", + // link: { + // type: "generated-index", + // title: "SQL Commands", + // description: "Overview of the SQL commands supported by Timeplus.", + // slug: "/category/commands", + // keywords: ["guides"], + // }, items: [ { type: "category", - label: "Getting Data In", - //collapsed: false, + label: "External Streams", link: { type: "doc", - id: "ingestion", + id: "external-stream", }, items: [ - "kafka-source", - "confluent-cloud-source", + { + type: "category", + label: "Apache Kafka", + link: { + type: "doc", + id: "proton-kafka", + }, + items: ["proton-schema-registry", "proton-format-schema"], + }, + { + type: "doc", + id: "pulsar-external-stream", + label: "Apache Pulsar", + }, + { + type: "doc", + id: "timeplus-external-stream", + label: "Remote Timeplus", + }, { type: "doc", - id: "ingest-api", + id: "http-external", + label: "HTTP Write", customProps: { tag: "Enterprise" }, }, + "log-stream", ], }, { - type: "doc", - id: "destination", - }, - { - type: "doc", - label: "Data Visualization", - id: "viz", - }, - { - type: "doc", - id: "alert", + type: "category", + label: "External Tables", + items: [ + { + type: "doc", + id: "proton-clickhouse-external-table", + label: "ClickHouse", + }, + { + type: "doc", + id: "mysql-external-table", + label: "MySQL", + customProps: { tag: "Enterprise" }, + }, + { + type: "doc", + id: "pg-external-table", + label: "PostgreSQL", + customProps: { tag: "Enterprise" }, + }, + { + type: "doc", + id: "mongo-external", + label: "MongoDB", + customProps: { tag: "Enterprise" }, + }, + { + type: "doc", + label: "Amazon S3", + id: "s3-external", + customProps: { tag: "Enterprise" }, + }, + { + type: "doc", + id: "iceberg", + label: "Apache Iceberg", + customProps: { tag: "Enterprise" }, + }, + ], }, ], }, + // "stream-query", + // "history", { - type: "doc", - id: "idempotent", - customProps: { tag: "Enterprise" }, - }, - { - type: "doc", - id: "tiered-storage", + label: "Dictionary", + type: "link", + href: "https://docs.timeplus.com/sql-create-dictionary", customProps: { tag: "Enterprise" }, }, - ], - }, - /*"timeplus-enterprise",*/ - { - type: "category", - label: "Deployment & Operations", - items: [ - "install", - // { - // type: "doc", - // id: "timeplus-cloud", - // customProps: { tag: "Enterprise" }, - // }, { type: "category", - //collapsed: false, - label: "Timeplus Enterprise Self-hosted", - link: { - type: "doc", - id: "timeplus-self-host", - }, + label: "Stream Processing", items: [ - "singlenode_install", - "cluster_install", + "stream-query", + "history", + "joins", + "streaming-windows", + "streaming-aggregations", { type: "doc", - id: "k8s-helm", - customProps: { tag: "Popular" }, + id: "jit", + customProps: { tag: "Enterprise" }, }, ], }, { type: "doc", - id: "server_config", + id: "alert", + customProps: { tag: "Enterprise" }, }, - "proton-ports", { type: "doc", - id: "rbac", + id: "tiered-storage", customProps: { tag: "Enterprise" }, }, - ], - }, - 
{ - type: "category", - label: "Guides & Tutorials", - items: [ - "understanding-watermark", - "tutorial-sql-kafka", - "tutorial-github", - "marimo", - "tutorial-sql-connect-kafka", - "tutorial-sql-connect-ch", - "tutorial-cdc-rpcn-pg-to-ch", { - type: "category", - label: "Streaming ETL", - items: [ - "tutorial-sql-etl", - "tutorial-sql-etl-kafka-to-ch", - "tutorial-sql-etl-mysql-to-ch", - ], + type: "doc", + id: "viz", + customProps: { tag: "Enterprise" }, }, - "tutorial-sql-join", - "tutorial-python-udf", - "sql-pattern-topn", - "usecases", - "tutorial-kv", - "tutorial-sql-read-avro", - "tutorial-testcontainers-java", ], }, { type: "category", - label: "Monitoring & Troubleshooting", - items: [ - "troubleshooting", - "system-stream-state-log", - "system-stream-metric-log", - "prometheus", - ], - }, - { - type: "category", - label: "Open Source", - items: [ - "proton", - "proton-architecture", - "proton-create-stream", - "proton-manage-stream", - "proton-faq", - ], - }, - { - type: "category", - label: "Query & SQL Reference", - customProps: { tag: "Popular" }, + label: "SQL Reference", + // customProps: { tag: "Popular" }, items: [ "query-syntax", "query-settings", - "checkpoint-settings", "datatypes", { type: "category", - label: "SQL Commands", + label: "Statements", customProps: { tag: "Popular" }, link: { type: "generated-index", - title: "SQL Commands", + title: "SQL Statements", description: "Overview of the SQL commands supported by Timeplus.", slug: "/category/commands", keywords: ["guides"], @@ -350,6 +330,7 @@ const sidebars = { "sql-show-functions", "sql-show-streams", "sql-system-pause", + "sql-system-recover", "sql-system-resume", "sql-system-transfer-leader", "sql-truncate-stream", @@ -358,172 +339,242 @@ const sidebars = { }, { type: "category", - label: "Built-in Functions", + label: "Functions", collapsed: true, link: { type: "doc", id: "functions" }, items: [ - "functions_for_type", - "functions_for_comp", - "functions_for_datetime", - "functions_for_url", - "functions_for_json", - "functions_for_text", - "functions_for_hash", - "functions_for_random", + { + type: "category", + label: "Regular Functions", + link: { + type: "generated-index", + title: "Regular Functions", + description: "Integrate Timeplus to your tool stacks.", + slug: "/category/functions", + keywords: ["guides"], + }, + items: [ + "functions_for_type", + "functions_for_comp", + "functions_for_datetime", + "functions_for_url", + "functions_for_json", + "functions_for_text", + "functions_for_hash", + "functions_for_random", + "functions_for_logic", + "functions_for_math", + "functions_for_fin", + "functions_for_geo", + "functions_for_dict", + ], + }, "functions_for_agg", - "functions_for_logic", - "functions_for_math", - "functions_for_fin", - "functions_for_geo", - "functions_for_dict", "functions_for_streaming", + { + type: "category", + label: "User Defined Functions", + collapsed: false, + link: { + type: "doc", + id: "udf", + }, + items: [ + { + type: "doc", + id: "py-udf", + customProps: { tag: "Enterprise" }, + }, + "js-udf", + "sql-udf", + "remote-udf", + ], + }, ], }, "grok", - { - type: "doc", - id: "jit", - customProps: { tag: "Enterprise" }, - }, ], }, { type: "category", - label: "Concepts", - //collapsed: false, + label: "Timeplus Proton (OSS)", link: { type: "doc", - id: "glossary", + id: "proton", }, items: [ { - type: "category", - label: "Stream", - //collapsed: false, - link: { - type: "doc", - id: "working-with-streams", - }, - items: ["changelog-stream", "versioned-stream", 
"substream"], + type: "doc", + label: "vs. Timeplus Enterprise", + id: "compare", }, - "eventtime", - "view", + "proton-faq", ], }, { type: "category", - label: "Clients, APIs & SDKs", + label: "Integrations", items: [ - "proton-client", + { + type: "category", + label: "CLI, APIs & SDKs", + items: [ + "proton-client", + { + type: "doc", + id: "timeplusd-client", + customProps: { tag: "Enterprise" }, + }, + { + type: "category", + label: "timeplus (CLI)", + customProps: { tag: "Enterprise" }, + link: { + type: "doc", + id: "cli-reference", + }, + items: [ + "cli-backup", + "cli-diag", + "cli-help", + "cli-license", + "cli-migrate", + "cli-restart", + "cli-restore", + "cli-service", + "cli-start", + "cli-status", + "cli-stop", + "cli-sync", + "cli-user", + "cli-version", + ], + }, + "jdbc", + { + label: "ODBC Driver", + type: "link", + href: "https://github.com/timeplus-io/proton-odbc", + }, + "timeplus-connect", + // { + // label: "Python Driver", + // type: "link", + // href: "https://github.com/timeplus-io/proton-python-driver", + // }, + { + label: "Go Driver", + type: "link", + href: "https://github.com/timeplus-io/proton-go-driver", + }, + { + label: "C++ Client", + type: "link", + href: "https://github.com/timeplus-io/timeplus-cpp", + }, + { + label: "Rust Client", + type: "link", + href: "https://crates.io/crates/proton_client", + }, + { + label: "Timeplus REST API", + type: "link", + href: "https://docs.timeplus.com/rest", + }, + "proton-ingest-api", + { + type: "doc", + id: "ingest-api", + customProps: { tag: "Enterprise" }, + }, + { + type: "doc", + id: "query-api", + customProps: { tag: "Enterprise" }, + }, + ], + }, { type: "doc", - id: "timeplusd-client", + id: "redpanda-connect", customProps: { tag: "Enterprise" }, }, { type: "category", - label: "timeplus (CLI)", - customProps: { tag: "Enterprise" }, + label: "Third-party Tools", link: { - type: "doc", - id: "cli-reference", + type: "generated-index", + title: "Third-party Tools", + description: "Integrate Timeplus to your tool stacks.", + slug: "/category/tools", + keywords: ["guides"], }, items: [ - "cli-backup", - "cli-diag", - "cli-help", - "cli-license", - "cli-migrate", - "cli-restart", - "cli-restore", - "cli-service", - "cli-start", - "cli-status", - "cli-stop", - "cli-sync", - "cli-user", - "cli-version", + { + type: "doc", + id: "integration-grafana", + customProps: { tag: "Popular" }, + }, + "sling", + "kafka-connect", + { + label: "Push data to Timeplus via Airbyte", + type: "link", + href: "https://airbyte.com/connectors/timeplus", + }, + { + label: "Push data to Timeplus via Meltano", + type: "link", + href: "https://www.timeplus.com/post/meltano-timeplus-target", + }, + "flyway", + "terraform", ], }, - "jdbc", - { - label: "ODBC Driver", - type: "link", - href: "https://github.com/timeplus-io/proton-odbc", - }, - "timeplus-connect", - // { - // label: "Python Driver", - // type: "link", - // href: "https://github.com/timeplus-io/proton-python-driver", - // }, - { - label: "Go Driver", - type: "link", - href: "https://github.com/timeplus-io/proton-go-driver", - }, - { - label: "C++ Client", - type: "link", - href: "https://github.com/timeplus-io/timeplus-cpp", - }, + ], + }, + { + type: "category", + label: "Deployment & Operations", + items: [ { - label: "Rust Client", - type: "link", - href: "https://crates.io/crates/proton_client", + type: "category", + label: "Timeplus Enterprise Self-hosted", + items: [ + "bare-metal-install", + { + type: "doc", + id: "k8s-helm", + customProps: { tag: "Popular" }, + }, + 
], }, { - label: "Timeplus REST API", - type: "link", - href: "https://docs.timeplus.com/rest", + type: "doc", + id: "server_config", }, - "proton-ingest-api", + "proton-ports", { type: "doc", - id: "query-api", + id: "rbac", customProps: { tag: "Enterprise" }, }, ], }, { type: "category", - label: "Third-party Tools", - //collapsed: false, - link: { - type: "generated-index", - title: "Third-party Tools", - description: "Integrate Timeplus to your tool stacks.", - slug: "/category/tools", - keywords: ["guides"], - }, + label: "Monitoring & Troubleshooting", items: [ - { - type: "doc", - id: "integration-grafana", - customProps: { tag: "Popular" }, - }, - // "integration-metabase", - "sling", - "kafka-connect", - { - label: "Push data to Timeplus via Airbyte", - type: "link", - href: "https://airbyte.com/connectors/timeplus", - }, - { - label: "Push data to Timeplus via Meltano", - type: "link", - href: "https://www.timeplus.com/post/meltano-timeplus-target", - }, - "flyway", - "terraform", + "troubleshooting", + "system-stream-state-log", + "system-stream-metric-log", + "prometheus", ], }, - /*"faq" */ { type: "category", label: "Release Notes", - //collapsed: false, link: { type: "doc", id: "release-notes", @@ -552,10 +603,12 @@ const sidebars = { ], }, { - type: "category", - label: "Additional Resources", - items: ["getting-help", "credits"], + type: "doc", + id: "faq", + label: "FAQ", }, + "getting-help", + // "credits", ], }; diff --git a/spellchecker/dic.txt b/spellchecker/dic.txt index 08a4652a2..5f3db1cea 100644 --- a/spellchecker/dic.txt +++ b/spellchecker/dic.txt @@ -9,6 +9,7 @@ 30m 30min 30s +3x 500x1024x1024 _tp_message_key Abiola @@ -69,6 +70,7 @@ auth_params auth_type auto-rebalancing Auto-Rebalancing +autogen AutoMQ autoscaling autoscaling_mv @@ -237,6 +239,7 @@ deployable deploymentSlackWebhook deploymentSlackWebhookHundreds deploymentWebSocket +deserialization deserialize destination-timeplus desynchronized @@ -442,9 +445,11 @@ js-udaf js-udf JSON json +json_array_length json_cast json_encode json_has +json_merge_patch json_query json_value JSONEachRow @@ -467,6 +472,7 @@ kafka_schema_registry_url kcat keep-alives Keycloak +KeyValueService KiB Kinesis kinesis @@ -530,6 +536,7 @@ Minio minio modularized mongodb_throw_on_unsupported_query +monotonicity Monotonicity mouseover MQ @@ -581,10 +588,12 @@ num_of_trips numpy Numpy NVMe +o11y OAuth oauth2 oauth2 OAuth2 +Observability observability OBT odbc @@ -609,6 +618,7 @@ ORM os oss OSS +OTel owlshop-frontend-events p90 P90 @@ -650,6 +660,7 @@ ProduceRequests producerequests productional programmatically +prometheus Protobuf Protobuf-encoded protobuf_complex @@ -724,6 +735,7 @@ S3-based S3-compatible sa-east-1 SaaS +Salla sasl sasl SASL @@ -861,6 +873,7 @@ timeplus timeplus-appserver timeplus-connect timeplus-connector +timeplus-enterprise timeplus-io Timeplus-native-jdbc timeplus-native-jdbc @@ -877,6 +890,7 @@ timeplusd-client timeplusd_address:9363 TimeplusDatabaseMetadata TimeplusSinkConnector +TimeplusStreaming TimeplusWatermarkVisualization timeplusWeb timezones @@ -921,6 +935,7 @@ udaf-example udaf-lifecycle udf UDF +UDFDynamic UDFs UI UI/UX diff --git a/static/img/Option1_W.png b/static/img/Option1_W.png index 18b1227fc..2bfa41770 100644 Binary files a/static/img/Option1_W.png and b/static/img/Option1_W.png differ