[WIP] add release notes v9.0.0 #20510

Open
wants to merge 52 commits into
base: feature/preview-release-notes
52 commits
8322060
Create release-9.0.0.md
hfxsd Mar 12, 2025
860aaf8
performance: update function pushdown
Oreoxmt Mar 24, 2025
fa63356
Apply suggestions from code review
hfxsd Mar 26, 2025
24c29ac
observability: add SQL cross-AZ traffic monitoring
Oreoxmt Mar 26, 2025
d2ae035
update sync-diff-inspector
Oreoxmt Mar 26, 2025
a67f653
add Offline package changes
Oreoxmt Mar 26, 2025
f9c5251
Apply suggestions from code review
hfxsd Mar 26, 2025
86af836
Update releases/release-9.0.0.md
hfxsd Mar 27, 2025
3b1290a
add English translation for 3 features
qiancai Mar 27, 2025
606a417
update the descriptions for two features
qiancai Mar 27, 2025
4712417
update the feature description for #9673
qiancai Mar 28, 2025
119eecf
Update release-9.0.0.md
hfxsd Mar 27, 2025
b139892
Apply suggestions from code review
Oreoxmt Mar 28, 2025
923a72b
Apply suggestions from code review
hfxsd Mar 31, 2025
cb81735
Apply suggestions from code review
hfxsd Apr 7, 2025
cefed67
add compatibility changes
qiancai Apr 7, 2025
7064fcd
Update releases/release-9.0.0.md
hfxsd Apr 8, 2025
462ddaa
add DB operations
Oreoxmt Apr 14, 2025
64be953
update DM log redaction
Oreoxmt Apr 14, 2025
7290c8f
Compatibility changes: add tidb_hash_join_version, hashagg_use_magic_…
Oreoxmt Apr 14, 2025
f0382ae
Apply suggestions from code review
Oreoxmt Apr 15, 2025
2be9e03
minor format updates
qiancai Apr 16, 2025
d19e194
Update a doc link
lilin90 Apr 17, 2025
dfe4353
Revert "Merge remote-tracking branch 'upstream/master' into rn-9.0.0"
hfxsd Apr 21, 2025
90f9998
Reapply "Merge remote-tracking branch 'upstream/master' into rn-9.0.0"
qiancai Apr 21, 2025
8feb43a
Update releases/release-9.0.0.md
hfxsd Apr 21, 2025
94359f9
Apply suggestions from code review
hfxsd Apr 21, 2025
c767416
Update release-9.0.0.md
hfxsd Apr 21, 2025
648384e
docs: add attribution for KNN vector search description (#20809)
shizn Apr 21, 2025
642b4df
cloud: remove outdated cloud roadmap (#20797) (#20798)
ti-chi-bot Apr 21, 2025
621de48
add saas scenario best practices (#20668)
hfxsd Apr 21, 2025
5685cad
tiproxy: add a guide to enable tiproxy using tiup (#20799)
djshow832 Apr 21, 2025
fe33d98
update the default collation of GBK from gbk_bin to gbk_chinese_ci (#…
Oreoxmt Apr 22, 2025
d85c464
release notes: fix version in links (#20835)
hfxsd Apr 23, 2025
6f015e5
Apply suggestions from code review
qiancai Apr 23, 2025
1a43e2c
serverless support import into (#20832)
zeminzhou Apr 23, 2025
ea026d3
Add TiCDC related release notes
benmeadowcroft Apr 26, 2025
2a22190
Add BR/PITR related release notes
benmeadowcroft Apr 26, 2025
f863c16
Updated to add PD release notes, and fix typo
benmeadowcroft Apr 26, 2025
8852fa8
pd configuration: add dashboard.disable-custom-prom-addr (#20853)
qiancai Apr 27, 2025
b44cf62
toc: add TiDB release support policy (#19119) (#20860)
ti-chi-bot Apr 27, 2025
af3e688
releases: add one br entry to v8.1.2 (#20861)
BornChanger Apr 27, 2025
b89e89c
add FAQs about collation for JDBC connections (#20848)
qiancai Apr 28, 2025
36dfd3b
hardware-and-software-requirements: update Kylin Euler to Kylin (#20874)
qiancai Apr 29, 2025
8c30d7f
JDBC URL: update the letter case for defaultFetchSize (#20884)
qiancai Apr 30, 2025
6b8e13b
planner: add doc for `tidb_ignore_inlist_plan_digest`. (#20870)
qiancai Apr 30, 2025
3d67e56
fix(pd): add missing configuration items for PD (#20883)
qiancai Apr 30, 2025
a5b3f19
tikv: recorrect the settings of some configs and supplement missing a…
qiancai Apr 30, 2025
4f18474
planner: add doc for `tidb_ignore_inlist_plan_digest`. (#20870) (#208…
ti-chi-bot Apr 30, 2025
f8ece18
Merge remote-tracking branch 'upstream/master' into rn-9.0.0
hfxsd May 6, 2025
ea60a95
Update releases/release-9.0.0.md
hfxsd May 6, 2025
80a20ff
assign Ben's notes to tw
hfxsd May 6, 2025
1 change: 0 additions & 1 deletion TOC-tidb-cloud.md
@@ -7,7 +7,6 @@
- [Architecture](/tidb-cloud/tidb-cloud-intro.md#architecture)
- [High Availability](/tidb-cloud/high-availability-with-multi-az.md)
- [MySQL Compatibility](/mysql-compatibility.md)
- [Roadmap](/tidb-cloud/tidb-cloud-roadmap.md)
- Get Started
- [Try Out TiDB Cloud](/tidb-cloud/tidb-cloud-quickstart.md)
- [Try Out TiDB + AI](/vector-search/vector-search-get-started-using-python.md)
2 changes: 2 additions & 0 deletions TOC.md
@@ -432,6 +432,7 @@
- [Local Read Under Three Data Centers Deployment](/best-practices/three-dc-local-read.md)
- [Use UUIDs](/best-practices/uuid.md)
- [Read-Only Storage Nodes](/best-practices/readonly-nodes.md)
- [SaaS Multi-Tenant Scenarios](/best-practices/saas-best-practices.md)
- [Use Placement Rules](/configure-placement-rules.md)
- [Use Load Base Split](/configure-load-base-split.md)
- [Use Store Limit](/configure-store-limit.md)
@@ -1091,6 +1092,7 @@
- [All Releases](/releases/release-notes.md)
- [Release Timeline](/releases/release-timeline.md)
- [TiDB Versioning](/releases/versioning.md)
- [Release Support Policy](https://www.pingcap.com/tidb-release-support-policy/)
- [TiDB Installation Packages](/binary-package.md)
- v8.5
- [8.5.1](/releases/release-8.5.1.md)
6 changes: 0 additions & 6 deletions _docHome.md
@@ -38,12 +38,6 @@ Explore native support of Vector Search in TiDB Cloud Serverless to build your A

</DocHomeCard>

<DocHomeCard href="/tidbcloud/tidb-cloud-roadmap" label="TiDB Cloud Roadmap" icon="cloud-roadmap-mauve">

Planned features and releases for TiDB Cloud.

</DocHomeCard>

</DocHomeCardContainer>

</DocHomeSection>
97 changes: 97 additions & 0 deletions best-practices/saas-best-practices.md
@@ -0,0 +1,97 @@
---
title: Best Practices for SaaS Multi-Tenant Scenarios
summary: Learn best practices for TiDB in SaaS (Software as a Service) multi-tenant scenarios, especially for environments where the number of tables in a single cluster exceeds one million.
---

# Best Practices for SaaS Multi-Tenant Scenarios

This document introduces best practices for TiDB in SaaS (Software as a Service) multi-tenant environments, especially in scenarios where the **number of tables in a single cluster exceeds one million**. By making reasonable configurations and choices, you can enable TiDB to run efficiently and stably in SaaS scenarios while reducing resource consumption and costs.

> **Note:**
>
> It is recommended to use TiDB v8.5.0 or later versions.

## TiDB hardware recommendations

It is recommended to use high-memory TiDB instances. For example:

- For one million tables, use 32 GiB or more memory.
- For three million tables, use 64 GiB or more memory.

High-memory TiDB instances can allocate more cache space for Infoschema, statistics, and execution plans, which improves cache hit rates and, in turn, business performance. Larger memory also mitigates performance fluctuations and stability issues caused by TiDB GC.

Recommended hardware configurations for TiKV and PD are as follows:

* TiKV: 8 vCPUs and 32 GiB or more memory.
* PD: 8 vCPUs and 16 GiB or more memory.

## Control the number of Regions

If you need to create more than 100,000 tables, it is recommended to set the TiDB configuration item [`split-table`](/tidb-configuration-file.md#split-table) to `false`. This reduces the number of Regions and alleviates memory pressure on TiKV.
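
As a quick check, the effective value of `split-table` can be inspected from a SQL client. This is a minimal sketch that assumes the `SHOW CONFIG` statement is available in your deployment; the item itself is changed in the TiDB configuration file, not through this statement.

```sql
-- Show the effective value of split-table on each TiDB instance.
SHOW CONFIG WHERE type = 'tidb' AND name = 'split-table';
```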

## Configure caches

* Starting from TiDB v8.4.0, TiDB loads table information involved in SQL statements into the Infoschema cache on demand during SQL execution.

    - You can monitor the size and hit rate of the Infoschema cache by observing the **Infoschema v2 Cache Size** and **Infoschema v2 Cache Operation** sub-panels under the **Schema Load** panel in TiDB Dashboard.
    - You can use the [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800) system variable to adjust the memory limit of the Infoschema cache to meet business needs. The size of the Infoschema cache is linearly related to the number of different tables involved in SQL execution. In actual tests, fully caching metadata for one million tables (each with four columns, one primary key, and one index) requires about 2.4 GiB of memory.

* TiDB loads table statistics involved in SQL statements into the Statistics cache on demand during SQL execution.

    - You can monitor the size and hit rate of the Statistics cache by observing the **Stats Cache Cost** and **Stats Cache OPS** sub-panels under the **Statistics & Plan Management** panel in TiDB Dashboard.
    - You can use the [`tidb_stats_cache_mem_quota`](/system-variables.md#tidb_stats_cache_mem_quota-new-in-v610) system variable to adjust the memory limit of the Statistics cache to meet business needs. In actual tests, executing simple SQL (using the `IndexRangeScan` operator) on 100,000 tables consumes about 3.96 GiB of memory in the Statistics cache.
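
Both memory limits described above are global system variables, so they can be adjusted from a SQL client. The following is a minimal sketch; the byte values are illustrative assumptions sized loosely from the test figures above, not recommended settings.

```sql
-- Inspect the current limits (values are in bytes).
SHOW GLOBAL VARIABLES LIKE 'tidb_schema_cache_size';
SHOW GLOBAL VARIABLES LIKE 'tidb_stats_cache_mem_quota';

-- Assumed example: allow about 3 GiB for the Infoschema cache.
SET GLOBAL tidb_schema_cache_size = 3221225472;

-- Assumed example: cap the Statistics cache at about 4 GiB.
SET GLOBAL tidb_stats_cache_mem_quota = 4294967296;
```

After changing either value, the cache size and hit-rate panels mentioned above are the place to confirm whether the new limit fits the workload.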

## Collect statistics

* Starting from TiDB v8.4.0, TiDB introduces the [`tidb_auto_analyze_concurrency`](/system-variables.md#tidb_auto_analyze_concurrency-new-in-v840) system variable to control the number of concurrent auto-analyze operations that can run in a TiDB cluster. In multi-table scenarios, you can increase this concurrency as needed to improve the throughput of automatic analysis. As the concurrency value increases, the throughput and the CPU usage of the TiDB Owner node increase linearly. In actual tests, using a concurrency value of 16 allows automatic analysis of 320 tables (each with 10,000 rows, 4 columns, and 1 index) within one minute, consuming one CPU core of the TiDB Owner node.
* The [`tidb_auto_build_stats_concurrency`](/system-variables.md#tidb_auto_build_stats_concurrency-new-in-v650) and [`tidb_build_sampling_stats_concurrency`](/system-variables.md#tidb_build_sampling_stats_concurrency-new-in-v750) system variables control the concurrency of TiDB statistics construction. You can adjust them based on your scenario:
    - For scenarios with many partitioned tables, prioritize increasing the value of `tidb_auto_build_stats_concurrency`.
    - For scenarios with many columns, prioritize increasing the value of `tidb_build_sampling_stats_concurrency`.
* To avoid excessive resource usage, ensure that the product of `tidb_auto_analyze_concurrency`, `tidb_auto_build_stats_concurrency`, and `tidb_build_sampling_stats_concurrency` does not exceed the number of TiDB CPU cores.
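
The product rule above can be sketched in SQL. The values below are assumptions chosen for a hypothetical TiDB node with 16 CPU cores (4 × 2 × 2 = 16); they are not tuned recommendations.

```sql
-- Assumed example for a TiDB node with 16 CPU cores: 4 * 2 * 2 = 16.
SET GLOBAL tidb_auto_analyze_concurrency = 4;
SET GLOBAL tidb_auto_build_stats_concurrency = 2;
SET GLOBAL tidb_build_sampling_stats_concurrency = 2;

-- Compute the product so that it can be compared against the CPU core count.
SELECT @@GLOBAL.tidb_auto_analyze_concurrency
     * @@GLOBAL.tidb_auto_build_stats_concurrency
     * @@GLOBAL.tidb_build_sampling_stats_concurrency AS concurrency_product;
```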

## Query system tables efficiently

When querying system tables, it is recommended to add filters such as `TABLE_SCHEMA`, `TABLE_NAME`, or `TIDB_TABLE_ID` to avoid scanning a large amount of irrelevant data. This improves query speed and reduces resource consumption.

For example, in a scenario with three million tables:

- Executing the following SQL statement consumes about 8 GiB of memory.

    ```sql
    SELECT COUNT(*) FROM information_schema.tables;
    ```

- Executing the following SQL statement takes about 20 minutes.

    ```sql
    SELECT COUNT(*) FROM information_schema.views;
    ```

After you add appropriate filter conditions to the preceding SQL statements, memory consumption becomes negligible and query time drops to milliseconds.
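
For illustration, filtered versions of the two statements might look like the following. The schema name `tenant_0001` and the view name `orders_summary` are hypothetical placeholders.

```sql
-- Count only the tables that belong to one tenant database.
SELECT COUNT(*)
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'tenant_0001';

-- Look up a single view instead of scanning all views.
SELECT TABLE_NAME
FROM information_schema.views
WHERE TABLE_SCHEMA = 'tenant_0001'
  AND TABLE_NAME = 'orders_summary';
```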

## Handle connection-intensive scenarios

In SaaS multi-tenant scenarios, each user usually connects to TiDB to operate on data in their own tenant (database). To support a high number of connections:

* Increase the TiDB configuration item [`token-limit`](/tidb-configuration-file.md#token-limit) (`1000` by default) to support more concurrent requests.
* The memory usage of TiDB is roughly linear with the number of connections. In actual tests, 200,000 idle connections increase TiDB memory usage by about 30 GiB. It is recommended to increase TiDB memory specifications based on actual connection numbers.
* If you use `PREPARED` statements, each connection maintains a session-level Prepared Plan Cache. If the `DEALLOCATE` statement is not executed for a long time, the cache might accumulate too many plans, increasing memory usage. In actual tests, 400,000 execution plans involving `IndexRangeScan` consume approximately 5 GiB of memory. It is recommended to increase memory specifications accordingly.
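
The prepared-statement lifecycle described above can be sketched as follows. The table `tenant_0001.orders` and its `id` column are hypothetical, and the `SHOW CONFIG` check assumes that statement is available in your deployment.

```sql
-- Prepare and run a statement; its plan can be kept in the session-level cache.
PREPARE get_order FROM 'SELECT * FROM tenant_0001.orders WHERE id = ?';
SET @order_id = 1;
EXECUTE get_order USING @order_id;

-- Release the statement once the session no longer needs it.
DEALLOCATE PREPARE get_order;

-- Check the configured token-limit on each TiDB instance.
SHOW CONFIG WHERE type = 'tidb' AND name = 'token-limit';
```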

## Use stale read carefully

When you use [Stale Read](/stale-read.md), an outdated schema version might trigger a full load of historical schemas, which can significantly impact performance. To mitigate this issue, increase the value of [`tidb_schema_version_cache_limit`](/system-variables.md#tidb_schema_version_cache_limit-new-in-v740) (for example, to `255`).
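
A minimal sketch of the mitigation, followed by a stale read query, is shown below. The table `tenant_0001.orders` is a hypothetical placeholder, and the 10-second staleness is an arbitrary choice for illustration.

```sql
-- Keep more historical schema versions cached.
SET GLOBAL tidb_schema_version_cache_limit = 255;

-- Read the data as it was about 10 seconds ago.
SELECT * FROM tenant_0001.orders AS OF TIMESTAMP NOW() - INTERVAL 10 SECOND;
```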

## Optimize BR backup and restore

* When restoring a full backup with millions of tables, it is recommended to use high-memory BR instances. For example:
    - For one million tables, use BR instances with 32 GiB or more memory.
    - For three million tables, use BR instances with 64 GiB or more memory.
* BR log backup and snapshot restore consume additional TiKV memory. It is recommended to use TiKV instances with 32 GiB or more memory.
* Adjust BR configurations [`pitr-batch-count` and `pitr-concurrency`](/br/use-br-command-line-tool.md#common-options) as needed to improve log restore speed.

## Import data with TiDB Lightning

When importing millions of tables using [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md), follow these recommendations:

- For large tables (over 100 GiB), use TiDB Lightning [physical import mode](/tidb-lightning/tidb-lightning-physical-import-mode.md).
- For small tables (typically numerous in quantity), use TiDB Lightning [logical import mode](/tidb-lightning/tidb-lightning-logical-import-mode.md).
2 changes: 1 addition & 1 deletion character-set-and-collation.md
@@ -104,7 +104,7 @@ SHOW CHARACTER SET;
+---------+-------------------------------------+-------------------+--------+
| ascii | US ASCII | ascii_bin | 1 |
| binary | binary | binary | 1 |
| gbk | Chinese Internal Code Specification | gbk_bin | 2 |
| gbk | Chinese Internal Code Specification | gbk_chinese_ci | 2 |
| latin1 | Latin1 | latin1_bin | 1 |
| utf8 | UTF-8 Unicode | utf8_bin | 3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_bin | 4 |
44 changes: 10 additions & 34 deletions character-set-gbk.md
@@ -5,7 +5,9 @@ summary: This document provides details about the TiDB support of the GBK charac

# GBK

Since v5.4.0, TiDB supports the GBK character set. This document provides the TiDB support and compatibility information of the GBK character set.
Starting from v5.4.0, TiDB supports the GBK character set. This document provides the TiDB support and compatibility information of the GBK character set.

Starting from v6.0.0, TiDB enables the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) by default. The default collation for the GBK character set in TiDB is `gbk_chinese_ci`, which is consistent with MySQL.

```sql
SHOW CHARACTER SET WHERE CHARSET = 'gbk';
@@ -15,7 +17,7 @@ SHOW CHARACTER SET WHERE CHARSET = 'gbk';
+---------+-------------------------------------+-------------------+--------+
| Charset | Description | Default collation | Maxlen |
+---------+-------------------------------------+-------------------+--------+
| gbk | Chinese Internal Code Specification | gbk_bin | 2 |
| gbk | Chinese Internal Code Specification | gbk_chinese_ci | 2 |
+---------+-------------------------------------+-------------------+--------+
1 row in set (0.00 sec)
```
@@ -40,48 +42,22 @@ This section provides the compatibility information between MySQL and TiDB.

### Collations

The default collation of the GBK character set in MySQL is `gbk_chinese_ci`. Unlike MySQL, the default collation of the GBK character set in TiDB is `gbk_bin`. Additionally, because TiDB converts GBK to `utf8mb4` and then uses a binary collation, the `gbk_bin` collation in TiDB is not the same as the `gbk_bin` collation in MySQL.

<CustomContent platform="tidb">

To make TiDB compatible with the collations of MySQL GBK character set, when you first initialize the TiDB cluster, you need to set the TiDB option [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) to `true` to enable the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations). This is the default setting for new deployments.
The default collation of the GBK character set in MySQL is `gbk_chinese_ci`. The default collation for the GBK character set in TiDB depends on the value of the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap):

- By default, the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) is set to `true`, which means that the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) is enabled and the default collation for the GBK character set is `gbk_chinese_ci`.
- When the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) is set to `false`, the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) is disabled, and the default collation for the GBK character set is `gbk_bin`.

</CustomContent>

<CustomContent platform="tidb-cloud">

To make TiDB compatible with the collations of MySQL GBK character set, when you first initialize the TiDB cluster, TiDB Cloud enables the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) by default.
By default, TiDB Cloud enables the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations), and the default collation for the GBK character set is `gbk_chinese_ci`.

</CustomContent>

After enabling the new framework for collations, if you check the collations corresponding to the GBK character set, you can see that the TiDB GBK default collation is changed to `gbk_chinese_ci`.

```sql
SHOW CHARACTER SET WHERE CHARSET = 'gbk';
```

```
+---------+-------------------------------------+-------------------+--------+
| Charset | Description | Default collation | Maxlen |
+---------+-------------------------------------+-------------------+--------+
| gbk | Chinese Internal Code Specification | gbk_chinese_ci | 2 |
+---------+-------------------------------------+-------------------+--------+
1 row in set (0.00 sec)
```

```sql
SHOW COLLATION WHERE CHARSET = 'gbk';
```

```
+----------------+---------+----+---------+----------+---------+---------------+
| Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+----------------+---------+----+---------+----------+---------+---------------+
| gbk_bin | gbk | 87 | | Yes | 1 | PAD SPACE |
| gbk_chinese_ci | gbk | 28 | Yes | Yes | 1 | PAD SPACE |
+----------------+---------+----+---------+----------+---------+---------------+
2 rows in set (0.00 sec)
```
Additionally, because TiDB converts GBK to `utf8mb4` and then uses a binary collation, the `gbk_bin` collation in TiDB is not the same as the `gbk_bin` collation in MySQL.
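
To see which case applies to a given cluster, the following checks can help. This is a sketch: reading `mysql.tidb` assumes sufficient privileges, and `gbk_demo` is a hypothetical table name.

```sql
-- Check whether the new framework for collations is enabled.
SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'new_collation_enabled';

-- List the collations available for GBK and which one is marked as the default.
SHOW COLLATION WHERE CHARSET = 'gbk';

-- Name the collation explicitly instead of relying on the default.
CREATE TABLE gbk_demo (name VARCHAR(20)) CHARSET = gbk COLLATE = gbk_bin;
```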

### Illegal character compatibility

16 changes: 15 additions & 1 deletion develop/dev-guide-sample-application-java-jdbc.md
@@ -14,9 +14,23 @@ In this tutorial, you can learn how to use TiDB and JDBC to accomplish the follo
- Connect to your TiDB cluster using JDBC.
- Build and run your application. Optionally, you can find [sample code snippets](#sample-code-snippets) for basic CRUD operations.

<CustomContent platform="tidb">

> **Note:**
>
> This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
> - This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
> - Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the collation used in a JDBC connection depends on the JDBC driver version. For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#collation-used-in-jdbc-connections).

</CustomContent>

<CustomContent platform="tidb-cloud">

> **Note:**
>
> - This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
> - Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the collation used in a JDBC connection depends on the JDBC driver version. For more information, see [Collation used in JDBC connections](https://docs.pingcap.com/tidb/stable/sql-faq#collation-used-in-jdbc-connections).

</CustomContent>
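
Because the resulting collation depends on the driver version, it can be worth checking what a live connection actually uses. The statements below are a minimal sketch to run through the same JDBC connection; they only read session state.

```sql
-- Show the collation that the current connection is using.
SELECT @@collation_connection;

-- Show the related character set settings for the session.
SHOW VARIABLES LIKE 'character_set_%';
```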

## Prerequisites
