Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table collation doesn't follow schema #58446

Open
dveeden opened this issue Dec 21, 2024 · 1 comment · May be fixed by #58582
Open

Table collation doesn't follow schema #58446

dveeden opened this issue Dec 21, 2024 · 1 comment · May be fixed by #58582
Labels
affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. compatibility-mysql8 This is a compatibility issue with MySQL 8.0(but NOT 5.7) component/ddl This issue is related to DDL of TiDB. severity/major type/bug The issue is confirmed as a bug. type/compatibility

Comments

@dveeden
Copy link
Contributor

dveeden commented Dec 21, 2024

Bug Report

When creating a table the collation should be the collation from the schema if it is not explicitly set for the table. This doesn't seem to be the case with TiDB if the character set is specified explicitly, but the collation is not.

1. Minimal reproduce step (Required)

CREATE SCHEMA s1 DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
USE s1;
CREATE TABLE t11 (id INT PRIMARY KEY);
CREATE TABLE t12 (id INT PRIMARY KEY) CHARACTER SET utf8;
CREATE TABLE t13 (id INT PRIMARY KEY) CHARACTER SET utf8 COLLATE utf8_general_ci;
SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.tables WHERE TABLE_SCHEMA='s1';

2. What did you expect to see? (Required)

MySQL 9.1.0:

mysql-9.1.0> CREATE SCHEMA s1 DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected, 2 warnings (0.01 sec)

mysql-9.1.0> USE s1;
Database changed
mysql-9.1.0> CREATE TABLE t11 (id INT PRIMARY KEY);
Query OK, 0 rows affected (0.01 sec)

mysql-9.1.0> CREATE TABLE t12 (id INT PRIMARY KEY) CHARACTER SET utf8;
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql-9.1.0> CREATE TABLE t13 (id INT PRIMARY KEY) CHARACTER SET utf8 COLLATE utf8_general_ci;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql-9.1.0> SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.tables WHERE TABLE_SCHEMA='s1';
+------------+--------------------+
| TABLE_NAME | TABLE_COLLATION    |
+------------+--------------------+
| t11        | utf8mb3_general_ci |
| t12        | utf8mb3_general_ci |
| t13        | utf8mb3_general_ci |
+------------+--------------------+
3 rows in set (0.00 sec)

3. What did you see instead (Required)

TiDB v8.5.0

mysql-8.0.11-TiDB-v8.5.0> CREATE SCHEMA s1 DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
Query OK, 0 rows affected (0.05 sec)

mysql-8.0.11-TiDB-v8.5.0> USE s1;
Database changed
mysql-8.0.11-TiDB-v8.5.0> CREATE TABLE t11 (id INT PRIMARY KEY);
Query OK, 0 rows affected (0.04 sec)

mysql-8.0.11-TiDB-v8.5.0> CREATE TABLE t12 (id INT PRIMARY KEY) CHARACTER SET utf8;
Query OK, 0 rows affected (0.04 sec)

mysql-8.0.11-TiDB-v8.5.0> CREATE TABLE t13 (id INT PRIMARY KEY) CHARACTER SET utf8 COLLATE utf8_general_ci;
Query OK, 0 rows affected (0.04 sec)

mysql-8.0.11-TiDB-v8.5.0> SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.tables WHERE TABLE_SCHEMA='s1';
+------------+-----------------+
| TABLE_NAME | TABLE_COLLATION |
+------------+-----------------+
| t11        | utf8_general_ci |
| t12        | utf8_bin        |
| t13        | utf8_general_ci |
+------------+-----------------+
3 rows in set (0.00 sec)

4. What is your TiDB version? (Required)

Release Version: v8.5.0
Edition: Community
Git Commit Hash: d13e52ed6e22cc5789bed7c64c861578cd2ed55b
Git Branch: HEAD
UTC Build Time: 2024-12-18 02:26:06
GoVersion: go1.23.3
Race Enabled: false
Check Table Before Drop: false
Store: tikv
@dveeden dveeden added compatibility-mysql8 This is a compatibility issue with MySQL 8.0(but NOT 5.7) type/bug The issue is confirmed as a bug. type/compatibility component/executor component/ddl This issue is related to DDL of TiDB. and removed component/executor labels Dec 21, 2024
@dveeden dveeden added affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. and removed may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 may-affects-7.5 may-affects-8.1 may-affects-8.5 may-affects-7.1 labels Dec 24, 2024
@dveeden dveeden added the affects-7.1 This bug affects the 7.1.x(LTS) versions. label Dec 24, 2024
@dveeden dveeden linked a pull request Dec 27, 2024 that will close this issue
13 tasks
@dveeden
Copy link
Contributor Author

dveeden commented Dec 30, 2024

The behavior for the utf8mb4 character set is more complex due to default_collation_for_utf8mb4.

mysql-9.1.0> \W
Show warnings enabled.
mysql-9.1.0> CREATE SCHEMA s1 COLLATE utf8mb4_unicode_ci;
Query OK, 1 row affected (0.02 sec)

mysql-9.1.0> SET SESSION default_collation_for_utf8mb4='utf8mb4_general_ci';
Query OK, 0 rows affected, 1 warning (0.00 sec)

Warning (Code 1681): Updating 'default_collation_for_utf8mb4' is deprecated. It will be made read-only in a future release.
mysql-9.1.0> USE s1
Database changed
mysql-9.1.0> CREATE TABLE t1 (id INT PRIMARY KEY);
Query OK, 0 rows affected (0.02 sec)

mysql-9.1.0> CREATE TABLE t2 (id INT PRIMARY KEY) DEFAULT CHARACTER SET utf8mb4;
Query OK, 0 rows affected (0.02 sec)

mysql-9.1.0> SHOW VARIABLES LIKE 'collation%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_0900_ai_ci |
| collation_database   | utf8mb4_unicode_ci |
| collation_server     | utf8mb4_0900_ai_ci |
+----------------------+--------------------+
3 rows in set (0.00 sec)

mysql-9.1.0> SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.tables WHERE TABLE_SCHEMA='s1';
+------------+--------------------+
| TABLE_NAME | TABLE_COLLATION    |
+------------+--------------------+
| t1         | utf8mb4_unicode_ci |
| t2         | utf8mb4_general_ci |
+------------+--------------------+
2 rows in set (0.00 sec)

mysql-9.1.0> SET SESSION default_collation_for_utf8mb4='utf8mb4_0900_ai_ci';
Query OK, 0 rows affected, 1 warning (0.00 sec)

Warning (Code 1681): Updating 'default_collation_for_utf8mb4' is deprecated. It will be made read-only in a future release.
mysql-9.1.0> CREATE TABLE t3 (id INT PRIMARY KEY) DEFAULT CHARACTER SET utf8mb4;
Query OK, 0 rows affected (0.02 sec)

mysql-9.1.0> SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.tables WHERE TABLE_SCHEMA='s1';
+------------+--------------------+
| TABLE_NAME | TABLE_COLLATION    |
+------------+--------------------+
| t1         | utf8mb4_unicode_ci |
| t2         | utf8mb4_general_ci |
| t3         | utf8mb4_0900_ai_ci |
+------------+--------------------+
3 rows in set (0.00 sec)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. compatibility-mysql8 This is a compatibility issue with MySQL 8.0(but NOT 5.7) component/ddl This issue is related to DDL of TiDB. severity/major type/bug The issue is confirmed as a bug. type/compatibility
Projects
None yet
2 participants