7% throughput regression from active sampling when workload repository is enabled #59243
Labels
feature/developing
the related feature is in development
may-affects-5.4
This bug may affect 5.4.x versions.
may-affects-6.1
may-affects-6.5
may-affects-7.1
may-affects-7.5
may-affects-8.1
may-affects-8.5
severity/critical
sig/sql-infra
SIG: SQL Infra
type/bug
The issue is confirmed as a bug.
Bug Report
When the workload repository (#58247) is enabled, active sampling of tables appears to introduce a 7% throughput regression in the sysbench OLTP 90:10 read/write workload.
Profiling this revealed that sampling TIDB_HOT_REGIONS seems to cause this regression, due to using ListTables. Specifically, FindTableIndexOfRegion loops through is.AllSchemas and is.SchemaTableInfos, i.e. an O(N) loop over all database objects.
When TIDB_HOT_REGIONS is excluded from the list of tables to actively sample for the workload repository, there is no regression in throughput.
1. Minimal reproduce step (Required)
Run sysbench oltp_read_write for 30 minutes, with the following parameters:
--point-selects=9 --range-selects=false --index-updates=0 --non-index-updates=1 --delete-inserts=0 --tables=32 --table-size=10000000 --mysql-ignore-errors=1062,2013,8028,9007 --auto-inc=false
To enable the workload repository:
set global tidb_workload_repository_dest="table";
2. What did you expect to see? (Required)
No statistically significant throughput regressions.
3. What did you see instead (Required)
Averaged over 3 runs, the baseline transactions per second (TPS) was 4289.33, and the baseline queries per second (QPS) was 51152.33.
With the workload repository enabled, the average TPS over 3 runs dropped to 3989.67 (-7.24% change), and the average QPS over 3 runs dropped to 47501.67 (-7.40% change).
These measurements were collected by running sysbench on the following topology of Kingsoft Cloud instances:
4. What is your TiDB version? (Required)
This was observed on the master branch as of a3cc774.
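For illustration, the O(N) lookup pattern described in the bug report can be sketched in Go as follows. This is a minimal sketch, not TiDB's actual code: the types tableInfo, schemaInfo, and tableRef and the functions findTableLinear and buildTableIndex are hypothetical names introduced here, and the lookup is simplified to a table ID rather than a region key range. It only contrasts a per-lookup scan over every schema and table with a lookup against an index built once; it is not presented as the fix chosen for this issue.

```go
package main

import "fmt"

// tableInfo and schemaInfo are hypothetical stand-ins for catalog metadata.
// They are NOT TiDB's real infoschema types.
type tableInfo struct {
	ID   int64
	Name string
}

type schemaInfo struct {
	Name   string
	Tables []tableInfo
}

// tableRef identifies a table by schema name and table name.
type tableRef struct {
	Schema string
	Table  string
}

// findTableLinear walks every schema and every table until the ID matches.
// Each call costs O(N) in the total number of tables.
func findTableLinear(schemas []schemaInfo, tableID int64) (tableRef, bool) {
	for _, s := range schemas {
		for _, t := range s.Tables {
			if t.ID == tableID {
				return tableRef{Schema: s.Name, Table: t.Name}, true
			}
		}
	}
	return tableRef{}, false
}

// buildTableIndex performs the O(N) walk once and returns a map,
// so that each later lookup is a single map access.
func buildTableIndex(schemas []schemaInfo) map[int64]tableRef {
	idx := make(map[int64]tableRef)
	for _, s := range schemas {
		for _, t := range s.Tables {
			idx[t.ID] = tableRef{Schema: s.Name, Table: t.Name}
		}
	}
	return idx
}

func main() {
	schemas := []schemaInfo{
		{Name: "sbtest", Tables: []tableInfo{{ID: 101, Name: "sbtest1"}, {ID: 102, Name: "sbtest2"}}},
		{Name: "test", Tables: []tableInfo{{ID: 201, Name: "t"}}},
	}

	// Linear scan: repeated for every lookup.
	ref, ok := findTableLinear(schemas, 201)
	fmt.Println(ref, ok)

	// Indexed lookup: the scan cost is paid once up front.
	idx := buildTableIndex(schemas)
	ref, ok = idx[102]
	fmt.Println(ref, ok)
}
```

When such a lookup runs on every sampling interval, the linear variant repeats the full walk each time, which is consistent with the profile described in the report. The indexed variant trades that for memory and for the need to rebuild the map when the schema changes.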