TP queries can get lots of performance benefits from Plan Cache, but the current TiDB Plan Cache can only support Prepare/Execute Statements. Most of our users are still using general(non-prepared) queries to access TiDB instead of Prepare/Execute. So if Plan Cache can support general queries, these users can gain some performance benefits as well. Some other mature Databases have already supported this feature: 1) Oracle and 2) OceanBase.
As the left part of the above diagram show, Plan Cache can be split into 2 parts logically:
- The Frontend: this part is used to deal with different protocols (binary or text), and prepare all arguments required by the backend (SQL, AST, Normalized Parameters, etc.)
- The Backend: find an available cached Plan from the cache or generate a new Plan for the input arguments
And more specifically, the backend can be abstracted as a function GetPlan(SQL, AST, Params)
, and it works as below:
- Look up the Plan Cache to find a cached plan for the input SQL;
- If it can find an available plan, fill this plan with the input parameters, and return the plan; (cache hit)
- If not, pass
(AST, Params)
to the optimizer to generate a new plan, return this plan, and put this plan into the cache; (cache miss)
To support Non-Prepared Plan Cache, logically we can re-use the whole backend, and just update the frontend a little:
- Check whether the current query is supported by Non-Prepared Plan Cache;
- If not, just fall back to the normal optimization code path;
- If yes:
- Normalize this query and prepare all arguments
(SQL, AST, Params)
; - Pass these arguments to the Plan Cache backend to get a Plan;
- Normalize this query and prepare all arguments
Currently, TiDB only caches one plan for a parameterized query.
For example, the queries SELECT * FROM t WHERE a < 1
and SELECT * FROM t WHERE a < 100000
share the same parameterized form, SELECT * FROM t WHERE a < ?
, and thus share the same plan.
Due to the preceding risks and the fact that the execution plan cache only provides significant benefits for simple queries (if a query is complex and takes a long time to execute, using the execution plan cache might not be very helpful), TiDB has strict restrictions on the scope of non-prepared plan cache. The restrictions are as follows:
- Queries or plans that are not supported by the Prepared plan cache are also not supported by the non-prepared plan cache.
- Currently, only point get or range queries on a single table that contain
Scan
,Selection
, orProjection
operators are supported, such asSELECT * FROM t WHERE a < 10 AND b in (1, 2)
. - Queries that contain complex operators such as
Agg
,Limit
,Window
, orSort
are not supported. - Queries that contain non-range query conditions are not supported, such as:
LIKE
is not supported, such asc LIKE 'c%'
.+
operation is not supported, such asa+1 < 2
.- Queries that filter on columns of
JSON
,ENUM
,SET
, orBIT
type are not supported, such asSELECT * FROM t WHERE json_col = '{}'
. - Queries that filter on
NULL
values are not supported, such asSELECT * FROM t WHERE a is NULL
. - Queries with more than 50 parameters after parameterization are not supported, such as
SELECT * FROM t WHERE a in (1, 2, 3, ... 51)
. - Queries that access partitioned tables, virtual columns, temporary tables, views, or memory tables are not supported, such as
SELECT * FROM INFORMATION_SCHEMA.COLUMNS
, whereCOLUMNS
is a TiDB memory table. - Queries with hints, subqueries, or locks are not supported.
- DML statements are not supported.
We'll solve these limitations step by step later on.
Prepared Plan Cache and Non-Prepared Plan Cache are totally separated, cached plans in them do not affect each other.
We implement a cacheable checker to check whether a query can be supported by Non-Prepared Plan Cache(cacheable_checker). If a query is not supported, we can just fall back to the normal optimization code path.
Plan Cache uses parameterized SQL as the key, so we need a way that can parameterize any SQL fast.
For example, parameterize select * from t where a<10
to select * from t where a<?
;
Currently, we implement this on AST
:
- First, we iterate the
AST
and replace all constant values with?
; - Then, we call
AST.Restore
to get the parameterized SQL;
Introduce 2 variables below to control Non-Prepared Plan Cache behaviors:
tidb_enable_non_prepared_plan_cache
: indicate whether to enable this feature;tidb_non_prepared_plan_cache_size
: indicate the cache size;