executor: add more explain analyze info for hash join spill #59255

Open
wants to merge 11 commits into
base: master

Contributor

@xzhangxian1008 xzhangxian1008 commented Feb 5, 2025

What problem does this PR solve?

Issue Number: close #59264

mysql> explain analyze select o_orderkey, exists (select 1 from lineitem where lineitem.l_suppkey = orders.o_custkey) from orders;
+-----------------------------+-------------+----------+-----------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------+----------+---------+
| id                          | estRows     | actRows  | task      | access object  | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                | operator info                                                                                                  | memory   | disk    |
+-----------------------------+-------------+----------+-----------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------+----------+---------+
| HashJoin_9                  | 16824768.00 | 15000000 | root      |                | time:21.3s, open:160µs, close:25.9µs, loops:14781, RU:233149.49, build_hash_table:{total:15.9s, fetch:9.08s, build:6.86s}, probe:{concurrency:5, total:19.5s, max:19.5s, probe:2.97s, fetch_and_wait:16.5s, probe_collision:724263}, spill:{round:2, spilled_partition_num_per_round:[8/8 48/64], total_spill_GiB_per_round:[3.83 3.02], build_spill_GiB_per_round:[3.27 2.60]}                                                                               | left outer semi join, left side:TableReader_11, equal:[eq(tpch10.orders.o_custkey, tpch10.lineitem.l_suppkey)] | 200.5 MB | 3.65 GB |
| ├─TableReader_13(Build)     | 59986052.00 | 59986052 | root      |                | time:1.25s, open:75.3µs, close:3.69µs, loops:58659, cop_task: {num: 1605, max: 65.7ms, min: 295µs, avg: 28.3ms, p95: 51.2ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 43.6s, tot_wait: 94.8ms, copr_cache_hit_ratio: 0.00, build_task_duration: 52.4µs, max_distsql_concurrency: 15}, rpc_info:{Cop:{num_rpc:1605, total_time:45.3s}}                                                                                                            | data:TableFullScan_12                                                                                          | 6.14 MB  | N/A     |
| │ └─TableFullScan_12        | 59986052.00 | 59986052 | cop[tikv] | table:lineitem | tikv_task:{proc max:65ms, min:0s, avg: 26.4ms, p80:37ms, p95:48ms, iters:64976, tasks:1605}, scan_detail: {total_process_keys: 59984036, total_process_keys_size: 11762679171, total_keys: 59985640, get_snapshot_time: 42.5ms, rocksdb: {key_skipped_count: 59984036, block: {cache_hit_count: 375717}}}, time_detail: {total_process_time: 43.6s, total_suspend_time: 93.5ms, total_wait_time: 94.8ms, total_kv_read_wall_time: 42.3s, tikv_wall_time: 44s} | keep order:false                                                                                               | N/A      | N/A     |
| └─TableReader_11(Probe)     | 16824768.00 | 15000000 | root      |                | time:335ms, open:74.5µs, close:15.7µs, loops:14667, cop_task: {num: 379, max: 69.2ms, min: 205.2µs, avg: 29.9ms, p95: 50.8ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 10.9s, tot_wait: 17.7ms, copr_cache_hit_ratio: 0.06, build_task_duration: 28.7µs, max_distsql_concurrency: 10}, rpc_info:{Cop:{num_rpc:379, total_time:11.3s}}                                                                                                            | data:TableFullScan_10                                                                                          | 3.07 MB  | N/A     |
|   └─TableFullScan_10        | 16824768.00 | 15000000 | cop[tikv] | table:orders   | tikv_task:{proc max:66ms, min:0s, avg: 27.4ms, p80:38ms, p95:47ms, iters:16165, tasks:379}, scan_detail: {total_process_keys: 14915776, total_process_keys_size: 2264947451, total_keys: 14916131, get_snapshot_time: 6.93ms, rocksdb: {key_skipped_count: 14915776, block: {cache_hit_count: 77142}}}, time_detail: {total_process_time: 10.9s, total_suspend_time: 20.7ms, total_wait_time: 17.7ms, total_kv_read_wall_time: 10.3s, tikv_wall_time: 11s}    | keep order:false                                                                                               | N/A      | N/A     |
+-----------------------------+-------------+----------+-----------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------+----------+---------+
5 rows in set (21.27 sec)

Problem Summary:

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 5, 2025
tiprow bot commented Feb 5, 2025

Hi @xzhangxian1008. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 15.38462% with 55 lines in your changes missing coverage. Please review.

Project coverage is 73.4542%. Comparing base (c292ec6) to head (8c90bd5).
Report is 23 commits behind head on master.

Current head 8c90bd5 differs from pull request most recent head 2133fe6

Please upload reports for the commit 2133fe6 to get more accurate results.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #59255        +/-   ##
================================================
+ Coverage   73.0409%   73.4542%   +0.4133%     
================================================
  Files          1689       1689                
  Lines        467012     467435       +423     
================================================
+ Hits         341110     343351      +2241     
+ Misses       104920     103106      -1814     
+ Partials      20982      20978         -4     
Flag         Coverage Δ
integration  42.8007% <10.7692%> (?)
unit         72.2699% <15.3846%> (+0.0399%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components  Coverage Δ
dumpling    52.6910% <ø> (ø)
parser      ∅ <ø> (∅)
br          45.4746% <ø> (+0.0315%) ⬆️

@xzhangxian1008
Contributor Author

/cc @windtalker @yibin87

@ti-chi-bot ti-chi-bot bot requested review from windtalker and yibin87 February 6, 2025 02:49
@yibin87
Contributor

yibin87 commented Feb 7, 2025

Please follow the common key:value pattern and use underscore-separated (snake_case) key names; most execution info also follows a JSON-like format.

}

round := e.spillHelper.round
for len(e.stats.spill.totalSpillBytesPerRound) < round+1 {
Contributor

Using a for loop here looks a little strange, since the following code only updates the last round's info.

Contributor Author

OK, I will replace it with an if.

@@ -202,6 +225,17 @@ func (e *hashJoinRuntimeStatsV2) String() string {
}
buf.WriteString("}")
}
if e.spill.round > 0 {
Contributor

Is it possible that e.spill.round is very large? If so, I don't think all the data should be printed. Instead, some aggregated info may be more readable.

Contributor Author

It will not be very large.

"github.com/pingcap/tidb/pkg/util/execdetails"
)

func convertBytesStatsToString(bytes []int64) string {
info := "["
Contributor

Better to use bytes.NewBuffer instead of raw string concatenation here.

Contributor Author

done

@xzhangxian1008
Contributor Author

Please follow the common key:value pattern and use underscore-separated (snake_case) key names; most execution info also follows a JSON-like format.

done

"github.com/pingcap/tidb/pkg/util/execdetails"
)

func writeBytesStatsToString(buf *bytes.Buffer, convertedBytes []int64) string {
Contributor

There seems to be no need to return a string here.

Contributor Author

deleted

}

round := e.spillHelper.round
if len(e.stats.spill.totalSpillBytesPerRound) < round+1 {
Contributor

Is it possible that length + N < round here? If so, a for loop makes sense.

Contributor Author

It's impossible.

Contributor

@yibin87 yibin87 left a comment

LGTM

ti-chi-bot bot commented Feb 10, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yibin87
Once this PR has been reviewed and has the lgtm label, please assign time-and-fate for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Feb 10, 2025
ti-chi-bot bot commented Feb 10, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-02-10 01:28:40.026168991 +0000 UTC m=+233562.422391053: ☑️ agreed by yibin87.

ti-chi-bot bot commented Feb 10, 2025

@xzhangxian1008: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                        Commit   Details  Required  Rerun command
idc-jenkins-ci-tidb/check_dev    2133fe6  link     true      /test check-dev
idc-jenkins-ci-tidb/unit-test    2133fe6  link     true      /test unit-test
idc-jenkins-ci-tidb/check_dev_2  2133fe6  link     true      /test check-dev2

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add more explain analyze stats for hash join spill
2 participants