Skip to content

metrics: expose IA remote-read scan details#69606

Open
Connor1996 wants to merge 1 commit into
pingcap:masterfrom
Connor1996:codex/tidb-ia-remote-read-observability
Open

metrics: expose IA remote-read scan details#69606
Connor1996 wants to merge 1 commit into
pingcap:masterfrom
Connor1996:codex/tidb-ia-remote-read-observability

Conversation

@Connor1996

@Connor1996 Connor1996 commented Jul 2, 2026

Copy link
Copy Markdown
Member

What problem does this PR solve?

Issue Number: ref #69624

Problem Summary:

TiDB does not currently expose IA remote-read scan detail fields in its own observability surfaces, so operators cannot inspect remote segment count, bytes, or wait time through slow logs, statement summary, or Prometheus metrics.

What changed and how does it work?

Adds TiDB observability plumbing for IA remote-read scan details:

  • slow log fields for IA remote segment count, bytes, and wait time
  • INFORMATION_SCHEMA.SLOW_QUERY and statement summary columns
  • Prometheus metrics and metric-schema entries
  • statement summary aggregation for average/max IA remote-read values

This intentionally reads the new client-go util.ScanDetail fields directly instead of using reflection compatibility code.

This PR depends on tikv/client-go#2015 so TiDB can bump github.com/tikv/client-go/v2 to a version containing IaRemoteReadSegment* fields. The PR intentionally does not commit a temporary fork or local replace.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Expose IA remote-read segment count, bytes, and wait time in slow logs, statement summary, `INFORMATION_SCHEMA.SLOW_QUERY`, and Prometheus metrics.

Summary by CodeRabbit

  • New Features

    • Added new IA remote read metrics to slow query output, statement summaries, and system metrics views.
    • Expanded query and statement summary data to include segment count, segment size, and wait time.
  • Bug Fixes

    • Updated parsed output and column alignment so the new metrics appear in the correct positions.
    • Ensured metric values are recorded and displayed consistently across summaries, logs, and monitoring.

@ti-chi-bot

ti-chi-bot Bot commented Jul 2, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jul 2, 2026
@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds IA (remote) read segment metrics—count, size, and wait time—end-to-end: a new extraction helper in execdetails, Prometheus counters/histogram, slow query log fields and columns, and statement summary (v1/v2) aggregation, along with accompanying schema updates and tests.

Changes

IA Remote Read Segment Metrics

Layer / File(s) Summary
Core stats extraction helper
pkg/util/execdetails/execdetails.go, pkg/util/execdetails/execdetails_test.go
Adds IARemoteReadSegmentStats struct, field-name constants, and GetIARemoteReadSegmentStats helper extracting count/bytes/wait time from util.ScanDetail, with unit test coverage.
Prometheus metrics and connection reporting
pkg/metrics/server.go, pkg/metrics/metrics.go, pkg/infoschema/metric_table_def.go, pkg/server/conn.go
Declares and registers new counters/histogram for IA remote read segments, adds metric table map entries, and wires addQueryMetrics to emit these metrics using the stats helper.
Slow query log integration
pkg/sessionctx/variable/slow_log.go, pkg/sessionctx/variable/tests/session_test.go, pkg/executor/slow_query.go, pkg/executor/slow_query_test.go, pkg/infoschema/tables.go (slowQueryCols), pkg/infoschema/test/clustertablestest/tables_test.go
Writes IA remote read segment fields to slow logs when non-zero, parses them back as uint/float columns, extends the slow query schema, and updates fixtures/tests for the shifted columns.
Statement summary v1 aggregation
pkg/util/stmtsummary/statement_summary.go, pkg/util/stmtsummary/evicted.go, pkg/util/stmtsummary/reader.go, pkg/infoschema/tables.go (statementsSummaryCols), pkg/util/stmtsummary/statement_summary_test.go, pkg/util/stmtsummary/BUILD.bazel
Adds sum/max IA fields to stmtSummaryStats, aggregates and merges them in evicted summaries, exposes new average/max reader columns, and extends the statement summary schema with tests.
Statement summary v2 aggregation
pkg/util/stmtsummary/v2/record.go, pkg/util/stmtsummary/v2/column.go, pkg/util/stmtsummary/v2/column_test.go, pkg/util/stmtsummary/v2/BUILD.bazel
Adds IA sum/max fields to StmtRecord, updates Add/Merge logic, exposes matching v2 column constants/factories, and adds round-trip tests.

Estimated code review effort: 3 (Moderate) | ~30 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Conn as conn.go addQueryMetrics
  participant ExecDetails as execdetails.GetIARemoteReadSegmentStats
  participant Metrics as IARemoteReadSegmentCount/Size/WaitDuration

  Conn->>ExecDetails: extract stats from ScanDetail
  ExecDetails-->>Conn: IARemoteReadSegmentStats
  Conn->>Metrics: observe count/size/wait duration
Loading

Suggested reviewers: yudongusa, gengliqi, cfzjywxk

Poem

Segments hop, remote and near,
Counts and bytes I gather here,
Wait times ticked in histogram beats,
Slow logs sing of data feats,
A rabbit's metrics, sharp and clear! 🐇📊

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title is concise and matches the main change, though it emphasizes metrics more than the full observability scope.
Description check ✅ Passed The description follows the template well, with issue number, problem summary, change summary, checklist, and release note present.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@ti-chi-bot

ti-chi-bot Bot commented Jul 2, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign gmhdbjd, xuhuaiyu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jul 2, 2026
@Connor1996 Connor1996 force-pushed the codex/tidb-ia-remote-read-observability branch from cbda601 to 53462ff Compare July 3, 2026 00:06
@ti-chi-bot ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 3, 2026
@Connor1996 Connor1996 force-pushed the codex/tidb-ia-remote-read-observability branch from 53462ff to 9506fba Compare July 3, 2026 00:09
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996 Connor1996 force-pushed the codex/tidb-ia-remote-read-observability branch from 9506fba to d136bb6 Compare July 3, 2026 00:49
@Connor1996 Connor1996 marked this pull request as ready for review July 3, 2026 00:49
@ti-chi-bot ti-chi-bot Bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-triage-completed and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/needs-linked-issue do-not-merge/needs-tests-checked do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jul 3, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jul 3, 2026

Copy link
Copy Markdown

@Connor1996: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-lightning-integration-test d136bb6 link true /test pull-lightning-integration-test
idc-jenkins-ci-tidb/mysql-test d136bb6 link true /test mysql-test
idc-jenkins-ci-tidb/build d136bb6 link true /test build
idc-jenkins-ci-tidb/check_dev_2 d136bb6 link true /test check-dev2
pull-integration-e2e-test d136bb6 link true /test pull-integration-e2e-test
pull-br-integration-test d136bb6 link true /test pull-br-integration-test
idc-jenkins-ci-tidb/check_dev d136bb6 link true /test check-dev
pull-mysql-client-test d136bb6 link true /test pull-mysql-client-test
pull-mysql-client-test-next-gen d136bb6 link true /test pull-mysql-client-test-next-gen
pull-integration-realcluster-test-next-gen d136bb6 link true /test pull-integration-realcluster-test-next-gen
pull-build-next-gen d136bb6 link true /test pull-build-next-gen
pull-unit-test-next-gen d136bb6 link true /test pull-unit-test-next-gen
idc-jenkins-ci-tidb/unit-test d136bb6 link true /test unit-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/util/execdetails/execdetails.go`:
- Around line 213-231: The new IA remote-read fields are not available in the
current client-go version, so the added accessor in GetIARemoteReadSegmentStats
will not compile yet. Hold this change until the github.com/tikv/client-go/v2
bump lands, or guard/remove the references to
util.ScanDetail.IaRemoteReadSegmentCount, IaRemoteReadSegmentBytes, and
IaRemoteReadSegmentDuration so this package and its tests only use fields that
exist in the current dependency.

In `@pkg/util/stmtsummary/statement_summary.go`:
- Around line 202-207: The field naming is inconsistent across the statement
summary stats, where sumIAReadSegmentCount/maxIAReadSegmentCount omit “Remote”
while the related fields and upstream source use IARemoteReadSegment/
IaRemoteReadSegmentCount. Rename the struct fields in statement_summary.go and
update all matching references in reader.go and tables.go/v2 so the exported SQL
column becomes AVG_IA_REMOTE_READ_SEGMENT_COUNT (and related names stay
aligned). Use the existing sibling symbols like sumIARemoteReadSegmentSize,
maxIARemoteReadSegmentWaitTime, and GetIARemoteReadSegmentStats as the naming
reference when updating the public INFORMATION_SCHEMA.STATEMENTS_SUMMARY
columns.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9766269a-88a6-49f6-bd9b-cf863d6f288b

📥 Commits

Reviewing files that changed from the base of the PR and between b212b93 and d136bb6.

📒 Files selected for processing (21)
  • pkg/executor/slow_query.go
  • pkg/executor/slow_query_test.go
  • pkg/infoschema/metric_table_def.go
  • pkg/infoschema/tables.go
  • pkg/infoschema/test/clustertablestest/tables_test.go
  • pkg/metrics/metrics.go
  • pkg/metrics/server.go
  • pkg/server/conn.go
  • pkg/sessionctx/variable/slow_log.go
  • pkg/sessionctx/variable/tests/session_test.go
  • pkg/util/execdetails/execdetails.go
  • pkg/util/execdetails/execdetails_test.go
  • pkg/util/stmtsummary/BUILD.bazel
  • pkg/util/stmtsummary/evicted.go
  • pkg/util/stmtsummary/reader.go
  • pkg/util/stmtsummary/statement_summary.go
  • pkg/util/stmtsummary/statement_summary_test.go
  • pkg/util/stmtsummary/v2/BUILD.bazel
  • pkg/util/stmtsummary/v2/column.go
  • pkg/util/stmtsummary/v2/column_test.go
  • pkg/util/stmtsummary/v2/record.go

Comment on lines +213 to +231
// IARemoteReadSegmentStats contains IA remote-read scan statistics.
type IARemoteReadSegmentStats struct {
Count uint64
Bytes uint64
WaitTime time.Duration
}

// GetIARemoteReadSegmentStats reads IA remote-read scan statistics from client-go ScanDetail.
func GetIARemoteReadSegmentStats(scanDetail *util.ScanDetail) IARemoteReadSegmentStats {
if scanDetail == nil {
return IARemoteReadSegmentStats{}
}
return IARemoteReadSegmentStats{
Count: scanDetail.IaRemoteReadSegmentCount,
Bytes: scanDetail.IaRemoteReadSegmentBytes,
WaitTime: scanDetail.IaRemoteReadSegmentDuration,
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== go.mod client-go dependency ==\n'
grep -nE 'tikv/client-go|client-go' go.mod || true

printf '\n== execdetails.go relevant lines ==\n'
sed -n '180,260p' pkg/util/execdetails/execdetails.go

printf '\n== ScanDetail field references in repo ==\n'
rg -n 'IaRemoteReadSegment|IARemoteReadSegment' -S .

printf '\n== client-go module cache probe (if available) ==\n'
go env GOPATH GOMODCACHE 2>/dev/null || true

Repository: pingcap/tidb

Length of output: 21160


🏁 Script executed:

#!/bin/bash
set -euo pipefail

GOMODCACHE="$(go env GOMODCACHE)"
printf 'GOMODCACHE=%s\n' "$GOMODCACHE"

printf '\n== client-go module versions ==\n'
fd -a 'client-go' "$GOMODCACHE/github.com/tikv" 2>/dev/null || true

printf '\n== ScanDetail definition ==\n'
SCANDETAIL_FILE="$(rg -l -n 'type ScanDetail struct' "$GOMODCACHE/github.com/tikv/client-go" 2>/dev/null | head -n 1 || true)"
printf 'file=%s\n' "${SCANDETAIL_FILE:-<not found>}"
if [ -n "${SCANDETAIL_FILE:-}" ]; then
  sed -n '1,260p' "$SCANDETAIL_FILE"
fi

printf '\n== IaRemoteReadSegment field references in client-go ==\n'
rg -n 'IaRemoteReadSegment(Count|Bytes|Duration)' "$GOMODCACHE/github.com/tikv/client-go" 2>/dev/null || true

Repository: pingcap/tidb

Length of output: 9764


🏁 Script executed:

#!/bin/bash
set -euo pipefail

GOMODCACHE="$(go env GOMODCACHE)"
CLIENT_GO_DIR="$GOMODCACHE/github.com/tikv/client-go/v2@v2.0.8-0.20260617030124-661db4f5f4e8"

printf '== ScanDetail struct location ==\n'
rg -n 'type ScanDetail struct' "$CLIENT_GO_DIR" -S

printf '\n== ScanDetail struct body ==\n'
SCAN_FILE="$(rg -l 'type ScanDetail struct' "$CLIENT_GO_DIR" | head -n 1)"
sed -n '/type ScanDetail struct/,/^}/p' "$SCAN_FILE"

printf '\n== IaRemoteReadSegment fields in ScanDetail body ==\n'
sed -n '/type ScanDetail struct/,/^}/p' "$SCAN_FILE" | rg -n 'IaRemoteReadSegment|IARemoteReadSegment' -n || true

Repository: pingcap/tidb

Length of output: 1852


Wait for the client-go bump before merging this change.
util.ScanDetail in the current github.com/tikv/client-go/v2 dependency still has no IaRemoteReadSegmentCount, IaRemoteReadSegmentBytes, or IaRemoteReadSegmentDuration fields, so this code and its tests will not compile until that dependency update lands.

🧰 Tools
🪛 golangci-lint (2.12.2)

[error] 226-226: : # github.com/pingcap/tidb/pkg/util/execdetails [github.com/pingcap/tidb/pkg/util/execdetails.test]
pkg/util/execdetails/execdetails.go:226:24: scanDetail.IaRemoteReadSegmentCount undefined (type *"github.com/tikv/client-go/v2/util".ScanDetail has no field or method IaRemoteReadSegmentCount)
pkg/util/execdetails/execdetails.go:227:24: scanDetail.IaRemoteReadSegmentBytes undefined (type *"github.com/tikv/client-go/v2/util".ScanDetail has no field or method IaRemoteReadSegmentBytes)
pkg/util/execdetails/execdetails.go:228:24: scanDetail.IaRemoteReadSegmentDuration undefined (type *"github.com/tikv/client-go/v2/util".ScanDetail has no field or method IaRemoteReadSegmentDuration)
pkg/util/execdetails/execdetails_test.go:1194:3: unknown field IaRemoteReadSegmentCount in struct literal of type "github.com/tikv/client-go/v2/util".ScanDetail
pkg/util/execdetails/execdetails_test.go:1195:3: unknown field IaRemoteReadSegmentBytes in struct literal of type "github.com/tikv/client-go/v2/util".ScanDetail
pkg/util/execdetails/execdetails_test.go:1196:3: unknown field IaRemoteReadSegmentDuration in struct literal of type "github.com/tikv/client-go/v2/util".ScanDetail

(typecheck)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/util/execdetails/execdetails.go` around lines 213 - 231, The new IA
remote-read fields are not available in the current client-go version, so the
added accessor in GetIARemoteReadSegmentStats will not compile yet. Hold this
change until the github.com/tikv/client-go/v2 bump lands, or guard/remove the
references to util.ScanDetail.IaRemoteReadSegmentCount,
IaRemoteReadSegmentBytes, and IaRemoteReadSegmentDuration so this package and
its tests only use fields that exist in the current dependency.

Source: Linters/SAST tools

Comment on lines +202 to +207
sumIAReadSegmentCount uint64
maxIAReadSegmentCount uint64
sumIARemoteReadSegmentSize uint64
maxIARemoteReadSegmentSize uint64
sumIARemoteReadSegmentWaitTime time.Duration
maxIARemoteReadSegmentWaitTime time.Duration

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Inconsistent naming: "IAReadSegment" vs "IARemoteReadSegment".

sumIAReadSegmentCount/maxIAReadSegmentCount drop the "Remote" that appears in the sibling fields (sumIARemoteReadSegmentSize, sumIARemoteReadSegmentWaitTime) and in the upstream source field itself (scanDetail.IaRemoteReadSegmentCount, per GetIARemoteReadSegmentStats in execdetails.go). This inconsistency propagates to the exported, permanent SQL column name in reader.go (AVG_IA_READ_SEGMENT_COUNT vs AVG_IA_REMOTE_READ_SEGMENT_SIZE/_WAIT_TIME). Since these become public INFORMATION_SCHEMA.STATEMENTS_SUMMARY column names, it's worth fixing the naming now before it ships as a permanent (and inconsistent) public API surface.

✏️ Suggested rename
-	sumIAReadSegmentCount          uint64
-	maxIAReadSegmentCount          uint64
+	sumIARemoteReadSegmentCount    uint64
+	maxIARemoteReadSegmentCount    uint64
This would require corresponding renames of the exported constants/columns in reader.go (`AVG_IA_READ_SEGMENT_COUNT` → `AVG_IA_REMOTE_READ_SEGMENT_COUNT`) and in tables.go/v2 columns before release.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sumIAReadSegmentCount uint64
maxIAReadSegmentCount uint64
sumIARemoteReadSegmentSize uint64
maxIARemoteReadSegmentSize uint64
sumIARemoteReadSegmentWaitTime time.Duration
maxIARemoteReadSegmentWaitTime time.Duration
sumIARemoteReadSegmentCount uint64
maxIARemoteReadSegmentCount uint64
sumIARemoteReadSegmentSize uint64
maxIARemoteReadSegmentSize uint64
sumIARemoteReadSegmentWaitTime time.Duration
maxIARemoteReadSegmentWaitTime time.Duration
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/util/stmtsummary/statement_summary.go` around lines 202 - 207, The field
naming is inconsistent across the statement summary stats, where
sumIAReadSegmentCount/maxIAReadSegmentCount omit “Remote” while the related
fields and upstream source use IARemoteReadSegment/ IaRemoteReadSegmentCount.
Rename the struct fields in statement_summary.go and update all matching
references in reader.go and tables.go/v2 so the exported SQL column becomes
AVG_IA_REMOTE_READ_SEGMENT_COUNT (and related names stay aligned). Use the
existing sibling symbols like sumIARemoteReadSegmentSize,
maxIARemoteReadSegmentWaitTime, and GetIARemoteReadSegmentStats as the naming
reference when updating the public INFORMATION_SCHEMA.STATEMENTS_SUMMARY
columns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/needs-triage-completed release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant