(fix): cdc watermark updater #22821
base: 3.0-dev
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #22718
What this PR does / why we need it:
cdc watermark updater
PR Type
Bug fix, Enhancement
Description
Add watermark stall detection with configurable thresholds
Track snapshot progress and emit warnings when watermark stops advancing
Implement automatic error recovery when stall threshold exceeded
Add metrics for monitoring table stream health and stall detection
File Walkthrough
table_change_stream.go (pkg/cdc/table_change_stream.go)
Add watermark stall detection and monitoring
TableChangeStream struct
TableChangeStreamOption pattern with configurable thresholds
handleSnapshotNoProgress() to detect and handle stalled snapshots
resetWatermarkStallState() and onWatermarkAdvanced() lifecycle methods
processWithTxn() before processing

cdc_metrics.go (pkg/util/metric/v2/cdc_metrics.go)
Add snapshot stall counter metric
CdcTableNoProgressCounter metric to track snapshot stall occurrences
initCDCMetrics() function

table_change_stream_test.go (pkg/cdc/table_change_stream_test.go)
Add comprehensive tests for stall detection
TestTableChangeStream_HandleSnapshotNoProgress_WarningAndReset for basic stall detection
TestTableChangeStream_HandleSnapshotNoProgress_ThresholdExceeded for error threshold
TestTableChangeStream_HandleSnapshotNoProgress_WarningThrottle for warning throttling
TestTableChangeStream_HandleSnapshotNoProgress_Defaults for default configuration
createTestStream() helper to accept options

watermark_updater_test.go (pkg/cdc/watermark_updater_test.go)
Update watermark updater tests
TestCDCWatermarkUpdater_UpdateWatermarkErrMsg to expect success instead of error
TestCDCWatermarkUpdater_RemoveThenUpdateErrMsg to verify error updates work after removal

watermark_updater.go (pkg/cdc/watermark_updater.go)
Fix watermark updater error handling
onJobs() to handle ErrNoWatermarkFound gracefully in JT_CDC_UpdateWMErrMsg case
readKeysBuffer when not found
execReadWM() to populate cache from readKeysBuffer before processing jobs

CDC_USER_GUIDE.md (pkg/cdc/CDC_USER_GUIDE.md)
Document snapshot stall detection feature
detection behavior
mo_cdc_table_snapshot_no_progress_total counter
gauges