(fix): cdc watermark updater main #22822
base: main
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #22718
What this PR does / why we need it:
cdc watermark updater main
PR Type
Bug fix, Enhancement
Description
Add watermark stall detection to CDC table streams with configurable thresholds
Track snapshot progress and emit warnings when watermark fails to advance
Implement retryable error handling when stall threshold exceeded
Add new metrics for monitoring snapshot stalls and table activity
Fix watermark updater to handle missing watermarks gracefully
File Walkthrough
table_change_stream.go
Implement watermark stall detection framework
pkg/cdc/table_change_stream.go
TableChangeStream struct
TableChangeStreamOption pattern with configurable thresholds for stall detection and warning intervals
handleSnapshotNoProgress() to detect and report snapshot timestamp stalls with throttled warnings
resetWatermarkStallState() to clear stall tracking when progress resumes
onWatermarkAdvanced() callback to update metrics and reset stall state on successful watermark advancement
processWithTxn() workflow
cdc_metrics.go
Add snapshot stall counter metric
pkg/util/metric/v2/cdc_metrics.go
CdcTableNoProgressCounter metric to track snapshot stall occurrences per table
initCDCMetrics()
table_change_stream_test.go
Add comprehensive stall detection tests
pkg/cdc/table_change_stream_test.go
readGaugeValue() and readCounterValue() for metric assertions
TestTableChangeStream_HandleSnapshotNoProgress_WarningAndReset to verify warning emission and metric reset
TestTableChangeStream_HandleSnapshotNoProgress_ThresholdExceeded to verify error on stall threshold breach
TestTableChangeStream_HandleSnapshotNoProgress_WarningThrottle to verify warning throttling behavior
TestTableChangeStream_HandleSnapshotNoProgress_Defaults to verify default configuration values
createTestStream() helper to accept optional configuration parameters
watermark_updater_test.go
Update watermark updater tests
pkg/cdc/watermark_updater_test.go
TestCDCWatermarkUpdater_UpdateWatermarkErrMsg to expect success instead of error
TestCDCWatermarkUpdater_RemoveThenUpdateErrMsg to verify graceful handling after watermark removal
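The behavior these tests expect, success rather than an error when the watermark is missing, could look roughly like the sketch below. It is an assumption-laden illustration and not the updater's actual code: `updater`, `lookup`, `updateErrMsg`, and `pendingReads` are hypothetical, `errNoWatermarkFound` is a local stand-in for the `ErrNoWatermarkFound` named in this PR, and queueing the key for a later read is only one possible way to handle the missing entry.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoWatermarkFound is a local stand-in for the sentinel named in the PR.
var errNoWatermarkFound = errors.New("no watermark found")

// updater is a toy in-memory model; all fields are assumptions.
type updater struct {
	watermarks   map[string]string // key -> cached watermark
	errMsgs      map[string]string // key -> last recorded error message
	pendingReads []string          // keys queued for a later read
}

// lookup returns the cached watermark or a wrapped errNoWatermarkFound.
func (u *updater) lookup(key string) (string, error) {
	wm, ok := u.watermarks[key]
	if !ok {
		return "", fmt.Errorf("lookup %q: %w", key, errNoWatermarkFound)
	}
	return wm, nil
}

// updateErrMsg records msg for key. A missing watermark is not a failure:
// the key is queued for a later read and nil is returned.
func (u *updater) updateErrMsg(key, msg string) error {
	_, err := u.lookup(key)
	switch {
	case err == nil:
		u.errMsgs[key] = msg
		return nil
	case errors.Is(err, errNoWatermarkFound):
		u.pendingReads = append(u.pendingReads, key)
		return nil
	default:
		return err
	}
}

func main() {
	u := &updater{
		watermarks: map[string]string{"t1": "100"},
		errMsgs:    map[string]string{},
	}
	fmt.Println(u.updateErrMsg("t1", "boom"), u.errMsgs)
	fmt.Println(u.updateErrMsg("gone", "boom"), u.pendingReads)
}
```

Using `errors.Is` rather than `==` keeps the check working when the sentinel is wrapped with extra context.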
watermark_updater.go
Fix watermark updater error handling
pkg/cdc/watermark_updater.go
onJobs() to gracefully handle ErrNoWatermarkFound in JT_CDC_UpdateWMErrMsg case readKeysBuffer instead of failing
execReadWM() to persist successfully read watermarks
errors package for error type checking
CDC_USER_GUIDE.md
Document snapshot stall detection feature
pkg/cdc/CDC_USER_GUIDE.md
behavior and metrics
interval)
mo_cdc_table_snapshot_no_progress_total to metrics reference table