
My TiCDC task was blocked for more than 24 hours, exceeding gc-ttl, because of "remove operator because region disappeared" #11267

Open
iceran opened this issue Jun 6, 2024 · 4 comments
Labels
area/ticdc Issues or PRs related to TiCDC. type/bug This is a bug.

Comments


iceran commented Jun 6, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

@iceran iceran added the type/bug This is a bug. label Jun 6, 2024

iceran commented Jun 6, 2024

The PD warning log and the TiCDC warning log are below (PD first, then TiCDC):
[2024/06/05 08:26:12.329 +08:00] [INFO] [operator_controller.go:556] ["operator finish"] [region-id=5592114] [takes=1.120616751s] [operator=""merge-region {merge: region 5477336 to 5592114} (kind:merge,region,leader, region:5592114(21566, 9659), createAt:2024-06-05 08:26:11.20829815 +0800 CST m=+20716827.197696760, startAt:2024-06-05 08:26:11.208665301 +0800 CST m=+20716827.198063995, currentStep:1, size:94, steps:[0:{merge region 5477336 into region 5592114}],timeout:[58m0s]) finished""] [additional-info=]
[2024/06/05 08:26:12.786 +08:00] [WARN] [operator_controller.go:219] ["remove operator because region disappeared"] [region-id=5477336] [operator="merge-region {merge: region 5477336 to 5592114} (kind:merge,region,leader, region:5477336(21543, 9659), createAt:2024-06-05 08:26:11.208296233 +0800 CST m=+20716827.197694844, startAt:2024-06-05 08:26:11.208563946 +0800 CST m=+20716827.197962758, currentStep:9, size:19, steps:[0:{add learner peer 5604772 on store 1}, 1:{add learner peer 5604773 on store 3}, 2:{add learner peer 5604771 on store 2352143}, 3:{use joint consensus, promote learner peer 5604772 on store 1 to voter, promote learner peer 5604773 on store 3 to voter, promote learner peer 5604771 on store 2352143 to voter, demote voter peer 5604707 on store 2 to learner, demote voter peer 5604753 on store 4 to learner, demote voter peer 5604628 on store 6 to learner}, 4:{transfer leader from store 6 to store 1}, 5:{leave joint state, promote learner peer 5604772 on store 1 to voter, promote learner peer 5604773 on store 3 to voter, promote learner peer 5604771 on store 2352143 to voter, demote voter peer 5604707 on store 2 to learner, demote voter peer 5604753 on store 4 to learner, demote voter peer 5604628 on store 6 to learner}, 6:{remove peer on store 2}, 7:{remove peer on store 4}, 8:{remove peer on store 6}, 9:{merge region 5477336 into region 5592114}],timeout:[58m0s])"]
[2024/06/05 08:26:12.787 +08:00] [INFO] [operator_controller.go:594] ["operator canceled"] [region-id=5477336] [takes=1.578487395s] [operator=""merge-region {merge: region 5477336 to 5592114} (kind:merge,region,leader, region:5477336(21543, 9659), createAt:2024-06-05 08:26:11.208296233 +0800 CST m=+20716827.197694844, startAt:2024-06-05 08:26:11.208563946 +0800 CST m=+20716827.197962758, currentStep:9, size:19, steps:[0:{add learner peer 5604772 on store 1}, 1:{add learner peer 5604773 on store 3}, 2:{add learner peer 5604771 on store 2352143}, 3:{use joint consensus, promote learner peer 5604772 on store 1 to voter, promote learner peer 5604773 on store 3 to voter, promote learner peer 5604771 on store 2352143 to voter, demote voter peer 5604707 on store 2 to learner, demote voter peer 5604753 on store 4 to learner, demote voter peer 5604628 on store 6 to learner}, 4:{transfer leader from store 6 to store 1}, 5:{leave joint state, promote learner peer 5604772 on store 1 to voter, promote learner peer 5604773 on store 3 to voter, promote learner peer 5604771 on store 2352143 to voter, demote voter peer 5604707 on store 2 to learner, demote voter peer 5604753 on store 4 to learner, demote voter peer 5604628 on store 6 to learner}, 6:{remove peer on store 2}, 7:{remove peer on store 4}, 8:{remove peer on store 6}, 9:{merge region 5477336 into region 5592114}],timeout:[58m0s])""] [additional-info=]
[2024/06/05 08:26:13.400 +08:00] [INFO] [operator_controller.go:443] ["add operator"] [region-id=5604075] [operator=""move-hot-write-peer {mv peer: store [2352140] to [2]} (kind:hot-region,region,leader, region:5604075(19008, 6875), createAt:2024-06-05 08:26:13.400048836 +0800 CST m=+20716829.389447411, startAt:0001-01-01 00:00:00 +0000 UTC, currentStep:0, size:98, steps:[0:{add learner peer 5604774 on store 2}, 1:{transfer leader from store 2352140 to store 4}, 2:{use joint consensus, promote learner peer 5604774 on store 2 to voter, demote voter peer 5604649 on store 2352140 to learner}, 3:{leave joint state, promote learner peer 5604774 on store 2 to voter, demote voter peer 5604649 on store 2352140 to learner}, 4:{remove peer on store 2352140}],timeout:[18m0s])""] [additional-info=]
[2024/06/05 08:26:13.400 +08:00] [INFO] [operator_controller.go:642] ["send schedule command"] [region-id=5604075] [step="add learner peer 5604774 on store 2"] [source=create]

[2024/06/05 08:26:12.329 +08:00] [INFO] [client.go:560] ["region failed"] [span="[748000000000001fffa65f720162303132ff32646634ff333731ff6134343632ff6162ff326463306566ff66ff30343765313766ffff0000000000000000fff701544152474554ff0000fd0000000000fa, 748000000000001fffa65f720165306465ff37346362ff323236ff3134343436ff6165ff323964616563ff32ff61376664356264ffff0000000000000000fff701544152474554ff0000fd0000000000fa)"] [regionId=5477336] [error="[CDC:ErrEventFeedEventError]eventfeed returns event error: region_not_found:<region_id:5477336 > "] [errorVerbose="[CDC:ErrEventFeedEventError]eventfeed returns event error: region_not_found:<region_id:5477336 > \ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/[email protected]/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/[email protected]/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/errors.WrapError\n\tgithub.com/pingcap/tiflow/pkg/errors/helper.go:34\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).processEvent\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:377\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).eventHandler\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:506\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func4\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:595\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2024/06/05 08:26:12.329 +08:00] [WARN] [client.go:700] ["send request to stream failed"] [namespace=default] [changefeed=phs-dws-phs-us] [tableID=8102] [tableName=phs_dws.dws_phs_pub_hierarchy_rel] [addr=10.9.113.7:20160] [storeID=1] [regionID=5592114] [requestID=69484] [error=EOF]
[2024/06/05 08:26:12.330 +08:00] [INFO] [client.go:560] ["region failed"] [span="[748000000000001fffa65f720000000000fa, 748000000000001fffa65f720165306465ff37346362ff323236ff3134343436ff6165ff323964616563ff32ff61376664356264ffff0000000000000000fff701544152474554ff0000fd0000000000fa)"] [regionId=5592114] [error="send request to store error"]
[2024/06/05 08:26:12.330 +08:00] [INFO] [region_range_lock.go:256] ["try lock range staled"] [changefeed=default.phs-dws-phs-us] [lockID=144] [regionID=5592114] [startKey=748000000000001fffa65f720000000000fa] [endKey=748000000000001fffa65f720165306465ff37346362ff323236ff3134343436ff6165ff323964616563ff32ff61376664356264ffff0000000000000000fff701544152474554ff0000fd0000000000fa] [allOverlapping="["regionID: 5592114, ver: 21567, start: 748000000000001fffa65f720000000000fa, end: 748000000000001fffa65f720165306465ff37346362ff323236ff3134343436ff6165ff323964616563ff32ff61376664356264ffff0000000000000000fff701544152474554ff0000fd0000000000fa"]"]
[2024/06/05 08:26:12.330 +08:00] [INFO] [client.go:506] ["request expired"] [namespace=default] [changefeed=phs-dws-phs-us] [regionID=5592114] [span="[748000000000001fffa65f720000000000fa, 748000000000001fffa65f720165306465ff37346362ff323236ff3134343436ff6165ff323964616563ff32ff61376664356264ffff0000000000000000fff701544152474554ff0000fd0000000000fa)"] [resolvedTs=450244685660946474] [retrySpans="[]"]
[2024/06/05 08:26:12.330 +08:00] [INFO] [client.go:659] ["creating new stream to store to send request"] [namespace=default] [changefeed=phs-dws-phs-us] [regionID=5592114] [requestID=69485] [storeID=1] [addr=10.9.113.7:20160]
[2024/06/05 08:26:12.348 +08:00] [INFO] [client.go:982] ["stream to store closed"] [namespace=default] [changefeed=phs-dws-phs-us] [addr=10.9.113.9:20160] [storeID=6]
[2024/06/05 08:26:12.382 +08:00] [INFO] [client.go:982] ["stream to store closed"] [namespace=default] [changefeed=phs-dws-phs-us] [addr=10.9.113.7:20160] [storeID=1]
[2024/06/05 08:26:12.404 +08:00] [INFO] [client.go:982] ["stream to store closed"] [namespace=default] [changefeed=phs-dws-phs-us] [addr=10.9.113.7:20160] [storeID=1]
[2024/06/05 08:26:12.408 +08:00] [INFO] [region_worker.go:421] ["region worker closed by error"] [namespace=default] [changefeed=phs-dws-phs-us]
[2024/06/05 08:26:12.408 +08:00] [INFO] [client.go:560] ["region failed"] [span="[748000000000001fffa65f720000000000fa, 748000000000001fffa65f720165306465ff37346362ff323236ff3134343436ff6165ff323964616563ff32ff61376664356264ffff0000000000000000fff701544152474554ff0000fd0000000000fa)"] [regionId=5592114] [error="[CDC:ErrEventFeedAborted]single event feed aborted"]
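The `resolvedTs=450244685660946474` value in the "request expired" line above is a TSO, which packs a physical timestamp in milliseconds into the high bits and an 18-bit logical counter into the low bits. The following is a minimal sketch, not part of the original report, for decoding such a value and comparing the resulting lag against gc-ttl. The helper names (`split_tso`, `tso_to_datetime`, `lag_exceeds_gc_ttl`) are hypothetical, and the 86400-second default is an assumption based on the documented TiCDC `gc-ttl` default; adjust it if the cluster is configured differently.

```python
from datetime import datetime, timedelta, timezone

LOGICAL_BITS = 18  # the low 18 bits of a TSO hold the logical counter


def split_tso(ts):
    """Split a TSO into (physical milliseconds since the Unix epoch, logical counter)."""
    return ts >> LOGICAL_BITS, ts & ((1 << LOGICAL_BITS) - 1)


def tso_to_datetime(ts, tz=timezone.utc):
    """Convert the physical part of a TSO to a timezone-aware datetime."""
    physical_ms, _ = split_tso(ts)
    # Integer arithmetic keeps the millisecond part exact.
    seconds, millis = divmod(physical_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=tz) + timedelta(milliseconds=millis)


def lag_exceeds_gc_ttl(ts, now, gc_ttl_seconds=86400):
    """Return True if `now` is more than gc_ttl_seconds past the TSO's wall-clock time.

    gc_ttl_seconds=86400 is an assumed default, not a value taken from this report.
    `now` must be a timezone-aware datetime.
    """
    return (now - tso_to_datetime(ts)).total_seconds() > gc_ttl_seconds


resolved_ts = 450244685660946474  # copied from the "request expired" log line
cst = timezone(timedelta(hours=8))  # the logs use +08:00
print(split_tso(resolved_ts))
print(tso_to_datetime(resolved_ts, cst).isoformat())
```

For the value in the log this decodes to 2024-06-05 08:26:11.253 +08:00, about one second before the "request expired" entry. Running the same conversion on a changefeed's checkpoint shows how far it has fallen behind relative to gc-ttl.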


iceran commented Jun 6, 2024

What puzzles me is why TiCDC did not recover on its own.

@lance6716 lance6716 transferred this issue from pingcap/tidb Jun 6, 2024
@lance6716
Contributor

Hi, TiCDC is maintained in https://github.com/pingcap/tiflow. I have moved your issue there.

@jebter jebter added the area/ticdc Issues or PRs related to TiCDC. label Jun 14, 2024
@github-actions github-actions bot added this to Need Triage in Question and Bug Reports Jun 14, 2024

fubinzh commented Jun 17, 2024

@iceran, could you please provide more information, following the bug report template?
