processor: fix a bug that will cause processor Tick get stuck when downstream is Kafka #11339

Merged on Jul 4, 2024 (14 commits)

@asddongmen asddongmen commented Jun 24, 2024

What problem does this PR solve?

Issue Number: close #11340

What is changed and how it works?

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)

Deploy a CDC cluster and create two changefeeds that replicate the same table, one to Kafka and one to TiDB.

Every 20 minutes, inject a network isolation fault:

  • First, isolate the network between CDC and Kafka for 10 minutes.
  • Then, 20 minutes later, isolate the network between CDC and the downstream TiDB for 10 minutes.

Repeat this process six times.

Before this PR, network isolation between CDC and Kafka also increased the lag of the MySQL sink changefeed, not only that of the Kafka sink changefeed:
[screenshot: changefeed lag before this PR]

After this PR, only the Kafka sink changefeed's lag increases while network isolation is injected:
[screenshot: changefeed lag after this PR]

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Fix a bug that causes the processor tick to get stuck when the downstream is Kafka and becomes unreachable.

Signed-off-by: dongmen <[email protected]>
Contributor

ti-chi-bot bot commented Jun 24, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added labels on Jun 24, 2024:
  • do-not-merge/needs-linked-issue
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
@asddongmen asddongmen changed the title from "sink: add logs to help debug" to "processor: fix a bug that will cause processor Tick stuck" on Jun 25, 2024
@asddongmen asddongmen changed the title from "processor: fix a bug that will cause processor Tick stuck" to "processor: fix a bug that will cause processor Tick stuck when downstream is MQ" on Jun 25, 2024
@asddongmen asddongmen changed the title from "processor: fix a bug that will cause processor Tick stuck when downstream is MQ" to "processor: fix a bug that will cause processor Tick get stuck when downstream is Kafka" on Jun 25, 2024
@asddongmen
Contributor Author

/test all

Signed-off-by: dongmen <[email protected]>
@asddongmen asddongmen marked this pull request as ready for review June 25, 2024 02:27
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Jun 25, 2024

codecov bot commented Jun 25, 2024

Codecov Report

Attention: Patch coverage is 0% with 15 lines in your changes missing coverage. Please review.

Project coverage is 57.5747%. Comparing base (04a7d6a) to head (54b828b).

Additional details and impacted files
Components   Coverage Δ
cdc          61.3431% <0.0000%> (-0.0342%) ⬇️
dm           51.1809% <ø> (-0.0585%) ⬇️
engine       63.3526% <ø> (+0.0070%) ⬆️

Flag         Coverage Δ
unit         57.5747% <0.0000%> (-0.0401%) ⬇️

Flags with carried forward coverage won't be shown.

@@               Coverage Diff                @@
##             master     #11339        +/-   ##
================================================
- Coverage   57.6147%   57.5747%   -0.0401%     
================================================
  Files           849        849                
  Lines        126294     126262        -32     
================================================
- Hits          72764      72695        -69     
- Misses        48121      48155        +34     
- Partials       5409       5412         +3     

@ti-chi-bot ti-chi-bot bot changed labels on Jun 25, 2024:
  • added affect-ticdc-config-docs: Pull requests that affect TiCDC configuration docs.
  • added size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
  • removed size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
Signed-off-by: dongmen <[email protected]>
@asddongmen asddongmen marked this pull request as draft June 25, 2024 03:26
@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Jun 25, 2024
Review thread on pkg/config/sink.go (outdated, resolved)
@asddongmen
Contributor Author

/retest

1 similar comment
@3AceShowHand
Contributor

/retest

Contributor

ti-chi-bot bot commented Jun 27, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 3AceShowHand, hicqu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

ti-chi-bot bot commented Jun 27, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-06-25 06:40:09.196671537 +0000 UTC m=+701735.682160364: ☑️ agreed by hicqu.
  • 2024-06-27 08:09:51.558875886 +0000 UTC m=+879918.044364718: ☑️ agreed by 3AceShowHand.

@asddongmen
Contributor Author

/retest

1 similar comment
@CharlesCheung96
Contributor

/retest

@ti-chi-bot ti-chi-bot bot merged commit 695f932 into pingcap:master Jul 4, 2024
28 checks passed
@asddongmen asddongmen added labels on Jul 4, 2024:
  • needs-cherry-pick-release-6.5: Should cherry pick this PR to release-6.5 branch.
  • needs-cherry-pick-release-7.1: Should cherry pick this PR to release-7.1 branch.
  • needs-cherry-pick-release-7.5: Should cherry pick this PR to release-7.5 branch.
  • needs-cherry-pick-release-8.1: Should cherry pick this PR to release-8.1 branch.
ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jul 4, 2024
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #11387.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-7.5: #11388.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #11389.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jul 4, 2024
ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jul 4, 2024
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.1: #11390.

Labels
  • affect-ticdc-config-docs: Pull requests that affect TiCDC configuration docs.
  • approved
  • lgtm
  • needs-cherry-pick-release-6.5: Should cherry pick this PR to release-6.5 branch.
  • needs-cherry-pick-release-7.1: Should cherry pick this PR to release-7.1 branch.
  • needs-cherry-pick-release-7.5: Should cherry pick this PR to release-7.5 branch.
  • needs-cherry-pick-release-8.1: Should cherry pick this PR to release-8.1 branch.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

etcd worker tick stuck up to 2 minutes periodically when cdc can't connect to Kafka server
5 participants