feat(stream): normalize unmatched updates #20350

Merged: 8 commits, Feb 3, 2025


kwannoel (Contributor) commented Jan 31, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Closes: #20342

The bug is that stream_key and storage_key can mismatch for an MV, and can result in UpdateDelete + UpdateInsert pairs being split up.

dev=> explain create materialized view m1 as select * from t order by v2, v3;
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 StreamMaterialize { columns: [v1, v2, v3, v4, v5], stream_key: [v1], pk_columns: [v2, v3, v1], pk_conflict: NoCheck }
 └─StreamTableScan { table: t, columns: [v1, v2, v3, v4, v5] }

Details:

  1. If upstream updates some record, it will be updated on some stream_key.
  2. This propagates to the backfilling MV.
  3. On receiving a barrier, we flush the buffered upstream chunks. Here we have logic to remove rows which have storage_key > current_pos, where current_pos is the storage_key we have backfilled up to.
  4. This means that an UpdateDelete + UpdateInsert pair can be split up: while their stream keys are the same, their storage keys might differ, so one is pruned but not the other. An example is when upstream is an MV with ORDER BY on non-pk columns.
  5. Subsequently, the hanging UpdateDelete gets propagated to the dispatcher, which has an optimization to remove redundant UpdateDelete and UpdateInsert pairs. We may lose an op entry in this code branch. Then op.len() != col.len(), and we panic when reconstructing the chunk.
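
To make steps 3 and 4 concrete, here is a minimal sketch of how pruning by storage position can split a pair. The `Op`, `Row`, and `prune_by_position` names are hypothetical and only mirror the wording above; they are not RisingWave's actual types or functions, and keys are reduced to plain integers.

```rust
/// Simplified stream ops. Hypothetical stand-in, not RisingWave's real `Op` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Insert,
    Delete,
    UpdateDelete,
    UpdateInsert,
}

/// Hypothetical row: `stream_key` identifies the record,
/// `storage_key` is its position in the storage (order) key space.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Row {
    pub stream_key: i64,
    pub storage_key: i64,
}

/// Keep only rows whose storage key has already been backfilled,
/// i.e. drop rows with `storage_key > current_pos`.
pub fn prune_by_position(chunk: &[(Op, Row)], current_pos: i64) -> Vec<(Op, Row)> {
    chunk
        .iter()
        .filter(|(_, row)| row.storage_key <= current_pos)
        .cloned()
        .collect()
}
```

With `current_pos = 10`, an update that moves a record from `storage_key = 5` to `storage_key = 20` keeps its UpdateDelete half and drops its UpdateInsert half, even though both halves share the same `stream_key`.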

This bug is partly worked around by https://github.com/risingwavelabs/risingwave/pull/20176/files, since the dispatcher rewrites U+/U- pairs going to different vnodes into +/-, and dist_key is set to the order key. This PR nevertheless handles it defensively in backfill as well.

For an unmatched update delete, i.e. storage_key <= current_pos, we can rewrite it to a normal delete, because the next snapshot read should read the new update insert.
For an unmatched update insert, we have not yet backfilled the corresponding update delete key from storage, so inserting it cannot conflict. We can rewrite it to a normal insert.

We also reconstruct the ops lazily, since in most cases it should not be necessary, only when there's a hanging update.
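
The rewrite rules and the lazy reconstruction can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `Op` and `normalize_unmatched_updates` are hypothetical names, and a pair is treated as matched only when an UpdateDelete is immediately followed by an UpdateInsert.

```rust
use std::borrow::Cow;

/// Simplified stream ops. Hypothetical stand-in, not RisingWave's real `Op` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Insert,
    Delete,
    UpdateDelete,
    UpdateInsert,
}

/// Rewrite hanging update halves into plain ops:
/// unmatched `UpdateDelete` -> `Delete`, unmatched `UpdateInsert` -> `Insert`.
/// The ops are copied only if at least one rewrite is needed; otherwise
/// the input slice is returned borrowed.
pub fn normalize_unmatched_updates(ops: &[Op]) -> Cow<'_, [Op]> {
    // Allocated lazily, on the first hanging update we find.
    let mut rewritten: Option<Vec<Op>> = None;
    let mut i = 0;
    while i < ops.len() {
        match ops[i] {
            // Intact pair: skip both halves untouched.
            Op::UpdateDelete if ops.get(i + 1) == Some(&Op::UpdateInsert) => {
                i += 2;
                continue;
            }
            // Hanging delete half: its insert half was pruned.
            Op::UpdateDelete => {
                rewritten.get_or_insert_with(|| ops.to_vec())[i] = Op::Delete;
            }
            // Hanging insert half: its delete half was pruned.
            Op::UpdateInsert => {
                rewritten.get_or_insert_with(|| ops.to_vec())[i] = Op::Insert;
            }
            Op::Insert | Op::Delete => {}
        }
        i += 1;
    }
    match rewritten {
        Some(v) => Cow::Owned(v),
        None => Cow::Borrowed(ops),
    }
}
```

Returning a `Cow` keeps the common case (no hanging updates) allocation-free, which matches the intent of only reconstructing the ops when necessary.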

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note

@kwannoel kwannoel marked this pull request as ready for review January 31, 2025 04:29
@BugenZhao BugenZhao requested a review from stdrc January 31, 2025 06:58
BugenZhao (Member) commented Jan 31, 2025

> 1. If upstream updates some record, it will be updated on some stream_key.

This reminds me (again) of the discussion in #12539. In my opinion, if maintaining the U- and U+ invariants proves more challenging than beneficial, should we consider only retaining - and + in the system (instead of patching or rewriting everywhere)?

chenzl25 (Contributor) left a comment

The fix LGTM, thanks!

kwannoel (Contributor, Author) commented

> 1. If upstream updates some record, it will be updated on some stream_key.
>
> This reminds me (again) of the discussion in #12539. In my opinion, if maintaining the U- and U+ invariants proves more challenging than beneficial, should we consider only retaining - and + in the system (instead of patching or rewriting everywhere)?

Generally agree. We may lose some traceability, but we will gain increased performance and code simplicity. We can now encode ops in a bitmap.
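
As a rough illustration of the bitmap idea: if only + and - remain, each op fits in one bit. The sketch below is hypothetical (`Op` and `OpBitmap` are made-up names, not an existing RisingWave structure) and assumes a set bit means Insert and a clear bit means Delete.

```rust
/// With U-/U+ removed, only two ops remain (hypothetical simplified type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Insert,
    Delete,
}

/// One bit per op: 1 = Insert, 0 = Delete (an assumed encoding).
#[derive(Debug, Default)]
pub struct OpBitmap {
    words: Vec<u64>,
    len: usize,
}

impl OpBitmap {
    /// Pack a slice of ops into 64-bit words.
    pub fn from_ops(ops: &[Op]) -> Self {
        let mut words = vec![0u64; (ops.len() + 63) / 64];
        for (i, op) in ops.iter().enumerate() {
            if *op == Op::Insert {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Self { words, len: ops.len() }
    }

    /// Read back the op at index `i`, or `None` if out of range.
    pub fn get(&self, i: usize) -> Option<Op> {
        if i >= self.len {
            return None;
        }
        let bit = (self.words[i / 64] >> (i % 64)) & 1;
        Some(if bit == 1 { Op::Insert } else { Op::Delete })
    }
}
```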

@kwannoel kwannoel force-pushed the kwannoel/fix-backfill-updates branch from 71d02b8 to 043f697 Compare February 3, 2025 06:25
kwannoel (Contributor, Author) commented Feb 3, 2025

This stack of pull requests is managed by Graphite.

@kwannoel kwannoel added this pull request to the merge queue Feb 3, 2025
Merged via the queue into main with commit 67ee5a5 Feb 3, 2025
29 checks passed
@kwannoel kwannoel deleted the kwannoel/fix-backfill-updates branch February 3, 2025 07:57