persist-txn: compaction #22619

danhhz · 2023-10-24T15:18:25Z

Compaction of data shards is initially delegated to the txns user (the storage controller). Because txn writes intentionally never read data shards and in no way depend on the sinces, the since of a data shard is free to be arbitrarily far ahead of or behind the txns upper. Data shard reads, when run through the above process, then follow the usual rules (can read at times beyond the since but not beyond the upper).

Compaction of the txns shard relies on the following invariant that is carefully maintained: every write less than the since of the txns shard has been applied. Mechanically, this is accomplished by a critical since capability held internally by the txns system. Any txn writer is free to advance it to a time once it has proven that all writes before that time have been applied.

It is advantageous to compact the txns shard aggressively so that applied writes are promptly consolidated out, minimizing its size. For a snapshot read at `as_of`, we need to be able to tell whether the latest write `<= as_of` has been applied. The above invariant enables this as follows:

- If `as_of <= txns_shard.since()`, then the invariant guarantees that all writes `<= as_of` have been applied, so we're free to read as described in the section above.
- Otherwise, we haven't compacted `as_of` in the txns shard yet, and still have perfect information about which writes happened when. We can look at the data shard upper to determine which have been applied.
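
The two cases above can be sketched as a small decision function. This is only an illustration, not the persist-txn implementation: `ReadPlan`, `plan_snapshot_read`, and the use of `u64` timestamps are hypothetical, and the rule that a write at `ts` counts as applied once the data shard upper is greater than `ts` is an assumption about how "look at the data shard upper" would be checked.

```rust
/// What a snapshot read at `as_of` has to do before reading the data shard.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum ReadPlan {
    /// Nothing is pending at or before `as_of`; read the data shard directly.
    ReadDirectly,
    /// The write at this timestamp must be applied before the read proceeds.
    ApplyFirst(u64),
}

/// `latest_write_le_as_of` is the latest txn write to this data shard at a
/// time `<= as_of`, as recorded in the txns shard. It is only meaningful when
/// `as_of` has not yet been compacted away (the second case).
fn plan_snapshot_read(
    as_of: u64,
    txns_since: u64,
    data_upper: u64,
    latest_write_le_as_of: Option<u64>,
) -> ReadPlan {
    // Case 1: by the invariant, every write `<= as_of` has been applied.
    if as_of <= txns_since {
        return ReadPlan::ReadDirectly;
    }
    // Case 2: the txns shard still has exact write times, so compare the
    // latest write against the data shard upper.
    // Assumption: a write at `ts` is applied once `data_upper > ts`.
    match latest_write_le_as_of {
        Some(ts) if data_upper <= ts => ReadPlan::ApplyFirst(ts),
        _ => ReadPlan::ReadDirectly,
    }
}
```

For example, with a txns shard since of 10, a read at `as_of = 20` whose latest write is at 18 would need that write applied first if the data shard upper is still 15, but could read directly once the upper has advanced to 19.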

Touches MaterializeInc/database-issues#6685

Motivation

This PR adds a known-desirable feature.

jkosh44 · 2023-10-27T16:58:21Z

src/persist-txn/src/txns.rs

+        self.txns_cache.update_gt(&since_ts).await;
+        self.txns_cache.compact_to(&since_ts);


Does it matter that we do this before determining the actual min_unapplied_ts? For example, if a since_ts of 100 is passed in but we then reduce it to 50, the in-memory cache will be compacted to 100, while the physical txns shard will only be compacted to 50.

Yup, that's actually exactly what the "It's always safe" comment above is trying to say is okay, so this is a good hint that I should reword it somehow
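
The scenario in this exchange can be made concrete with a small model. It is a sketch under stated assumptions, not the code in `txns.rs`: `TxnsSketch` and its fields are hypothetical stand-ins, timestamps are plain `u64`s, and the list of unapplied writes is assumed to be known up front rather than discovered from the cache.

```rust
/// Hypothetical stand-in for the txns handle; not the real persist-txn types.
struct TxnsSketch {
    /// Timestamps of writes that are committed but not yet applied.
    unapplied: Vec<u64>,
    /// Since of the in-memory cache.
    cache_since: u64,
    /// Since of the physical txns shard.
    shard_since: u64,
}

impl TxnsSketch {
    fn compact_to(&mut self, since_ts: u64) {
        // The cache is compacted to the requested time right away, which the
        // thread above describes as always safe.
        self.cache_since = self.cache_since.max(since_ts);

        // The shard since must not pass the earliest unapplied write, so that
        // every write less than the shard since has been applied.
        let shard_target = match self.unapplied.iter().copied().min() {
            Some(min_unapplied_ts) => since_ts.min(min_unapplied_ts),
            None => since_ts,
        };
        self.shard_since = self.shard_since.max(shard_target);
    }
}
```

With unapplied writes at 50 and 120, a call to `compact_to(100)` leaves the cache since at 100 and the shard since at 50, which is the divergence the question describes.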

jkosh44 · 2023-10-27T17:10:57Z

src/persist-txn/src/operator.rs

You probably want a second pair of eyes on this that are more familiar with the code.

Ha, it's just you and aljoscha, so you might have to be it. None of the timely/dataflow bits are changing in this PR, just debugging, the unblock_read change, and various compaction related things. Happy to walk you through it on a huddle though (maybe Monday?)

I just went through it; it seems pretty straightforward and uncontroversial.

jkosh44

LGTM

danhhz

TFTR!

danhhz · 2023-10-27T19:25:00Z

src/persist-txn/src/txns.rs

+        self.txns_cache.update_gt(&since_ts).await;
+        self.txns_cache.compact_to(&since_ts);


danhhz · 2023-10-27T19:25:11Z

src/persist-txn/src/txns.rs

+    /// Allows compaction to the txns shard as well as internal representations,
+    /// losing the ability to answer queries about times less_than since_ts.
+    ///
+    /// only call this from the singleton controller process


Whoops, fixed this fragment

danhhz · 2023-10-27T19:34:10Z

Whoops, pulled in something from another branch. Fixing

danhhz force-pushed the persist_txn_compaction branch 3 times, most recently from e217486 to ebcedf0 October 26, 2023 17:29

danhhz marked this pull request as ready for review October 26, 2023 17:29

danhhz requested a review from a team as a code owner October 26, 2023 17:29

danhhz requested a review from jkosh44 October 26, 2023 17:29

jkosh44 reviewed Oct 27, 2023

jkosh44 approved these changes Oct 27, 2023

danhhz commented Oct 27, 2023

danhhz force-pushed the persist_txn_compaction branch from ebcedf0 to 343b96e October 27, 2023 19:25

danhhz enabled auto-merge October 27, 2023 19:27

danhhz disabled auto-merge October 27, 2023 19:33

danhhz force-pushed the persist_txn_compaction branch from 343b96e to d381993 October 27, 2023 19:36

danhhz enabled auto-merge October 27, 2023 19:37

danhhz merged commit 0cba4cf into MaterializeInc:main Oct 27, 2023
72 checks passed

danhhz deleted the persist_txn_compaction branch October 27, 2023 20:24
