catalog: Simplify shard ID generation #30597

Merged: 1 commit merged on Nov 25, 2024

Conversation

jkosh44
Contributor

@jkosh44 jkosh44 commented Nov 22, 2024

The catalog has functionality to deterministically generate a random shard ID for a given environment. It is used when we need a shard before the catalog is fully opened. It is strictly worse than generating a totally random shard ID because we limit the amount of randomness.

The catalog also has an un-migratable collection called "settings" which maps string keys to string values. This collection is accessible immediately after reading in a snapshot of the catalog, before we finish opening the catalog. Any shard ID used after reading in the snapshot can be generated fully randomly and stored in the settings collection.

Two shards fit into this category:

  • The builtin migration shard.
  • The expression cache shard.

This commit adds a migration to move both of those shard IDs to the settings collection for existing environments. New environments will generate the shard IDs randomly and stash them in the settings collection.
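
For illustration only, here is a minimal sketch of the "look up in settings, otherwise generate and stash" pattern described above. It is not the code in this PR: the `Settings` type alias, the function names, the key strings used in the usage note, and the `s` plus 32 hex digits ID format are all assumptions made for the example, and the standard-library `RandomState` trick stands in for whatever random source the catalog actually uses.

```rust
use std::collections::hash_map::RandomState;
use std::collections::BTreeMap;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical stand-in for the catalog's "settings" collection,
/// which maps string keys to string values.
type Settings = BTreeMap<String, String>;

/// Returns 64 pseudo-random bits using only the standard library.
/// Assumption: a real implementation would use a proper RNG or UUID.
fn random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Generates a random shard ID string. The format is illustrative only.
fn random_shard_id() -> String {
    format!("s{:016x}{:016x}", random_u64(), random_u64())
}

/// Returns the shard ID stored under `key`. If none is stored yet:
///   - existing environments pass their previously used ID as `legacy_id`,
///     and it is moved into settings unchanged (the migration case);
///   - new environments pass `None`, and a random ID is generated and stored.
fn shard_id_from_settings(
    settings: &mut Settings,
    key: &str,
    legacy_id: Option<&str>,
) -> String {
    settings
        .entry(key.to_string())
        .or_insert_with(|| match legacy_id {
            Some(id) => id.to_string(),
            None => random_shard_id(),
        })
        .clone()
}
```

Calling `shard_id_from_settings(&mut settings, "expression_cache_shard", None)` twice returns the same ID both times, because the first call stores the generated value and the second call reads it back. The key name here is hypothetical.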

Motivation

This PR adds a feature that has not yet been specified.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@jkosh44 jkosh44 force-pushed the aux-shards branch 4 times, most recently from 3d6b161 to be99cc1 Compare November 22, 2024 16:22
@jkosh44 jkosh44 marked this pull request as ready for review November 22, 2024 17:24
@jkosh44 jkosh44 requested a review from a team as a code owner November 22, 2024 17:24
Member

@ParkMyCar ParkMyCar left a comment

LGTM

@jkosh44 jkosh44 merged commit 2141dfd into MaterializeInc:main Nov 25, 2024
221 of 223 checks passed
@jkosh44 jkosh44 deleted the aux-shards branch November 25, 2024 14:31