ops catalog: thread through collection "version id" as part of inferred schemas #1969

psFried · 2025-02-25T16:31:02Z

Inferred schemas are materialized into the inferred_schemas table by our ops catalog. The table is keyed on the collection_name, which presents a problem when a collection is deleted and re-created. We can't tell whether the schema is from the prior version of the collection or the new one, and inferred schema updates from both versions will be merged into the same row.

The solution we'd like to implement is to thread the "version id" of the collection through as part of the inferred schemas, and make it part of the table's primary key. The version id is the build id at which the collection was created. It's part of the collection's journal names, as in the example acmeCo/foo/11223344aabbccdd/pivot=00. The desired outcome is for the inferred_schemas table to have collection_name = 'acmeCo/foo', version_id = '11223344aabbccdd'.
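To make the intended split concrete, here's a rough sketch of pulling the two key columns out of a journal name. It assumes the version id is always a 16-character lowercase hex segment (as in the example above) and that any trailing partition segments contain an "=", like pivot=00. The names parse_journal_name and CollectionVersion are made up for illustration and aren't existing functions in the codebase.

```python
import re
from typing import NamedTuple

# Assumption: the version id is a 16-character lowercase hex build id,
# matching the example journal name in this issue.
_VERSION_ID = re.compile(r"[0-9a-f]{16}")


class CollectionVersion(NamedTuple):
    collection_name: str
    version_id: str


def parse_journal_name(journal: str) -> CollectionVersion:
    """Split a journal name into (collection_name, version_id).

    Hypothetical helper: assumes the layout
    <collection_name>/<version_id>/<partition segments containing '='>.
    """
    segments = journal.split("/")
    # Drop trailing partition segments such as "pivot=00".
    while segments and "=" in segments[-1]:
        segments.pop()
    # What remains must end with the version id, preceded by the name.
    if len(segments) < 2 or not _VERSION_ID.fullmatch(segments[-1]):
        raise ValueError(f"no version id found in journal name: {journal!r}")
    return CollectionVersion("/".join(segments[:-1]), segments[-1])


parsed = parse_journal_name("acmeCo/foo/11223344aabbccdd/pivot=00")
print(parsed.collection_name, parsed.version_id)
```

With the example journal name, this yields collection_name 'acmeCo/foo' and version_id '11223344aabbccdd', which are the two values the inferred_schemas row would be keyed on.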

Publications will then only use the inferred schema matching the version_id for each collection in the draft.
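As a sketch of that lookup, assuming the table ends up keyed on (collection_name, version_id): the dict below is a stand-in for the inferred_schemas table, and schema_for_draft is a hypothetical name, not the actual publication code. The second version id is invented for the example.

```python
from typing import Optional

# Stand-in for the inferred_schemas table, keyed by
# (collection_name, version_id). The schema bodies are placeholders.
inferred_schemas: dict[tuple[str, str], dict] = {
    # Row left over from a prior, since-deleted version of the collection.
    ("acmeCo/foo", "11223344aabbccdd"): {"type": "object", "required": ["old_id"]},
    # Row for the re-created collection (version id is made up).
    ("acmeCo/foo", "99887766ffeeddcc"): {"type": "object", "required": ["id"]},
}


def schema_for_draft(
    rows: dict[tuple[str, str], dict],
    collection_name: str,
    version_id: str,
) -> Optional[dict]:
    """Return only the inferred schema whose version id matches the draft's.

    Rows from other versions of the same collection name are ignored,
    and None is returned when the current version has no inferred schema yet.
    """
    return rows.get((collection_name, version_id))


current = schema_for_draft(inferred_schemas, "acmeCo/foo", "99887766ffeeddcc")
missing = schema_for_draft(inferred_schemas, "acmeCo/foo", "0000000000000000")
```

The point is that a re-created collection never picks up a schema inferred from documents of its prior incarnation; a missing row would presumably be treated the same as having no inferred schema at all.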


jgraettinger · 2025-02-25T17:11:22Z

One nit: elsewhere when we use version, especially within connector protocols, it means the current build ID of the task / spec.

What you're describing is the lifecycle ID or evolution ID (name TBD) of the collection.
I guess we should pick a name and run with it, huh 😆

psFried added the control-plane, data-plane, and enhance labels Feb 25, 2025

psFried self-assigned this Feb 25, 2025

psFried mentioned this issue Mar 5, 2025

Phil/all natural ops catalog enhancement #1962

Merged
