[experiment] Port reftable from Git C implementation and integrate as backend #2452
Draft
8f1b751 to
38bb0ad
Why now

The goal is to land the reftable port as a standalone crate with strong parity coverage before any backend integration churn.

What changed

This squashed commit contains all standalone `gix-reftable` work that was previously split across 9 commits:

- workspace wiring for a dedicated `gix-reftable` crate
- low-level primitives (constants, varint, hash-kind, errors)
- record model and encode/decode for ref/log/obj/index records
- block source and single-table reader
- merged-table iterators with pq/tree helpers
- table writer with limits/index emission and options
- stack transactions, reload, auto-compaction, and fsck
- upstream-inspired `u-reftable-*` parity unit tests
- selected `t0610`/`t0613`/`t0614` scenario parity tests

Why this order

This commit is a squash of the previously reviewed sequence where each layer built on the previous one (primitives -> records -> io -> merged iteration -> writer -> stack -> tests).

What it unlocks next

A clean standalone reftable library baseline that can be integrated later into `gix-ref`/`gix` in follow-up work.

Prompt (verbatim)

Look at the reftable implementation at /Users/byron/dev/github.com/git/git and port it over to Rust in its own `gix-reftable` crate. Be sure to capture specific tests that exist.

Follow through with the entire plan. Do not stop until it's all done. After each step, make a commit with a meaningful message and motivation. Show how the commit relates to the previous commit, and at least hint at how it's going to be relevant in future commits.

PLEASE IMPLEMENT THIS PLAN:

# Commit-By-Commit Execution Plan: Reftable Port + Integration

## Summary

Implement the full reftable port in `gix-reftable`, integrate it as a real backend in `gix-ref`/`gix`, and land parity tests in small, reviewable commits. Each commit is intentionally chained: it stabilizes one layer, then unlocks the next.

## Commit Sequence

1. **`workspace: add gix-reftable crate skeleton and wire it into Cargo workspace`**
   Motivation: create the isolated crate boundary first so all subsequent work lands incrementally.
   Relates to previous: baseline/no-op starting point.
   Future relevance: all reftable code/tests depend on this crate existing.
2. **`gix-reftable: port basics/constants/error/varint primitives from git/reftable`**
   Motivation: establish byte-order, varint, hash-id, and error semantics shared by all modules.
   Relates to previous: fills in core primitives in the new crate.
   Future relevance: record/block/table/writer code will reuse these primitives directly.
3. **`gix-reftable: implement record model and encode/decode parity (ref/log/obj/index)`**
   Motivation: record correctness is the format contract; everything else composes it.
   Relates to previous: consumes primitives and defines concrete wire payload behavior.
   Future relevance: block IO and iterators can now operate on typed records.
4. **`gix-reftable: implement block + blocksource + table reader`**
   Motivation: make reftable files readable end-to-end (header/sections/restarts/seek).
   Relates to previous: uses record codec to decode table contents.
   Future relevance: merged tables and stack logic need a working single-table reader.
5. **`gix-reftable: implement merged table iterators, pq, and tree helpers`**
   Motivation: parity for cross-table iteration and seek behavior.
   Relates to previous: builds on table reader to support multi-table views.
   Future relevance: stack and backend integration depend on merged iteration semantics.
6. **`gix-reftable: implement writer with limits/index emission/write options`**
   Motivation: enable producing valid tables and exercising write-path parity tests.
   Relates to previous: complements reader path using the same record/block contracts.
   Future relevance: stack transactions and compaction need writer callbacks.
7. **`gix-reftable: implement stack transactions, auto-compaction, reload, and fsck`**
   Motivation: complete operational backend behavior (`tables.list`, addition/commit, verify).
   Relates to previous: stack orchestrates reader/writer modules already landed.
   Future relevance: this is the direct foundation for `gix-ref` backend adapter.
8. **`gix-reftable/tests: port upstream u-reftable-* unit suites with 1:1 case mapping`**
   Motivation: lock behavioral parity at the library level before integration churn.
   Relates to previous: validates all crate modules in isolation.
   Future relevance: reduces regression risk when wiring into `gix-ref` and `gix`.
9. **`gix-reftable/tests: add selected t0610/t0613/t0614 behavior parity integration tests`**
   Motivation: cover high-value shell behavior in Rust tests (transactions/options/fsck/worktree).
   Relates to previous: adds scenario-level confidence on top of unit parity.
   Future relevance: these tests protect future backend integration refactors.
10. **`gix-ref: activate backend-agnostic store abstraction (files + reftable state)`**
    Motivation: remove hard coupling to file-store without changing behavior yet.
    Relates to previous: prepares host crate interface for plugging in reftable.
    Future relevance: next commit injects real reftable-backed implementation.
11. **`gix-ref: add reftable-backed store adapter and route find/iter/transaction operations`**
    Motivation: make `gix-ref` actually operate on reftable repositories.
    Relates to previous: fills the new abstraction with a concrete second backend.
    Future relevance: `gix` can now switch backend based on repository configuration.
12. **`gix: switch RefStore to backend-capable store and detect extensions.refStorage=reftable`**
    Motivation: enable end-to-end opening and reading of reftable repos in top-level API.
    Relates to previous: consumes backend-capable `gix-ref` APIs.
    Future relevance: unlocks fixing existing tests that currently assert reftable unsupported.
13. **`gix: make reference iteration/peeling/fetch update paths backend-agnostic`**
    Motivation: remove residual file-only assumptions in critical flows.
    Relates to previous: completes runtime behavior for common operations.
    Future relevance: ensures future features (e.g., optimizations) won’t regress reftable path.
14. **`tests: update reftable open/head expectations and add cross-backend regression coverage`**
    Motivation: reflect new supported behavior and guard interoperability paths.
    Relates to previous: validates functional integration in `gix` public workflows.
    Future relevance: serves as long-term guardrail for both backends.
15. **`docs/status: document reftable support, sha256 boundary, and update crate-status`**
    Motivation: finalize user/developer-facing contract and current limitations.
    Relates to previous: documents the now-landed behavior.
    Future relevance: provides clear baseline for follow-up work (end-to-end SHA-256 in `gix`).

## Per-Commit Validation Rule

For each commit, run the smallest relevant test slice before committing, then run a broader slice periodically:

- crate-local unit tests for touched modules,
- `gix-reftable` parity suites,
- `gix-ref` targeted tests,
- `gix` targeted repository/reference tests.

## Commit Message Format Rule

Every commit body will include:

- **Why now** (motivation),
- **What changed** (scope),
- **Why this order** (relation to previous commit),
- **What it unlocks next** (future relevance).

## Assumptions

- Source parity target is Git’s in-tree reftable C implementation and tests.
- `gix-reftable` supports SHA-1 and SHA-256; `gix` integration remains SHA-1-only in this batch.
- No squashing: one commit per step as listed above.
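
Step 2 of the plan names varint primitives without showing them. As a rough illustration only (this is not the code in this PR), the reftable format documentation describes its varint as identical to the ofs-delta offset encoding used in pack files. The sketch below assumes that encoding; the function names `encode_varint` and `decode_varint` are hypothetical and are not taken from `gix-reftable`.

```rust
/// Encode `value` in the ofs-delta style varint: 7 payload bits per byte,
/// most significant group first, with the continuation bit (0x80) set on
/// every byte except the last. Each continuation group is stored minus one,
/// so every value has exactly one encoding.
pub fn encode_varint(mut value: u64) -> Vec<u8> {
    // A u64 needs at most 10 bytes in this scheme.
    let mut buf = [0u8; 10];
    let mut pos = buf.len() - 1;
    buf[pos] = (value & 0x7f) as u8;
    value >>= 7;
    while value != 0 {
        value -= 1;
        pos -= 1;
        buf[pos] = 0x80 | (value & 0x7f) as u8;
        value >>= 7;
    }
    buf[pos..].to_vec()
}

/// Decode a varint from the start of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or the value does not fit into a u64.
pub fn decode_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut byte = *input.first()?;
    let mut consumed = 1;
    let mut value = u64::from(byte & 0x7f);
    while byte & 0x80 != 0 {
        byte = *input.get(consumed)?;
        consumed += 1;
        // Undo the "minus one" applied by the encoder, then shift in 7 bits.
        value = value.checked_add(1)?.checked_mul(128)? | u64::from(byte & 0x7f);
    }
    Some((value, consumed))
}
```

Under these assumptions `encode_varint(128)` yields the two bytes `0x80 0x00`, and decoding any encoded value returns the original value together with the encoded length. The checked arithmetic is a design choice of this sketch so that malformed input is rejected instead of wrapping silently.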
38bb0ad to
94793bb
Summary
This draft now intentionally scopes to standalone `gix-reftable` only.
It ports Git's in-tree reftable implementation into the dedicated crate and includes crate-local parity/unit/integration tests, but does not include wiring into `gix-ref` or `gix` in this PR.

Included commit range now ends at:
- `33a91c9690` (`gix-reftable/tests: add selected t0610/t0613/t0614 behavior parity integration tests`)

Removed from this PR (deferred):
- `gix-ref` store changes
- `gix-ref` `gix` backend detection/routing changes
- `gix` test updates

Original 15-step plan (for traceability)
Commit-By-Commit Execution Plan: Reftable Port + Integration
Summary
Implement the full reftable port in `gix-reftable`, integrate it as a real backend in `gix-ref`/`gix`, and land parity tests in small, reviewable commits.
Each commit is intentionally chained: it stabilizes one layer, then unlocks the next.
Commit Sequence
1. **`workspace: add gix-reftable crate skeleton and wire it into Cargo workspace`**
   Motivation: create the isolated crate boundary first so all subsequent work lands incrementally.
   Relates to previous: baseline/no-op starting point.
   Future relevance: all reftable code/tests depend on this crate existing.
2. **`gix-reftable: port basics/constants/error/varint primitives from git/reftable`**
   Motivation: establish byte-order, varint, hash-id, and error semantics shared by all modules.
   Relates to previous: fills in core primitives in the new crate.
   Future relevance: record/block/table/writer code will reuse these primitives directly.
3. **`gix-reftable: implement record model and encode/decode parity (ref/log/obj/index)`**
   Motivation: record correctness is the format contract; everything else composes it.
   Relates to previous: consumes primitives and defines concrete wire payload behavior.
   Future relevance: block IO and iterators can now operate on typed records.
4. **`gix-reftable: implement block + blocksource + table reader`**
   Motivation: make reftable files readable end-to-end (header/sections/restarts/seek).
   Relates to previous: uses record codec to decode table contents.
   Future relevance: merged tables and stack logic need a working single-table reader.
5. **`gix-reftable: implement merged table iterators, pq, and tree helpers`**
   Motivation: parity for cross-table iteration and seek behavior.
   Relates to previous: builds on table reader to support multi-table views.
   Future relevance: stack and backend integration depend on merged iteration semantics.
6. **`gix-reftable: implement writer with limits/index emission/write options`**
   Motivation: enable producing valid tables and exercising write-path parity tests.
   Relates to previous: complements reader path using the same record/block contracts.
   Future relevance: stack transactions and compaction need writer callbacks.
7. **`gix-reftable: implement stack transactions, auto-compaction, reload, and fsck`**
   Motivation: complete operational backend behavior (`tables.list`, addition/commit, verify).
   Relates to previous: stack orchestrates reader/writer modules already landed.
   Future relevance: this is the direct foundation for `gix-ref` backend adapter.
8. **`gix-reftable/tests: port upstream u-reftable-* unit suites with 1:1 case mapping`**
   Motivation: lock behavioral parity at the library level before integration churn.
   Relates to previous: validates all crate modules in isolation.
   Future relevance: reduces regression risk when wiring into `gix-ref` and `gix`.
9. **`gix-reftable/tests: add selected t0610/t0613/t0614 behavior parity integration tests`**
   Motivation: cover high-value shell behavior in Rust tests (transactions/options/fsck/worktree).
   Relates to previous: adds scenario-level confidence on top of unit parity.
   Future relevance: these tests protect future backend integration refactors.
10. **`gix-ref: activate backend-agnostic store abstraction (files + reftable state)`**
    Motivation: remove hard coupling to file-store without changing behavior yet.
    Relates to previous: prepares host crate interface for plugging in reftable.
    Future relevance: next commit injects real reftable-backed implementation.
11. **`gix-ref: add reftable-backed store adapter and route find/iter/transaction operations`**
    Motivation: make `gix-ref` actually operate on reftable repositories.
    Relates to previous: fills the new abstraction with a concrete second backend.
    Future relevance: `gix` can now switch backend based on repository configuration.
12. **`gix: switch RefStore to backend-capable store and detect extensions.refStorage=reftable`**
    Motivation: enable end-to-end opening and reading of reftable repos in top-level API.
    Relates to previous: consumes backend-capable `gix-ref` APIs.
    Future relevance: unlocks fixing existing tests that currently assert reftable unsupported.
13. **`gix: make reference iteration/peeling/fetch update paths backend-agnostic`**
    Motivation: remove residual file-only assumptions in critical flows.
    Relates to previous: completes runtime behavior for common operations.
    Future relevance: ensures future features (e.g., optimizations) won’t regress reftable path.
14. **`tests: update reftable open/head expectations and add cross-backend regression coverage`**
    Motivation: reflect new supported behavior and guard interoperability paths.
    Relates to previous: validates functional integration in `gix` public workflows.
    Future relevance: serves as long-term guardrail for both backends.
15. **`docs/status: document reftable support, sha256 boundary, and update crate-status`**
    Motivation: finalize user/developer-facing contract and current limitations.
    Relates to previous: documents the now-landed behavior.
    Future relevance: provides clear baseline for follow-up work (end-to-end SHA-256 in `gix`).

Per-Commit Validation Rule
For each commit, run the smallest relevant test slice before committing, then run a broader slice periodically:
- crate-local unit tests for touched modules,
- `gix-reftable` parity suites,
- `gix-ref` targeted tests,
- `gix` targeted repository/reference tests.

Commit Message Format Rule
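Every commit body will include:
- **Why now** (motivation),
- **What changed** (scope),
- **Why this order** (relation to previous commit),
- **What it unlocks next** (future relevance).

Assumptions
- Source parity target is Git’s in-tree reftable C implementation and tests.
- `gix-reftable` supports SHA-1 and SHA-256; `gix` integration remains SHA-1-only in this batch.
- No squashing: one commit per step as listed above.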
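
The SHA-1/SHA-256 assumption shows up in the table header. As a hedged sketch (not the `gix-reftable` API), the following assumes the header layout from the reftable format documentation: the magic `REFT`, a one-byte version, a 24-bit block size, two 64-bit update indices, and, for version 2 only, a 4-byte hash id of `sha1` or `s256`. All type and function names here (`HashKind`, `Header`, `parse_header`) are hypothetical.

```rust
/// Hash function used by a table (names are illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HashKind {
    Sha1,
    Sha256,
}

impl HashKind {
    /// Size of one object id in bytes.
    pub fn len_in_bytes(self) -> usize {
        match self {
            HashKind::Sha1 => 20,
            HashKind::Sha256 => 32,
        }
    }
}

/// Parsed file header, assuming the documented layout.
#[derive(Debug, PartialEq, Eq)]
pub struct Header {
    pub version: u8,
    pub block_size: u32,
    pub min_update_index: u64,
    pub max_update_index: u64,
    pub hash_kind: HashKind,
}

/// Read a big-endian u64 from exactly 8 bytes.
fn be_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_be_bytes(buf)
}

/// Parse the header at the start of `data`.
/// Version 1 tables are implicitly SHA-1; version 2 tables carry a hash id.
pub fn parse_header(data: &[u8]) -> Result<Header, &'static str> {
    if data.len() < 24 {
        return Err("truncated header");
    }
    if !data.starts_with(b"REFT") {
        return Err("bad magic");
    }
    let version = data[4];
    // The block size is a 24-bit big-endian integer.
    let block_size = u32::from_be_bytes([0, data[5], data[6], data[7]]);
    let hash_kind = match version {
        1 => HashKind::Sha1,
        2 => match data.get(24..28).ok_or("truncated version 2 header")? {
            b"sha1" => HashKind::Sha1,
            b"s256" => HashKind::Sha256,
            _ => return Err("unknown hash id"),
        },
        _ => return Err("unsupported version"),
    };
    Ok(Header {
        version,
        block_size,
        min_update_index: be_u64(&data[8..16]),
        max_update_index: be_u64(&data[16..24]),
        hash_kind,
    })
}
```

A caller would use `hash_kind.len_in_bytes()` to size object ids when decoding ref records. A SHA-1-only integration layer, as described for `gix` in this batch, could reject `HashKind::Sha256` at this point.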