Open
Labels
C-integrate-gitoxide: "Oxidize" crates even more by replacing git2 with gitoxide
Description
Here is the repository: https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z6cFWeWpnZNHh9rUW8phgA3b5yGt
Required Features
- Remove code-duplication in `heartwood` (see this comment for all the details on where to look; also check the server side of things for reference)
More Analysis TBD
Preferences
- crate `git-ref-format` - typesafe/statically analysed ref handling
- crate `radicle-git-ext` for better tools to work with commits
Notes
The repository contains a lot of extra-functionality that also isn't present in `git2`. This should probably be available from `gix` as long as it's not Radicle specific.
- Re-implement some basic features that were lacking in `git2` in `gix`
- Use `gix` where possible, slowly replacing `git2`
- Remove `git2` once there is nothing left.
Activity
FintanH commented on Mar 21, 2024
This is exciting!
Something I'd like to point out is that replacing `git2` functionality with `gix` wouldn't be my largest concern. While a lot of work, I imagine it would be straightforward.
The heaviest lifting and biggest concern for a first task is the fact that we're still using the `Delegate` approach for the fetch protocol. I did look at using the more modern approach that Gitoxide was using; however, iirc it didn't provide enough control for us to use the current staged approach to fetching from one remote to the other. If this is something that I could go into more detail about, please let me know.
Byron commented on Mar 21, 2024
Thank you for clarifying that. From my experience, using `gix` in the frontend has advantages in terms of compatibility at the very least, but I also hear that recently `libgit2` really ramped up its contributions so these issues might even go away in the mid-term.
And I'd definitely love to finally come up with a fetch-API that is easy to use but not unnecessarily limiting, and can thus work for you as well. If that was the case, I think you could start tracking a more recent version which might ultimately pay off.
If you would link the latest code that uses the `Delegate` here, I should be able to see what can or can't be done with higher-level APIs and fix these. Sharing what specifically prevented the adoption last time you tried would certainly be helpful to me as well.
Thanks again!
FintanH commented on Mar 26, 2024
Compatibility in which sense? :) In my mind, the `radicle-fetch` code is quite isolated and the only conversion points are OIDs and refnames, which are already in place.
Interesting. I can have a look again because, admittedly, it has been a while since I looked and I then got swept away by other tasks.
I should have taken notes when I explored trying to use the updated version, but unfortunately my foresight is not 20/20 😅
So the main entry points to using the gix fetch code are the following helpers:
- `handshake`
- `ls_refs`
- `fetch`
The `ls_refs` function calls into the `Delegate` found here -- a lot of the code is a modified version of the `gix` code. The same goes for `fetch` (here).
I think the best way to understand how we use the delegate approach is by describing the high-level flow. The important context is that there are two special `rad` references:
- `rad/id` -- this reference points to a repository's identity, which contains important information about the repository, in this case the delegates of the repository. A delegate is an entity that manages the repository and is a trusted peer in the context of the repository.
- `rad/sigrefs` -- each peer who contributes to the repository has a `rad/sigrefs`. It points to a signed payload of that peer's reference state: key/value pairs of references and SHAs, e.g. `refs/heads/main deadbeef`, `refs/cobs/xyz.radicle.patch/1234 beefdead`, etc. The pairing of the signature and payload allows the protocol to cryptographically verify a peer's set of refs and gives us the ground truth for that peer.
With this in mind, the protocol is essentially a series of stages where we fetch a set of this data and ensure some data validity. The state is kept in a type called `FetchState` and each stage is executed via its method `FetchState::run_stage`.
The stages are run in the following order:
- `rad/id` -- we fetch the `refs/rad/id` so that the delegates can be identified and their references are included in the fetch.
- `rad/sigrefs` -- we fetch this reference for each delegate and, depending on our configuration, each peer that we are following.
- `rad/sigrefs` -- we fetch the references that are listed in each peer's `rad/sigrefs`. Note that we have the SHAs so we can effectively calculate the wants/haves. Also, in some cases we also have the `rad/sigrefs` SHA since it can be announced as part of gossip, so we can ask for its SHA directly too.
All of this happens over a long-lasting connection, so we only perform the handshake step once. The handshake payload is passed down to each of the `ls_refs` and `fetch` calls.
The protocol finishes by signalling to the other end that it's done and it can cease sending upload-packs. The fetcher can then validate all the data it received, checking signatures and ensuring that the data is in a consistent state, and applies all the updates to the refdb if it's all consistent.
I hope that makes sense, but please let me know if I can elaborate on any points!
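Editorial sketch: to make the "we have the SHAs so we can calculate the wants" remark concrete, here is a minimal Rust example. It assumes the reference listing is plain text with one `<refname> <sha>` pair per line, as in the examples in the comment; the actual serialized format of `rad/sigrefs` may differ, and signature verification is left out entirely. The function names `parse_refs` and `wants` are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Parse lines of the form `<refname> <sha>` into a map.
/// Assumption: this line format is illustrative, not Radicle's real encoding.
fn parse_refs(payload: &str) -> Result<BTreeMap<String, String>, String> {
    let mut refs = BTreeMap::new();
    for (i, line) in payload.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        let (name, sha) = line
            .split_once(' ')
            .ok_or_else(|| format!("line {}: expected `<ref> <sha>`", i + 1))?;
        if sha.is_empty() || !sha.chars().all(|c| c.is_ascii_hexdigit()) {
            return Err(format!("line {}: `{}` is not a hex SHA", i + 1, sha));
        }
        refs.insert(name.to_string(), sha.to_string());
    }
    Ok(refs)
}

/// SHAs listed by the peer that we do not already have locally.
fn wants(listed: &BTreeMap<String, String>, haves: &BTreeSet<String>) -> BTreeSet<String> {
    listed
        .values()
        .filter(|sha| !haves.contains(*sha))
        .cloned()
        .collect()
}

fn main() {
    let payload = "refs/heads/main deadbeef\nrefs/cobs/xyz.radicle.patch/1234 beefdead\n";
    let listed = parse_refs(payload).expect("well-formed payload");
    let haves: BTreeSet<String> = ["deadbeef".to_string()].into_iter().collect();
    println!("wants: {:?}", wants(&listed, &haves));
}
```

Because the listing already carries the target SHAs, no extra ref advertisement is needed to decide what to request in the last stage.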
Byron commented on Mar 26, 2024
Thanks a lot for writing all this down! All this sounds familiar, particularly the multi-step process of downloading packs for different refs. If I remember correctly, the server-side is a plain git server over QUIC transport, which supports git protocol V2. That allows using a single connection for multiple requests/commands, which are tuned according to the needs, with each stage informing what the next stage can or should do.
My goal here would be to see how I can transform the code away from the Delegate approach to the new command-oriented API, and I think I could consider that successful once the test-suite passes again. This also means that `gix` probably isn't the right abstraction for now, as it's way too high-level, and I don't know if it should be able to support such a specialised case while plumbing exists.
Once successful, this should allow you to track the latest versions of these plumbing crates, and since I do it I would make the API adjustments necessary to support this case as well. This probably also means you don't have to recheck the `gix` level code of the fetch API. Even I don't like to look at it: it's so much and quite complex, and it's always troublesome to find anything 😅.
I probably won't get to it very soon, but it's on my list now and I will make it a priority once the last stage of `gix status` is implemented (HEAD->index diff).
FintanH commented on Mar 27, 2024
I should have also linked the `upload-pack` side of things. Here's the file. It reads and writes from a set of channels, which act as the intermediary for the QUIC connection, afair (@cloudhead might be able to say more to that).
Aye, that makes sense and resonates with my memory of trying to use `gix` way back :)
Hahaha thank you for saving me that time up-front 😄
Sounds good! Ping me if there's anything I can help with when you get around to it.
cloudhead commented on Apr 3, 2024
We're no longer using QUIC, we're using a custom framed protocol over TCP, though I don't think it's so relevant since the fetch code just works with generic writers.
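Editorial sketch: the reason the transport matters little is that Git's wire format can be written to anything implementing `std::io::Write`. The Rust below encodes a protocol V2 `ls-refs` request as pkt-lines into a generic writer. The function names are hypothetical and this is not the `heartwood` or `gix` code; the pkt-line framing (four hex digits of length including the prefix, `0000` flush, `0001` delimiter) and the `ref-prefix` argument follow the Git protocol documentation.

```rust
use std::io::{self, Write};

/// Write one pkt-line: a 4-hex-digit length (payload + 4) followed by the payload.
fn write_pkt_line<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    let len = payload.len() + 4;
    // Git limits a pkt-line to 65520 bytes in total.
    if len > 65520 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "pkt-line too long"));
    }
    write!(out, "{:04x}", len)?;
    out.write_all(payload)
}

/// Delimiter packet separating capabilities from command arguments.
fn write_delim<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"0001")
}

/// Flush packet marking the end of the request.
fn write_flush<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"0000")
}

/// Encode a protocol V2 `ls-refs` request limited to one ref prefix.
/// Capability lines (e.g. an agent string) are omitted for brevity.
fn write_ls_refs<W: Write>(out: &mut W, prefix: &str) -> io::Result<()> {
    write_pkt_line(out, b"command=ls-refs\n")?;
    write_delim(out)?;
    write_pkt_line(out, format!("ref-prefix {}\n", prefix).as_bytes())?;
    write_flush(out)
}

fn main() -> io::Result<()> {
    // A Vec<u8> stands in for the connection; a TcpStream or a custom
    // framed channel would work the same way since both implement Write.
    let mut wire: Vec<u8> = Vec::new();
    write_ls_refs(&mut wire, "refs/rad/id")?;
    println!("{}", String::from_utf8_lossy(&wire));
    Ok(())
}
```

Swapping QUIC for a framed TCP protocol then only changes which `Write` implementation is passed in, which matches the observation above.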
Byron commented on Apr 4, 2024
Byron commented on Oct 18, 2024
And as promised, I am on it now and would expect some progress every day from hereon. Here is my approach:
- […] `gix-protocol` tests and port them over to the new system. This should help me to warm up with it, given that you would want exactly the same.
- […] `heartwood` codebase. Since a lot of code, like you said, has been copied and altered, I hope that this transfers to the new delegate-free model. If not, it should be possible to add the knobs you need to the crate.
- […] `heartwood` repository, at least at first.
The only fear I am having is that the other end might not be a standard Git, so maybe some assumptions about the standard Git protocol don't hold. But if they do, and I'd think that Git protocol V2 is used over a persistent connection, then I'd think there will be no issues that can't be overcome.
I will keep writing here with updates.
FintanH commented on Oct 18, 2024
Amazing, I'd be interested in seeing what that looks like if you could point me to it while you're looking at it. Just so I have an understanding and can better judge your help when it comes to the `heartwood` repository!
That's great, one thing I'll note is that `radicle-dev/heartwood` is currently archived and is behind. @lorenzleutgeb has mentioned that he's keeping an up-to-date fork here https://github.com/lorenzleutgeb/heartwood. Otherwise, the user guide should help you get set up and I'm always happy to arrange a call to help get you into a Radicle workflow 😎
Iiuc, the "other end" is actually using `git-upload-pack` wired up to some custom `Read`/`Write` channels that do the p2p tunneling for us. So it is talking standard Git. Again, happy to do a call and run through any of the code base with you – I've written a lot of it myself and I'm sure it'll be fresh-ish in my memory :)
Thanks for helping out ❤️
Byron commented on Oct 18, 2024
After taking a first look I realized the reason for me not already removing `gix_protocol::fetch()` is that it would now have to be rewritten as some sort of state-machine, similar to what's done in `gix` but with all the delegated parts abstracted. Interestingly, if done right, it can be used in `gix` as well.
It does sound like something painful, too, but I think there is no way around it as `gix-protocol` ought to abstract and implement exactly this.
A more 'focussed' approach would be to look at `heartwood` first, which I think I will do tonight just to have a better understanding of what I am dealing with.
I will be very likely to take you up on this once I know what I'd want to do. For now, I think the work has to be done in `gitoxide` as it can use 'the final' cleanup :D - then you can see what it is and probably have ideas how that would (or would not) fit into `heartwood`. Then we can take it from there, including going all in on the `radicle` workflow :).
Took me long enough 😅.
`gix-protocol::fetch()` without Delegates #1634