Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(i): Handle collection commits over P2P #3247

Merged

Conversation

AndrewSisley
Copy link
Contributor

@AndrewSisley AndrewSisley commented Nov 15, 2024

Relevant issue(s)

Resolves #3212

Description

Handles the syncing of collection commits over P2P.

Note: The behaviour in TestQueryCommitsBranchables_HandlesConcurrentUpdatesAcrossPeerConnection is currently under discussion and may be changed or linked to #3245 and #3246 before this PR is merged. Discussed, and test updated.

Todo

  • I forgot to update the wait for sync test action and it looks like some of the new tests are flaking in the CI because of this

@AndrewSisley AndrewSisley added feature New feature or request area/p2p Related to the p2p networking system labels Nov 15, 2024
@AndrewSisley AndrewSisley added this to the DefraDB v0.15 milestone Nov 15, 2024
@AndrewSisley AndrewSisley requested a review from a team November 15, 2024 17:08
@AndrewSisley AndrewSisley self-assigned this Nov 15, 2024
@AndrewSisley AndrewSisley force-pushed the 3212-p2p-brnchable-cols branch from 8a603cd to 473b681 Compare November 15, 2024 17:09
Copy link

codecov bot commented Nov 15, 2024

Codecov Report

Attention: Patch coverage is 91.79104% with 11 lines in your changes missing coverage. Please review.

Project coverage is 77.58%. Comparing base (8e24ee4) to head (8a9f166).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
net/server.go 74.19% 6 Missing and 2 partials ⚠️
internal/db/merge.go 95.71% 2 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3247      +/-   ##
===========================================
+ Coverage    77.54%   77.58%   +0.04%     
===========================================
  Files          382      382              
  Lines        35263    35301      +38     
===========================================
+ Hits         27344    27386      +42     
+ Misses        6291     6290       -1     
+ Partials      1628     1625       -3     
Flag Coverage Δ
all-tests 77.58% <91.79%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
internal/db/messages.go 100.00% <100.00%> (ø)
net/peer.go 76.35% <100.00%> (+0.24%) ⬆️
internal/db/merge.go 82.32% <95.71%> (+1.40%) ⬆️
net/server.go 73.26% <74.19%> (-0.02%) ⬇️

... and 22 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8e24ee4...8a9f166. Read the comment docs.

---- 🚨 Try these New Features:

Copy link
Member

@nasdf nasdf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! The implementation is simple considering the complexity of the feature.

mp, err := db.newMergeProcessor(txn, col, dsKey)
if err != nil {
return err
var mt mergeTarget
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: saves a few lines

var key keys.HeadstoreKey
if dagMerge.DocID != "" {
  key = keys.HeadstoreDocKey{...}
} else {
  key = keys.NewHeadstoreColKey(...)
}

Copy link
Contributor Author

@AndrewSisley AndrewSisley Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, my bad - thanks for the suggestion Keenan 😁

  • Simplify executeMerge key block

Results: map[string]any{
"commits": []map[string]any{
{
"cid": testUtils.NewUniqueCid("collection"),
"cid": testUtils.NewUniqueCid(0),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: I was playing around here using ints as cid ids and forgot it was an existing test - I still prefer strings, but let me know your thoughts :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the input param needed or could the function generate a random ID in the NewUniqueCid func?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input param is purely descriptive, it has no relation to the value of the cid that it represents

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, if I misunderstood you - yes there needs to be a value as otherwise there is no way to describe which values should/shouldn't be the same

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ask because if generated internally the pattern for reuse could be:

cidA := testUtils.NewUniqueCid()
cidB := testUtils.NewUniqueCid()

Just a thought I had, no need to change the implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I guess we could just make it a pointer and check for equality of the reference - would be a better non-descriptive version than using int, but a fair bit more wordy as you'd need to declare the variables if used in multiple places. (I still prefer strings atm though)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strings definitely make this more comprehensible. I much prefer how it was before.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change this test back to strings in a sec, I can revert if anyone comes back very soon before merge in favour of the ints @islamaliev

  • Use strings in test

@AndrewSisley AndrewSisley force-pushed the 3212-p2p-brnchable-cols branch 4 times, most recently from 55177fc to a417d68 Compare November 15, 2024 20:05
@AndrewSisley AndrewSisley force-pushed the 3212-p2p-brnchable-cols branch from a417d68 to d1906b2 Compare November 15, 2024 20:11
@@ -22,7 +22,9 @@ import (
)

func (db *db) handleMessages(ctx context.Context, sub *event.Subscription) {
queue := newMergeQueue()
docIdQueue := newMergeQueue()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: docIDQueue would be more idomatic.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 years later and I'm still messing that up 😁

  • docIDQueue

@@ -215,7 +215,12 @@ func addDocActorRelationshipACP(
}

if action.ExpectedError == "" && !action.ExpectedExistence {
waitForUpdateEvents(s, actionNodeID, map[string]struct{}{docID: {}})
expect := make([]map[string]struct{}, action.CollectionID+1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: You prefer creating a CollectionID+1 lenght array that will have just one item instead of passing the collection ID to the waitForUpdateEvents function?

Copy link
Contributor Author

@AndrewSisley AndrewSisley Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started working from the waitForUpdateEvents (and late, as a PR fixup). waitForUpdateEvents was multi-collection friendly and so I preserved that without thinking too hard about it, although I think you are right and that all the callers are dealing with a single collection.

Your suggestion makes a lot of sense :) Although I think I might be able to pass a collection instead of just the index.

  • Simplify waitForUpdateEvents

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT. Good job Andy. Just. one suggestion to improve readability of the expected result.

schemaVersionKey := keys.CollectionSchemaVersionKey{
SchemaVersionID: mp.col.Schema().VersionID,
CollectionID: mp.col.ID(),
}

if field == "" {
mcrdt = merklecrdt.NewMerkleCompositeDAG(
switch {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Nice use of the block's IsFoo methods.

"cid": testUtils.NewUniqueCid("collection, update3"),
"links": []map[string]any{
{
"cid": testUtils.NewUniqueCid("collection, update2"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: If you keep the node in the string (collection, node0 update2) it would help to know where this update comes from.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh true, I dropped that forgetting that the update was actually applied to a single node (at first). Will do, thanks Fred.

  • note update node in cid description

@AndrewSisley AndrewSisley merged commit 27893cf into sourcenetwork:develop Nov 18, 2024
42 of 43 checks passed
@AndrewSisley AndrewSisley deleted the 3212-p2p-brnchable-cols branch November 18, 2024 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/p2p Related to the p2p networking system feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle collection commits across P2P system
3 participants