Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: waku sync full topic support #3275

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

SionoiS
Copy link
Contributor

@SionoiS SionoiS commented Feb 3, 2025

Description

When syncing take into account the topics a node is interested in. Basically sync on the topic set intersection of the 2 nodes.

Omitting topics means wildcard, use all topics.

Also, trim the time range a node will sync based on interest.

@SionoiS SionoiS self-assigned this Feb 3, 2025
Copy link

github-actions bot commented Feb 3, 2025

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:3275

Built from 9abc1c1

@SionoiS SionoiS force-pushed the feat--sync-shard-full-support branch from 47e7f58 to 8b0951b Compare February 6, 2025 13:34
@SionoiS SionoiS marked this pull request as ready for review February 6, 2025 15:05
@SionoiS SionoiS requested review from jm-clius and a team February 6, 2025 15:05
@SionoiS
Copy link
Contributor Author

SionoiS commented Feb 10, 2025

Future @SionoiS don't merge yet. Wait for testing. Fixes would be easier without this new code.

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally direction makes sense to me. Thanks! Indeed agree that this should be merged only after the "simpler" shards support has been tested alongside basic sync functionality.

One admittedly controversial thing to consider (not for this PR): we couple the protocol very tightly to the understanding of a node being connected to a single cluster with multiple shards. This view is correct, for now, but I'm wondering if this network-level understanding of what "cluster" and "shard" represent is appropriate for lower-layer protocols. In other words, if for any reason in future we want to support multiple clusters organically, the Sync protocol will be affected. Another approach would be to simply keep using the fully qualified pubsub topic (cluster+shard or even pubsub topic hash) as dimension alongside the message ID. This will decouple the sync protocol from any particular understanding of how the network sharding design is done.

@@ -34,7 +35,7 @@ type SyncTransfer* = ref object of LPProtocol
peerManager: PeerManager

# Send IDs to reconciliation protocol for storage
idsTx: AsyncQueue[SyncID]
idsTx: AsyncQueue[(SyncID, uint16)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but these uint16 makes me think that we need a type Shard* = uint16 somewhere.

@SionoiS
Copy link
Contributor Author

SionoiS commented Feb 11, 2025

... This will decouple the sync protocol from any particular understanding of how the network sharding design is done.

Yes this a very good point! I'm almost done with the content topic support and switching to pubsub topic instead of shard is trivial. I will do that.

@SionoiS SionoiS changed the title feat: waku sync full shard support feat: waku sync full topic support Feb 11, 2025
@SionoiS SionoiS marked this pull request as draft February 11, 2025 17:09
@SionoiS SionoiS marked this pull request as ready for review February 12, 2025 20:37
Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for including the fully qualified pubsub topic. Haven't been able to review in detail yet, but already have one question below to make sure I understand the use of content topic here correctly. :)

@@ -208,29 +208,24 @@ proc mountSharding*(

proc mountStoreSync*(
node: WakuNode,
cluster: uint16,
shards: seq[uint16],
contentTopics: seq[string],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand, this means that all participating nodes would need to express a bounded content topic interest to sync on content topic level? In other words, we are not yet building something that will replace a "server-client" model where the service node supports "all" content topics (on set of shards) and only the client expresses an interest over which content topics to sync?

Copy link
Contributor Author

@SionoiS SionoiS Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty list is a wildcard accept all topics.

edit: let me check to be sure
edit2: I fixed a logic flaw in the pre-processing of topics.

Copy link
Contributor

@gabrielmer gabrielmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)) Amazing feature - thanks so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants