Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancy between the probabilistic expectation from a peer and the documentation #997

Open
vicnetto opened this issue Oct 25, 2024 · 1 comment

Comments

@vicnetto
Copy link

According to the documentation, a node can be removed from the routing table if it has “not been useful within the time period during which they are probabilistically expected to have been utilized in a refresh”. This duration is calculated using the following equation in the documentation:

$\text{maxLastSuccessfulOutboundThreshold} = \log(1/K) \cdot \log\left(1 - \frac{\alpha}{K}\right) \cdot \text{refreshPeriod}$

However, in the implementation found in dht.go, the equation is slightly modified to:

$\text{maxLastSuccessfulOutboundThreshold} = \log(1/K) / \log\left(1 - \frac{\alpha}{K}\right) \cdot \text{refreshPeriod}$

go-libp2p-kad-dht/dht.go

Lines 319 to 335 in 18a758c

var maxLastSuccessfulOutboundThreshold time.Duration
// The threshold is calculated based on the expected amount of time that should pass before we
// query a peer as part of our refresh cycle.
// To grok the Math Wizardy that produced these exact equations, please be patient as a document explaining it will
// be published soon.
if cfg.Concurrency < cfg.BucketSize { // (alpha < K)
l1 := math.Log(float64(1) / float64(cfg.BucketSize)) // (Log(1/K))
l2 := math.Log(float64(1) - (float64(cfg.Concurrency) / float64(cfg.BucketSize))) // Log(1 - (alpha / K))
maxLastSuccessfulOutboundThreshold = time.Duration(l1 / l2 * float64(cfg.RoutingTable.RefreshInterval))
} else {
maxLastSuccessfulOutboundThreshold = cfg.RoutingTable.RefreshInterval
}
// construct routing table
// use twice the theoritical usefulness threhold to keep older peers around longer
rt, err := makeRoutingTable(dht, cfg, 2*maxLastSuccessfulOutboundThreshold)

After calculating this value, the result is multiplied by two because “IPFS defines useful as responding within 2x the time it takes any other peer from our routing table to respond to us.”

As a result, the difference between the expectation from a peer in the implementation and the documentation is:

Implementation: 1h26m26.313713864s
Documentation: 41m31.780054951s
Difference: ~45m

Is this discrepancy intended? Where can I find an explanation for this calculation? Did I miss something?

Since I was unsure whether it was a documentation or implementation issue, I decided to post directly in this repository.

@guillaumemichel
Copy link
Contributor

Thanks @vicnetto for spotting this!

  • maxLastSuccessfulOutboundThreshold was introduced in go-libp2p-kad-dht on 2020-05-19.
  • Documentation was added later on 2020-08-19.

I am not sure where this calculation comes from. It is probably a typo in the docs. maxLastSuccessfulOutboundThreshold is expected to grow as the bucket size (K) grows (see plot below), because it is less likely to hit each individual peer during the refresh. Hence the code is correct, and the doc is wrong: it should be l1 / l2 and not l1 * l2.

output

I have opened ipfs/ipfs-docs#1946 to address the docs discrepancy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants