Skip to content

Conversation

@seiflotfy
Copy link
Member

instead of 1 big heap we now split he heap in n (n=6) heaps to and assign elements to to one of the heaps based on the hash. This way we reduce switching on heavy hitters and reduce merge errors

instead of 1 big heap we now split he heap in n (n=6) heaps to and
assign elements to to one of the heaps based on the hash. This way
we reduce switching on heavyhitters and reduce merge errors
Copy link

@mschoch mschoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I spotted a few mechanical things noted here. And my understanding is that you have replaced the single map/heap with N partitioned map/heaps, where the hash of the key (which you computed anyway) is used to choose the partition. I will wait for your explanation to review again with that in mind.

topk.go Outdated
var (
err error
)
for i := 0; i < 6; i++ {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be nPartitions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep

topk.go Outdated
if len(s.k.elts) < s.n {
// there is free space
if len(s.p[i].elts) < s.n {
// there is free sp[i]ace
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Search/Replace error or some deep joke I don't get?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a joke :P

idx2, ok2 := other.p[i].m[k]
xhash := reduce(metro.Hash64Str(k, 0), len(s.alphas))
min1 := other.alphas[xhash]
min2 := other.alphas[xhash]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min1 and min2 are always assigned the same value, and they appear to only be read inside this scope. Is there some reason we need both?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants