
Cache Node Assignment Service #958

Merged: 4 commits into slackhq:master on Jul 26, 2024

Conversation


@kx-chen kx-chen commented Jun 20, 2024

Summary

This PR adds a basic draft of the cache node assignment service, plus tests. Several other changes were needed in order to support the service; they include:

  • Adding CacheNodeAssignment and CacheNodeMetadata classes (plus serializers, ZK store classes, proto file changes)
  • Having the CachingChunkManager listen for changes in assignments and download data as appropriate (see the sketch after this list)
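
For illustration, the listener flow in the second bullet looks roughly like the sketch below. Every name in it is hypothetical; it only shows the shape of the pattern, not the actual classes added in this PR.

import java.util.function.Consumer;

// Hypothetical sketch of "listen for assignment changes, then download".
// None of these names are the real classes from this PR.
class AssignmentListenerSketch {
  interface AssignmentStore {
    // Invokes the callback whenever an assignment is added or updated.
    void addListener(Consumer<Assignment> callback);
  }

  record Assignment(String cacheNodeId, String snapshotId) {}

  static void wireUp(AssignmentStore store, String localNodeId) {
    store.addListener(assignment -> {
      // Only act on assignments targeting this cache node.
      if (assignment.cacheNodeId().equals(localNodeId)) {
        System.out.println("Downloading data for " + assignment.snapshotId());
      }
    });
  }
}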


@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch 3 times, most recently from 75e651e to da142b3 on June 20, 2024 at 21:44

mansu commented Jun 21, 2024

We already have code that does cache node assignment. How is this different from current logic?

Upon closer reading of the code, it seems we are adding dynamic chunks. It would be good to clarify the design intent here.

If we want to support more chunks per cache node, why not increase the number of slots dynamically? Why do we need to change chunk assignment?


kx-chen commented Jun 21, 2024

Hi Suman! Great to see you. Good question!

Yes, we are adding dynamic chunks. However, the goal is not just to support more chunks per cache node but also to improve disk space utilization.

I'm sure you already know, but chunks are not always a fixed size. While we aim for uniform size, they can vary. Since slots assume each chunk is the same size, this can lead to under-utilization of disk space if chunks are smaller than the assumed size.

For example, with a 3TB disk and 15GB per slot, we get 200 slots. If a chunk is only 10GB, there's still 5GB of unused space allocated to that slot; if every chunk came in at 10GB, that would add up to 1TB of idle disk.

The change in assignment algorithm addresses this. We plan to use a bin-packing algorithm to pack all chunks into the least number of cache nodes, optimizing disk space usage and reducing costs.
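
For a concrete picture of what we mean, here is a minimal first-fit-decreasing sketch (a classic bin-packing heuristic). The capacities and sizes are in GB, and all names are illustrative rather than the actual service code:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Minimal first-fit-decreasing sketch: pack chunk sizes (in GB) onto the
// fewest nodes of a given capacity. Illustrative only, not the PR's code.
class BinPackingSketch {
  static List<List<Long>> pack(List<Long> chunkSizesGb, long nodeCapacityGb) {
    List<List<Long>> nodes = new ArrayList<>();   // chunks placed per node
    List<Long> freeSpace = new ArrayList<>();     // remaining GB per node
    List<Long> sorted = new ArrayList<>(chunkSizesGb);
    sorted.sort(Comparator.reverseOrder());       // place largest chunks first
    for (long chunk : sorted) {
      boolean placed = false;
      for (int i = 0; i < nodes.size(); i++) {
        if (freeSpace.get(i) >= chunk) {          // first existing node that fits
          nodes.get(i).add(chunk);
          freeSpace.set(i, freeSpace.get(i) - chunk);
          placed = true;
          break;
        }
      }
      if (!placed) {                              // no node fits: open a new one
        nodes.add(new ArrayList<>(List.of(chunk)));
        freeSpace.add(nodeCapacityGb - chunk);
      }
    }
    return nodes;
  }
}

In the 3TB example above, leftover space from small chunks gets reused by other chunks instead of being stranded inside a fixed-size slot.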

Let me know if that answers your question! And apologies for the lack of context in the PR—it's still very much a work in progress 😅


mansu commented Jun 28, 2024

Hi Kai, great to see you again!

Thanks for the context. It would be ideal to write a design doc or open a discussion before we start coding, to document the goals of this change.

It seems the goal of this change is to improve disk utilization, which is a great goal. I am, however, curious why so much disk space goes unused in practice. If you are using size-based rollover, at any given time only a small percentage of chunks should differ in size from the cache slot; in steady state the chunks should all be roughly the same size. If the mismatch is happening regularly, is it not a tuning issue? Also, even if we temporarily pack smaller chunks better (say, after a deployment), would we not get back to the same configuration as a fixed-slot setup once rollover happens an hour later? I am asking since I may be missing a case where this change has longer-term advantages.

Also, the current implementation with a static number of slots employs a constant-work paradigm for the entire cluster. This makes cluster operations mechanical, predictable, and easy to automate.

The dynamic slot model proposed here makes the system less predictable overall, which is a big sacrifice from a design perspective.

To carefully evaluate the effect of these changes, it may be better to break this work down into a few smaller changes:

  • The most basic feature needed here is the ability to create a dynamic number of slots on cache nodes instead of a fixed number.
  • Once we can dynamically change the number of slots, the next feature to consider is whether we should allow nodes other than the cache node (e.g., the cluster manager) to create the slots.
  • There may be many ways to optimize the cache assignment algorithm, so we should probably make the assignment algorithm change independent of the dynamic slot work.

It is possible that we can win back the disk utilization after the first two changes. It seems we are also changing the cache assignment algorithm in this PR; I think that change may be better done independently.

Overall, I think it may be ideal if we approach this change in a more incremental manner.

@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch from da142b3 to 44f1592 on July 1, 2024 at 22:53
@kx-chen kx-chen requested a review from bryanlb July 1, 2024 22:54
@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch 3 times, most recently from 99baf84 to 7c90558 on July 2, 2024 at 23:06
@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch 5 times, most recently from 75e8c77 to 5e26cf0 on July 22, 2024 at 22:06
@kx-chen kx-chen requested a review from bryanlb July 22, 2024 22:22
@kx-chen kx-chen marked this pull request as ready for review July 22, 2024 22:22
@kx-chen kx-chen changed the title from "[DRAFT] Cache Node Assignment Service" to "Cache Node Assignment Service" on Jul 22, 2024

@bryanlb bryanlb left a comment


Looking pretty good! This is a pretty huge change, but you've done a good job of isolating the new behavior behind a "default false" system property.

I took a look over the code coverage as well, and it looks like you did a decent job of exercising the new paths. I think with a couple small tweaks this is probably good to merge up.
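
(For anyone skimming: a "default false" system property gate in Java looks roughly like the sketch below. The property name here is made up for illustration; it is not necessarily the one used in this PR.)

public class FeatureGateSketch {
  public static void main(String[] args) {
    // Hypothetical "default false" gate; the property name is illustrative.
    boolean dynamicAssignmentEnabled =
        Boolean.parseBoolean(
            System.getProperty("example.dynamicChunks.enabled", "false"));
    System.out.println(dynamicAssignmentEnabled
        ? "Using new cache node assignment path"
        : "Using existing fixed-slot path");
  }
}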

@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch 4 times, most recently from 1d2075c to a660974 on July 25, 2024 at 16:27
@kx-chen kx-chen requested a review from bryanlb July 25, 2024 16:28
@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch 3 times, most recently from ace315d to 01852ca on July 25, 2024 at 23:11
@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch 2 times, most recently from ad99d00 to 9a17959 on July 25, 2024 at 23:48
@kx-chen kx-chen force-pushed the kx-chen/cache-node-assignment branch from 9a17959 to 6c0c33e on July 25, 2024 at 23:58

@bryanlb bryanlb left a comment


Couple of nits, but I would consider addressing them in a future PR. I'm not seeing anything that sticks out that we haven't discussed, and you have already successfully deployed this to several test environments.

@@ -128,6 +171,142 @@ public ReadOnlyChunkImpl(
LOG.debug("Created a new read only chunk - zkSlotId: {}", slotId);
}

/*
======================================================

Nit: These types of comments tend to go stale pretty quick, as folks refactor and miss a comment block several methods up when they're adding new methods.

snapshotMetadataStore.close();
searchMetadataStore.close();

LOG.debug("Closed chunk");

Nit: This would be more useful if it contained information about what chunk was closed, otherwise this isn't much different than a metric.
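
(A sketch of the kind of log line the nit is asking for, assuming the same slotId used in the constructor's debug log above:)

// Includes which chunk was closed, mirroring the creation log's slotId.
LOG.debug("Closed chunk - zkSlotId: {}", slotId);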

@kx-chen kx-chen merged commit f662fbd into slackhq:master Jul 26, 2024
2 checks passed