Move each hub to its own nodegroup on the openscapes cluster #4482

Closed
6 tasks done
Tracked by #4453
yuvipanda opened this issue Jul 23, 2024 · 4 comments · Fixed by #4499
@yuvipanda
Member

yuvipanda commented Jul 23, 2024

After the outcome of the spike in #4465, we are going to give each hub its own nodepool that is properly tagged to track cost on a per-hub basis.

  • Create the same set of nodes for staging, prod and workshop
  • Configure each of the hubs so their users only spawn on the nodepools designated for them
  • Tag each of the nodepools with 2i2c:hub-name to match the name of the hub
  • Tag each of the notebook nodepools with 2i2c:node-purpose set to user
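A minimal sketch of the task list above, in Python. The dictionary layout, the `build_nodegroups` helper, the nodegroup naming scheme, and the instance types are all hypothetical; the real cluster config format is not shown in this issue. Only the hub names and the two tag keys come from the list, and the `2i2c/hub-name` label key follows the form used later in this thread.

```python
# Hypothetical sketch: one notebook nodegroup per hub, tagged for cost tracking.
# The dict layout and helper name are assumptions, not the real cluster config.

HUBS = ["staging", "prod", "workshop"]  # hubs named in the task list


def build_nodegroups(hubs, instance_types):
    """Return one nodegroup definition per (hub, instance type) pair."""
    nodegroups = []
    for hub in hubs:
        for instance_type in instance_types:
            nodegroups.append(
                {
                    "name": f"nb-{hub}-{instance_type.replace('.', '-')}",
                    "instance_type": instance_type,
                    # Cloud-level tags, intended for cost attribution.
                    "tags": {"2i2c:hub-name": hub, "2i2c:node-purpose": "user"},
                    # Kubernetes-level label, usable in a node selector.
                    "labels": {"2i2c/hub-name": hub},
                }
            )
    return nodegroups


# Instance types here are illustrative only.
nodegroups = build_nodegroups(HUBS, ["r5.xlarge", "r5.4xlarge"])
for ng in nodegroups:
    print(ng["name"], ng["tags"])
```

Every hub gets the same set of node sizes, and each nodegroup carries both the cloud tags and a matching Kubernetes label, so the same definition can serve cost tracking and pod scheduling.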

Definition of done

  • Users across these hubs will spawn on to their own nodepools, and not share them with users on other hubs.
  • There is a tag 2i2c:hub-name and 2i2c:node-purpose on all the nodes spawned when users log on to the hub. You can verify this by looking at EC2 instances on the AWS console.
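The tag check in the second point could also be scripted rather than done by eye in the console. The sketch below assumes the input has the shape of the `Reservations` list in an EC2 DescribeInstances response; the `missing_tags` helper and the sample instances are made up for illustration.

```python
# Sketch: report instances that lack either cost-tracking tag.
# `reservations` mimics the "Reservations" list of an EC2 DescribeInstances
# response; the sample data below is invented for illustration.

REQUIRED_TAG_KEYS = {"2i2c:hub-name", "2i2c:node-purpose"}


def missing_tags(reservations):
    """Map instance ID -> sorted list of required tag keys it lacks."""
    problems = {}
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            present = {tag["Key"] for tag in instance.get("Tags", [])}
            missing = REQUIRED_TAG_KEYS - present
            if missing:
                problems[instance["InstanceId"]] = sorted(missing)
    return problems


sample = [
    {
        "Instances": [
            {
                "InstanceId": "i-aaaa",
                "Tags": [
                    {"Key": "2i2c:hub-name", "Value": "staging"},
                    {"Key": "2i2c:node-purpose", "Value": "user"},
                ],
            },
            {
                "InstanceId": "i-bbbb",
                "Tags": [{"Key": "2i2c:hub-name", "Value": "prod"}],
            },
        ]
    }
]

print(missing_tags(sample))
```

An empty result would mean every listed instance carries both tags. With boto3, the input would presumably be the `Reservations` value returned by the EC2 client's `describe_instances` call.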

Trade-offs

Since our health check triggers a user spawn, this means that instead of spawning 1 node when we trigger deploys on all of the hubs, we will trigger 3 separate nodes. This is fine - the autoscaler reclaims them after ~10min, and even with the largest nodes that doesn't cost enough to be a problem.

Out of scope

dask-gateway is out of scope here, and handled by #4485

@yuvipanda yuvipanda changed the title Create separate nodegroups for each hub on the openscapes cluster Create separate nodegroups for staging hub on the openscapes cluster Jul 24, 2024
@yuvipanda yuvipanda changed the title Create separate nodegroups for staging hub on the openscapes cluster Move each hub to its own nodegroup on the openscapes cluster Jul 24, 2024
@sgibson91
Member

I think there's a language problem here (and in #4486) of tags vs. labels, both of which exist. As I understand it, tags operate at the cloud vendor level, but labels can be used as selectors at the kubernetes level. If we want pods to be spun up in specific node pools, we definitely want to be using labels. But I don't know if the cost-tracking system we are going to use will be looking at cloud tags or kubernetes labels.
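One practical consequence of the tag/label split: a key such as 2i2c:hub-name works as a cloud tag, but a colon is not allowed in a Kubernetes label key, so the label needs a different spelling such as 2i2c/hub-name. The sketch below checks a key against the documented Kubernetes label-key syntax (optional DNS-subdomain prefix, a slash, then a name of at most 63 characters). The `is_valid_label_key` helper is illustrative and is not part of any Kubernetes client library.

```python
import re

# Name segment: at most 63 chars, alphanumeric at both ends,
# with '-', '_' and '.' allowed in between.
NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]{0,61}[A-Za-z0-9])?")
# Optional prefix: a DNS subdomain, i.e. lowercase labels separated by dots.
PREFIX_RE = re.compile(
    r"[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*"
)


def is_valid_label_key(key):
    """Return True if `key` follows the Kubernetes label-key syntax."""
    prefix, sep, name = key.rpartition("/")
    if sep and (len(prefix) > 253 or not PREFIX_RE.fullmatch(prefix)):
        return False
    return NAME_RE.fullmatch(name) is not None


for key in ["2i2c:hub-name", "2i2c/hub-name"]:
    print(key, is_valid_label_key(key))
```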

In the end, neither is costly to apply, so I will probably just do both.

@sgibson91
Member

I'm attempting this, but I have no idea where to put the node_selector.2i2c/hub-name: <hub-name> value. I'd have to copy the whole profile list of image options out of the common values file, because kubespawner_override.node_selector is a true override and doesn't merge with singleuser.nodeSelector. Helm overwriting lists also means I can't merge config that way either. #4499 represents what I've tried for staging, but it doesn't work.

@yuvipanda
Member Author

kubespawner_override.node_selector is a true override and doesn't merge with singleuser.nodeSelector

If they are dictionaries (rather than lists), they should merge (since jupyterhub/kubespawner#650). So your instinct to put it in singleuser.nodeSelector is correct. You can also try hub.config.KubeSpawner.node_selector, although it should be the same as singleuser.nodeSelector.
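To make the dictionary-versus-list distinction concrete, here is a small Python sketch of that merge behaviour: dictionaries merge key by key, while lists are replaced wholesale. It is an illustration only, not kubespawner's or Helm's actual code; `merge_override`, the `some_list` key, and the sample values are hypothetical.

```python
# Sketch of the merge behaviour described above: dictionaries merge key by
# key, lists (and scalars) are replaced outright. Illustration only; this is
# NOT kubespawner's or Helm's implementation.


def merge_override(base, override):
    """Return `base` with `override` applied on top, without mutating either."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for key, value in override.items():
            merged[key] = merge_override(base[key], value) if key in base else value
        return merged
    # Lists and scalars: the override wins outright.
    return override


# Hub-wide default, e.g. what singleuser.nodeSelector would contribute.
hub_defaults = {
    "node_selector": {"2i2c/hub-name": "staging"},
    "some_list": ["a", "b"],  # hypothetical list-valued setting
}
# Per-profile override, e.g. what kubespawner_override would contribute.
profile_override = {
    "node_selector": {"node.kubernetes.io/instance-type": "r5.xlarge"},
    "some_list": ["c"],
}

effective = merge_override(hub_defaults, profile_override)
print(effective["node_selector"])
print(effective["some_list"])
```

Under these semantics the hub-name selector survives alongside the per-profile instance-type selector, whereas the list from the defaults is lost entirely, which matches the concern about Helm overwriting lists.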

it doesn't work.

Can you provide more detail?

@sgibson91
Member

sgibson91 commented Jul 26, 2024

Image

This is using my first instinct to add singleuser.nodeSelector. We're basically not triggering the new node pool(s) at all. Currently deployed config is in #4499
