[Serve] Introduce scale_to_0_delay_seconds for autoscaler #4532

Open
gaocegege opened this issue Jan 5, 2025 · 0 comments


We’ve introduced downscale_delay_seconds, which lets you adjust how long the autoscaler waits before scaling down. In many cases we want to scale efficiently from 1 to many or from many to 1, but we may want a longer delay before scaling all the way down to 0. Many Kubernetes autoscalers have a separate setting for scaling to 0.

For example, you could set downscale_delay_seconds to 1200 (20 minutes) and scale_to_0_delay_seconds to 3600 (1 hour) to fine-tune the scaling behavior.
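To make the proposed behavior concrete, here is a rough sketch of how the autoscaler could pick which delay applies. This is not the existing SkyServe implementation: `DelayConfig`, `required_downscale_delay`, and `should_downscale` are hypothetical names, and falling back to downscale_delay_seconds when scale_to_0_delay_seconds is unset is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DelayConfig:
    """Hypothetical container for the two delays discussed in this issue."""
    downscale_delay_seconds: int = 1200
    # Proposed field. None means "reuse downscale_delay_seconds" (assumption).
    scale_to_0_delay_seconds: Optional[int] = None


def required_downscale_delay(config: DelayConfig, target_replicas: int) -> int:
    """Return how long a lower target must persist before scaling down."""
    if target_replicas == 0 and config.scale_to_0_delay_seconds is not None:
        return config.scale_to_0_delay_seconds
    return config.downscale_delay_seconds


def should_downscale(config: DelayConfig, current_replicas: int,
                     target_replicas: int, seconds_below_target: float) -> bool:
    """Decide whether to scale down now.

    seconds_below_target is how long the desired replica count has been
    continuously lower than the current replica count.
    """
    if target_replicas >= current_replicas:
        return False  # Not a downscale decision.
    delay = required_downscale_delay(config, target_replicas)
    return seconds_below_target >= delay


config = DelayConfig(downscale_delay_seconds=1200,
                     scale_to_0_delay_seconds=3600)
# Many to 1 uses the regular delay; 1 to 0 waits for the longer one.
many_to_one = should_downscale(config, 3, 1, seconds_below_target=1500)
one_to_zero = should_downscale(config, 1, 0, seconds_below_target=1500)
```

With the values from the example, 1500 seconds of low load is enough to go from 3 replicas to 1, but not enough to remove the last replica.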

Version & Commit info:

  • sky -v: PLEASE_FILL_IN
  • sky -c: PLEASE_FILL_IN