Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deployer: add --skip-refresh flag to deploy command #4141

Merged
merged 3 commits into from
Jul 4, 2024

Conversation

AIDEA775
Copy link
Contributor

This is something I wrote some time ago and now I will open a PR to discuss.

The deployer deploy command runs helm dep up several times to ensure that the Helm charts and their dependencies are up-to-date.

I think this is okay, but if I need several deployments in the same day, it's not worth running the update each time.

So this PR adds a flag --skip-update to skip that phase and run the validation directly.

Example:

$ time deployer deploy projectpythia staging
Getting updates for unmanaged Helm repositories...
...Successfully got an update from the "https://2i2c.org/binderhub-service/" chart repository
...Successfully got an update from the "https://jupyterhub.github.io/helm-chart/" chart repository
Saving 2 charts
Downloading jupyterhub from repo https://jupyterhub.github.io/helm-chart/
Downloading binderhub-service from repo https://2i2c.org/binderhub-service/
Deleting outdated charts
Getting updates for unmanaged Helm repositories...
...Successfully got an update from the "https://helm.dask.org/" chart repository
Saving 2 charts
Downloading dask-gateway from repo https://helm.dask.org/
Deleting outdated charts
Getting updates for unmanaged Helm repositories...
...Successfully got an update from the "https://cryptnono.github.io/cryptnono/" chart repository
...Successfully got an update from the "https://kubernetes.github.io/ingress-nginx" chart repository
...Successfully got an update from the "https://kubernetes.github.io/autoscaler" chart repository
...Successfully got an update from the "https://grafana.github.io/helm-charts" chart repository
...Successfully got an update from the "https://prometheus-community.github.io/helm-charts" chart repository
Saving 5 charts
Downloading prometheus from repo https://prometheus-community.github.io/helm-charts
Downloading grafana from repo https://grafana.github.io/helm-charts
Downloading ingress-nginx from repo https://kubernetes.github.io/ingress-nginx
Downloading cluster-autoscaler from repo https://kubernetes.github.io/autoscaler
Downloading cryptnono from repo https://cryptnono.github.io/cryptnono/
Deleting outdated charts
1 / 1: Validating non-encrypted hub values files for staging...
Added new context arn:aws:eks:us-west-2:590183926898:cluster/projectpythia to /tmp/tmprdzpqsaa
Deploying hub staging...
Running helm upgrade --install --create-namespace --wait --namespace=staging staging /mnt/stuff/Documentos/CCAD/2i2c/2i2c-infrastructure/helm-charts/basehub --values=/mnt/stuff/Documentos/CCAD/2i2c/2i2c-infrastructure/config/clusters/projectpythia/common.values.yaml --values=/mnt/stuff/Documentos/CCAD/2i2c/2i2c-infrastructure/config/clusters/projectpythia/staging.values.yaml --values=/tmp/tmp1dwrt2me
Release "staging" has been upgraded. Happy Helming!
NAME: staging
LAST DEPLOYED: Mon May 27 18:09:55 2024
NAMESPACE: staging
STATUS: deployed
REVISION: 5
TEST SUITE: None

________________________________________________________
Executed in  143,53 secs    fish           external
   usr time   18,31 secs  887,00 micros   18,31 secs
   sys time    2,51 secs  272,00 micros    2,50 secs
$ time deployer deploy projectpythia staging --skip-update 
1 / 1: Validating non-encrypted hub values files for staging...
Added new context arn:aws:eks:us-west-2:590183926898:cluster/projectpythia to /tmp/tmplxtcjxxv
Deploying hub staging...
Running helm upgrade --install --create-namespace --wait --namespace=staging staging /mnt/stuff/Documentos/CCAD/2i2c/2i2c-infrastructure/helm-charts/basehub --values=/mnt/stuff/Documentos/CCAD/2i2c/2i2c-infrastructure/config/clusters/projectpythia/common.values.yaml --values=/mnt/stuff/Documentos/CCAD/2i2c/2i2c-infrastructure/config/clusters/projectpythia/staging.values.yaml --values=/tmp/tmpv0eh856_
Release "staging" has been upgraded. Happy Helming!
NAME: staging
LAST DEPLOYED: Mon May 27 18:11:09 2024
NAMESPACE: staging
STATUS: deployed
REVISION: 6
TEST SUITE: None

________________________________________________________
Executed in   69,92 secs    fish           external
   usr time   10,53 secs  581,00 micros   10,53 secs
   sys time    1,41 secs  175,00 micros    1,41 secs

@AIDEA775 AIDEA775 requested a review from a team May 27, 2024 21:15
@AIDEA775 AIDEA775 self-assigned this May 27, 2024

This comment was marked as outdated.

@consideRatio
Copy link
Contributor

Nice! Also if you want to deploy in parallell from separate terminal sessions at the same time, they can interfere with each other if they do the update step - with this flag, it could be made safe to do in parallell without waiting for the update sequence to complete.

@consideRatio
Copy link
Contributor

Suitable PR to use the deployer:skip-deploy label on i think - as this doesn't change any hub and running tests just spends cloud resources and energy. I'll add it, feel free to remove if you disagree!

I'll review the implementation tomorrow morning

@consideRatio consideRatio added the deployer:skip-deploy Skips deployment of anything (support, staging, prod) label May 27, 2024
@yuvipanda yuvipanda removed the deployer:skip-deploy Skips deployment of anything (support, staging, prod) label May 27, 2024
@yuvipanda
Copy link
Member

Thank you very much for working on this, @AIDEA775!

@consideRatio I removed that label as I'd like to make sure this doesn't fundamentally change behavior on CI. I would rather have us catch issues now than much later (as happened in https://2i2c.slack.com/archives/C055A1J1DRP/p1716337516500199, where we caught many failing hubs about 9 days after they had started failing)

Copy link
Member

@yuvipanda yuvipanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helm update has a --skip-refresh flag, which skips the remote network based updates that are done - which is what slows everything down. But, it does the thing that may cause us problems - which is to pick up changes in basehub/values.yaml if we change them, and use daskhub.

So I suggest that:

  1. This flag be renamed from --skip-update to --skip-refresh, to match the helm dep command
  2. in https://2i2c.slack.com/archives/C04994X0BPC/p1716293361972159, --skip-refresh is passed unconditionally to the second two dep up commands, as one refresh is more than enough
  3. If this command is passed to deployer, then pass --skip-refresh to the first dep up command as well.

Thanks for working on this, @AIDEA775! This is a papercut that's bothered me a lot too (and sometimes i'll just commen out these dep up calls). I want to make sure that the priority in the sprint should still be to make sure we finish all the things we had previously committed to, so you can continue work on this at a slower pace if needed.

@yuvipanda
Copy link
Member

@AIDEA775 anything I can do to help move this forward?

@AIDEA775 AIDEA775 force-pushed the deployer-skip-update branch from 3bd4fa4 to 8839e90 Compare July 2, 2024 02:25
@AIDEA775
Copy link
Contributor Author

AIDEA775 commented Jul 2, 2024

Passing --skip-refresh to all helm dep up, still doing some downloading:

deployer deploy ....
Saving 3 charts
Downloading jupyterhub from repo https://jupyterhub.github.io/helm-chart/
Downloading binderhub-service from repo https://2i2c.org/binderhub-service/
Downloading dask-gateway from repo https://helm.dask.org/
Deleting outdated charts
Saving 1 charts
Deleting outdated charts
Saving 5 charts
Downloading prometheus from repo https://prometheus-community.github.io/helm-charts
Downloading grafana from repo https://grafana.github.io/helm-charts
Downloading ingress-nginx from repo https://kubernetes.github.io/ingress-nginx
Downloading cluster-autoscaler from repo https://kubernetes.github.io/autoscaler
Downloading cryptnono from repo https://cryptnono.github.io/cryptnono/
Deleting outdated charts
1 / 1: Validating non-encrypted hub values files for staging...
...

I tried to add the --skip-refresh to the seconds two dep up, but the pipeline failed with:

Error: no cached repository for helm-manager-a3d71d79da3a7e97fc44127f373aa765c1b71499cb48cdf46a0c10608bd437bc found. (try 'helm repo update'): open /home/runner/.cache/helm/repository/helm-manager-a3d71d79da3a7e97fc44127f373aa765c1b71499cb48cdf46a0c10608bd437bc-index.yaml: no such file or directory

I think it's better to maintain the if statement to entirely skip the call to _prepare_helm_charts_dependencies_and_schemas.

@yuvipanda
Copy link
Member

alright, i think this is a good enough improvement that we should merge it. Thanks @AIDEA775.

@yuvipanda yuvipanda changed the title deployer: add --skip-update flag to deploy command deployer: add --skip-refresh flag to deploy command Jul 4, 2024
@yuvipanda yuvipanda merged commit 3f39a80 into 2i2c-org:main Jul 4, 2024
38 checks passed
Copy link

github-actions bot commented Jul 4, 2024

🎉🎉🎉🎉

Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/9786721523

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants