[AWS] Move ci.jenkins.io from Azure (sponsorship) to AWS (sponsorship) #4313

dduportal opened this issue Sep 28, 2024 · 3 comments
dduportal commented Sep 28, 2024

This issue is a top-level EPIC about ci.jenkins.io in the context of the cloud billing and Jenkins Infrastructure sponsorship.

  • Since Migrate ci.jenkins.io to the sponsored subscription #3913, ci.jenkins.io runs in the Azure sponsored subscription, which has around $49k in credits valid until May 2025.
    • ci.jenkins.io consumes around $8k per month in this subscription, out of the $10k consumed monthly overall
    • Challenge: we want these credits to last at least until the end of April 2025
  • AWS granted us $60k in credits ("AWS Sponsorship account"), valid until the end of January 2025, which we have not been able to use until now (nobody had time to spend on it)
    • We want to consume as much of these credits as possible, for both our benefit and AWS'
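
The credit figures above can be turned into a rough runway estimate. The sketch below is only an illustration: it assumes a constant monthly burn and counts 7 months from the issue date (end of September 2024) to the end of April 2025. The function and variable names are hypothetical.

```python
def runway_months(credits_usd: float, monthly_burn_usd: float) -> float:
    """Months until the credits are exhausted, assuming a constant monthly burn."""
    if monthly_burn_usd <= 0:
        raise ValueError("monthly burn must be positive")
    return credits_usd / monthly_burn_usd


# Figures quoted in the issue (all approximate).
AZURE_CREDITS_USD = 49_000
AZURE_TOTAL_MONTHLY_USD = 10_000
CI_JENKINS_IO_MONTHLY_USD = 8_000

# Assumption: end of September 2024 -> end of April 2025.
MONTHS_NEEDED = 7

runway_with_ci = runway_months(AZURE_CREDITS_USD, AZURE_TOTAL_MONTHLY_USD)
runway_without_ci = runway_months(
    AZURE_CREDITS_USD, AZURE_TOTAL_MONTHLY_USD - CI_JENKINS_IO_MONTHLY_USD
)

print(f"Azure runway with ci.jenkins.io:    {runway_with_ci:.1f} months")
print(f"Azure runway without ci.jenkins.io: {runway_without_ci:.1f} months")
print(f"Lasts until end of April 2025 without a move? {runway_with_ci >= MONTHS_NEEDED}")
```

Under these assumptions, the Azure credits would run out well before the end of April 2025 if ci.jenkins.io stayed where it is, which is consistent with the decision described below.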

As such, we agreed on moving ci.jenkins.io from Azure (sponsored) to AWS (sponsored):

  • It's an autonomous system (no relations or hard links with other systems in the Jenkins infra)
  • It used to run in an AWS (CloudBees) account, so there should not be many "new" things
  • We can always move it back somewhere else (Azure?) in January if we run out of time or credits

This task is divided into the following distinct topics:

Tasks

  1. dduportal smerle33
  2. EC2 aws ci.jenkins.io
    dduportal
  3. aws ci.jenkins.io
    dduportal smerle33
  4. aws ci.jenkins.io
    dduportal
  5. artifact-caching-proxy aws ci.jenkins.io
    smerle33
  6. triage
  7. triage
  8. aws ci.jenkins.io triage
@dduportal dduportal added this to the infra-team-sync-2024-10-01 milestone Sep 28, 2024
@smerle33 smerle33 changed the title [AWS] Move ci.jenkins.io from Azure (sponsorship) to AWS (sponsorhsip) [AWS] Move ci.jenkins.io from Azure (sponsorship) to AWS (sponsorship) Sep 30, 2024
@dduportal

Update: we have to use the us-east-2 region (we target Availability Zone b, to spread away from the first AZ).

Rationale: in our past experience, us-east-1, the default and legacy region, is subject to many more outages than other regions, not to mention spot instance availability issues.

Note

In the former CloudBees AWS account, us-east-1 used to host the "permanent" workloads (such as the Jenkins controller),
while us-east-2 used to host ephemeral workloads (such as Jenkins agents).
We do not have such a requirement in the Jenkins AWS account here.

smerle33 commented Dec 11, 2024

We saw this error on our packer-images build: Error launching source instance: VcpuLimitExceeded: You have requested more vCPU capacity than your current vCPU limit of 64 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.

So we took the opportunity to request a new limit of 2000 vCPUs (the same as our current Azure quota for x86_64 CPUs). Our needs are as follows (the numbers below are approximate, as we rarely reach the maximum everywhere at the same time):

- ci.jenkins.io:

@dduportal

We saw this error on our packer-images build: Error launching source instance: VcpuLimitExceeded: You have requested more vCPU capacity than your current vCPU limit of 64 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.

So we took the opportunity to request a new limit of 2000 vCPUs (the same as our current Azure quota for x86_64 CPUs). Our needs are as follows (the numbers below are approximate, as we rarely reach the maximum everywhere at the same time):

- ci.jenkins.io:

  * 400 max CPUs for VM agents (plugins and ATH): https://github.com/jenkins-infra/jenkins-infra/blob/8129db26f6999be090be33770193c3ea2278a009/hieradata/clients/controller.sponsorship.ci.jenkins.io.yaml#L363

  * 1440 max CPUs for container agents (with a bom build): https://github.com/jenkins-infra/azure/blob/0dd69e354c3d764d72763ae3f4b23ec29b67cf69/ci.jenkins.io-kubernetes-agents.tf#L153

- infra.ci.jenkins.io: 200 (50*4): https://github.com/jenkins-infra/kubernetes-management/blob/06d4e7d0354c60470b96c7bc04aa161d1f85ba7c/config/jenkins_infra.ci.jenkins.io.yaml#L233

Good news: AWS support accepted (on the second try) our quota increase request to 2000 vCPUs \o/
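
As a quick sanity check of the sizing quoted in this comment, the per-workload peaks can be summed and compared with the requested quota. This is a minimal sketch: the dictionary keys and variable names are hypothetical labels, and the numbers are the approximate maxima given above.

```python
REQUESTED_VCPU_QUOTA = 2000
PREVIOUS_VCPU_QUOTA = 64  # limit reported in the VcpuLimitExceeded error

# Approximate peak vCPU needs, as listed in the comment.
peak_vcpus = {
    "ci.jenkins.io VM agents": 400,
    "ci.jenkins.io container agents": 1440,
    "infra.ci.jenkins.io": 50 * 4,
}

total_peak = sum(peak_vcpus.values())
headroom = REQUESTED_VCPU_QUOTA - total_peak

for name, vcpus in peak_vcpus.items():
    print(f"{name:32s} {vcpus:5d}")
print(f"{'total (all peaks at once)':32s} {total_peak:5d}")
print(f"{'requested quota':32s} {REQUESTED_VCPU_QUOTA:5d}")
print(f"{'headroom':32s} {headroom:5d}")
```

The theoretical total is slightly above the requested quota, which fits the remark that the maximum is rarely reached everywhere at the same time.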
