bug 1907211: use europe-west4 instances in all preemptible translations GPU worker pools (#31)

Our small test worked out well, and the cost is only marginally more than the US instances. Let's open up some more capacity here. We're getting the GPU quota raised for this region in https://mozilla-hub.atlassian.net/browse/RELOPS-1011, so let's hold off merging this until that's dealt with, on the off chance we don't get the new limit we'd like.
mozilla-releng · Jul 17, 2024 · 64956fa
1 parent c21b995
Showing 1 changed file with 14 additions and 5 deletions.
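The capacity comments in this diff derive `maxCapacity` from a per-region GPU limit. A minimal sketch of that arithmetic follows, assuming the figures stated in the comments (4 GPUs per instance, a limit of 128 GPUs per region). The function name `max_instances` and its parameters are hypothetical and are not part of `worker-pools.yml`.

```python
def max_instances(regions, gpu_limit_per_region, gpus_per_instance):
    """Upper bound on concurrently running instances across all regions.

    Hypothetical helper for illustration only; the inputs come from the
    comments in the diff, not from any queried quota.
    """
    total_gpus = len(regions) * gpu_limit_per_region
    # Each instance consumes a fixed number of GPUs, so round down.
    return total_gpus // gpus_per_instance


REGIONS = ["us-central1", "us-west1", "us-east1", "europe-west4"]

# 4 regions * 128 GPUs = 512 GPUs; 512 / 4 GPUs per instance = 128 instances.
print(max_instances(REGIONS, gpu_limit_per_region=128, gpus_per_instance=4))
```

Under the same assumptions, the three US regions alone give 96 instances, which matches the previous `maxCapacity` of 96 in two of the pools.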
diff --git a/worker-pools.yml b/worker-pools.yml
@@ -1584,7 +1584,10 @@ pools:
             maxTaskRunTime: 2592000
             enableInteractive: true
       minCapacity: 0
-      maxCapacity: 97
+      # We use 4 GPUs per instance across 4 regions, with a limit of 128 GPUs
+      # per region at any given time. 4 regions * 128 GPUs = 512 total GPUs.
+      # 512 GPUs / 4 per instance = 128 instances possibly running at once.
+      maxCapacity: 128
       implementation: generic-worker/worker-runner-linux
       regions: [us-central1, us-west1, us-east1, europe-west4]
       image: monopacker-translations-worker
@@ -1615,9 +1618,12 @@ pools:
             maxTaskRunTime: 2592000
             enableInteractive: true
       minCapacity: 0
-      maxCapacity: 96
+      # We use 4 GPUs per instance across 4 regions, with a limit of 128 GPUs
+      # per region at any given time. 4 regions * 128 GPUs = 512 total GPUs.
+      # 512 GPUs / 4 per instance = 128 instances possibly running at once.
+      maxCapacity: 128
       implementation: generic-worker/worker-runner-linux
-      regions: [us-central1, us-west1, us-east1]
+      regions: [us-central1, us-west1, us-east1, europe-west4]
       image: monopacker-translations-worker
       instance_types:
         - minCpuPlatform: Intel Skylake
@@ -1713,9 +1719,12 @@ pools:
             maxTaskRunTime: 2592000
             enableInteractive: true
       minCapacity: 0
-      maxCapacity: 96
+      # We use 4 GPUs per instance across 4 regions, with a limit of 128 GPUs
+      # per region at any given time. 4 regions * 128 GPUs = 512 total GPUs.
+      # 512 GPUs / 4 per instance = 128 instances possibly running at once.
+      maxCapacity: 128
       implementation: generic-worker/worker-runner-linux
-      regions: [us-central1, us-west1, us-east1]
+      regions: [us-central1, us-west1, us-east1, europe-west4]
       image: monopacker-translations-worker
       instance_types:
         - minCpuPlatform: Intel Skylake