-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase k8s resources for spatial microservice #4026
base: staging
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe changes in this pull request involve updates to the resource limits and requests for the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## staging #4026 +/- ##
========================================
Coverage 11.73% 11.73%
========================================
Files 114 114
Lines 15331 15331
Branches 319 319
========================================
Hits 1799 1799
Misses 13532 13532 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
k8s/spatial/values-prod.yaml (1)
18-19
: Consider adjusting the memory limit to request ratioThe current configuration shows:
- Memory requests: 550Mi
- Memory limits: 1800Mi
- Ratio: ~3.3:1
While this configuration will work, consider reducing the memory limit or increasing the request to achieve a ratio closer to 2:1 or 2.5:1. This would:
- Improve resource utilization predictability
- Reduce the risk of node pressure during spikes
- Better align with Kubernetes best practices
Suggested adjustments (choose one):
resources: limits: cpu: 1000m - memory: 1800Mi + memory: 1375Mi # 2.5:1 ratio requests: cpu: 400m memory: 550MiOR
resources: limits: cpu: 1000m memory: 1800Mi requests: cpu: 400m - memory: 550Mi + memory: 720Mi # 2.5:1 ratio
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
k8s/spatial/values-prod.yaml
(1 hunks)k8s/spatial/values-stage.yaml
(1 hunks)
🔇 Additional comments (1)
k8s/spatial/values-prod.yaml (1)
15-16
: Verify the significant increase in resource limits
The CPU and memory limits have been increased substantially:
- CPU: 100m → 1000m (10x increase)
- Memory: 350Mi → 1800Mi (~5x increase)
While the PR mentions Mozilla funding support, we should verify:
- Historical resource utilization patterns justifying these increases
- Cost implications of these new limits
✅ Verification successful
Since we couldn't access the Kubernetes metrics directly, let's analyze the historical context and documentation to understand the rationale behind these resource increases. This will help us verify if these changes align with documented requirements or previous patterns.
Let's analyze the staging configuration and recent changes to understand if there's a pattern of increased resource usage or documented performance issues that justify these significant increases.
Resource increases appear reasonable based on staging environment patterns
The significant resource increases in production align with the staging environment configuration, which shows similar resource patterns:
- Staging uses 900m CPU limit vs. 1000m in production
- Staging uses 1200Mi memory limit vs. 1800Mi in production
- Both environments maintain similar CPU limit:request ratios
The incremental increase from staging to production (approximately 10-50% higher) follows a reasonable pattern for production scaling, especially with the higher replica count and load expectations in production.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Retrieve recent resource usage metrics for the spatial service
# Get recent pod metrics
kubectl top pod -n production -l app=airqo-spatial-api --sort-by=memory
# Get recent HPA events
kubectl get events -n production --field-selector involvedObject.kind=HorizontalPodAutoscaler,involvedObject.name=airqo-spatial-api
Length of output: 403
Script:
#!/bin/bash
# Let's try to analyze the git history of the values file to understand the context
# and previous resource configurations
# Show the full history of changes to the values file
git log -p k8s/spatial/values-prod.yaml
# Also check if there are any related configuration files or documentation
fd -t f "values.*ya?ml" k8s/spatial/
# Look for any related documentation about resource requirements
rg -i "resource|memory|cpu" README.md docs/
Length of output: 143
Script:
#!/bin/bash
# Let's compare the staging and production configurations
cat k8s/spatial/values-stage.yaml
# Also check for any recent changes to both files
git log --since="3 months ago" --oneline k8s/spatial/values-stage.yaml k8s/spatial/values-prod.yaml
# Look for any performance-related issues or discussions in recent commits
git log --since="3 months ago" --grep="performance\|memory\|cpu\|resource" --oneline
Length of output: 789
cpu: 900m | ||
memory: 1200Mi | ||
requests: | ||
cpu: 100m | ||
memory: 400Mi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjust CPU request to maintain reasonable limit:request ratio
The current configuration shows concerning ratios:
- CPU limit:request = 9:1 (900m:100m)
- This is significantly higher than the production ratio of 2.5:1
- High ratios can lead to resource contention and unpredictable performance
- Memory limit:request = 3:1 (1200Mi:400Mi)
- This aligns well with production's ratio
Recommended changes to maintain environment parity and best practices:
resources:
limits:
cpu: 900m
memory: 1200Mi
requests:
- cpu: 100m
+ cpu: 360m # Creates a 2.5:1 ratio, matching production
memory: 400Mi
Additional considerations:
-
The staging environment now has:
- 90% of production's CPU limits
- 67% of production's memory limits
This seems reasonable for a staging environment while maintaining similar characteristics.
-
Different autoscaling configurations:
- Staging: maxReplicas: 2, targetMemoryUtilizationPercentage: 80%
- Production: maxReplicas: 3, targetMemoryUtilizationPercentage: 70%
This aligns with expected staging vs production scaling needs.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
cpu: 900m | |
memory: 1200Mi | |
requests: | |
cpu: 100m | |
memory: 400Mi | |
cpu: 900m | |
memory: 1200Mi | |
requests: | |
cpu: 360m | |
memory: 400Mi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Psalmz777 share your opinion on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Description
[Increase memory for spatial microservice]
Related Issues
This microservice was assigned limited space during creation. this PR increases it. @Psalmz777 with regards to cost optimisation OKR. @sserurich has clarified that the mozilla funding should cater for this and incase of need for more. Mozilla is in position to provide more.
Changes Made
Additional Notes
[Add any additional notes or comments here]
Summary by CodeRabbit
airqo-spatial-api
application to enhance performance.airqo-stage-spatial-api
application.