Fix removeUntaggedEc2: check iit-billing-tag before deleting EKS clusters #3587
Summary
Fixes a critical logic bug in the `removeUntaggedEc2` Lambda where EKS clusters with valid billing tags were being deleted.

Problem
The Lambda was deleting EKS clusters that didn't match the skip pattern (`pe-.*`) without checking whether they had valid billing tags.

Old (broken) logic:
- Cluster does not match `pe-.*` → Delete entire cluster (ignored billing tag)

New (fixed) logic:
- Valid `iit-billing-tag` → Keep it (billing tag trumps everything)
- Matches `pe-.*` → Keep it (protected cluster)

Changes
1. New function: `has_valid_billing_tag()`

Validates `iit-billing-tag` values with support for:
- `"pmm-staging"`, `"jenkins-pmm-slave"` → Valid
- `"1759837138"` → Valid only if the timestamp is in the future (UTC)

2. Updated: `is_eks_managed_instance()`

New priority order:
- `pe-.*`, skip (protected cluster)

3. Updated: `is_instance_to_terminate()`

Uses `has_valid_billing_tag()` for consistent validation across EKS and regular EC2.

Test Coverage
Comprehensive test suite with 12 scenarios in `test_removeUntaggedEc2_logic.py`:

Regular EC2:

EKS - Protected (`pe-*`):
- `pe-crossplane`, no tag → Kept (skip pattern)
- `pe-infra`, with tag → Kept (billing tag)

EKS - Non-Protected:
- `pmm-test`, with tag → Kept (billing tag)
- `pmm-temp`, with future timestamp → Kept (billing tag)
- `pmm-ha`, no tag → Cluster deleted
- `pmm-expired`, with expired timestamp → Cluster deleted

Run tests:
python3 cloud/aws-functions/test_removeUntaggedEc2_logic.py
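
The EKS scenarios listed above can be illustrated with a minimal sketch of the decision order described in this PR. It is not the code from `removeUntaggedEc2.py`: the signature of `has_valid_billing_tag()`, the flat `tags` dict, the injectable `now` parameter, and the helper `should_delete_cluster()` are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Skip pattern quoted in the PR description.
PROTECTED_PATTERN = re.compile(r"pe-.*")


def has_valid_billing_tag(tags, now=None):
    """Sketch: True if iit-billing-tag is a non-empty string,
    or a numeric Unix timestamp that is still in the future (UTC).

    Assumption: `tags` is a plain dict of tag name -> tag value.
    """
    now = now or datetime.now(timezone.utc)
    value = tags.get("iit-billing-tag", "").strip()
    if not value:
        return False
    if re.fullmatch(r"[0-9]+", value):
        # All-digit values are treated as expiry timestamps in seconds.
        return int(value) > now.timestamp()
    # Any other non-empty value (e.g. "pmm-staging") counts as valid.
    return True


def should_delete_cluster(cluster_name, tags, now=None):
    """Hypothetical helper: billing tag is checked first, then the skip pattern."""
    if has_valid_billing_tag(tags, now):
        return False  # billing tag trumps everything
    if PROTECTED_PATTERN.match(cluster_name):
        return False  # protected cluster
    return True  # no valid tag and not protected
```

Passing `now` explicitly keeps the timestamp check deterministic in tests, which fits a suite that needs no AWS connection.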
Behavior Summary
Regular EC2:
EKS Instances:
- `pe-.*` → Skip (protected cluster)

Key Points:
- `pe-.*` whitelists Platform Engineering clusters

Files Changed
- `cloud/aws-functions/removeUntaggedEc2.py` - Fixed Lambda function
- `cloud/aws-functions/test_removeUntaggedEc2_logic.py` - Test suite (no AWS connection needed)
- `cloud/aws-functions/README_removeUntaggedEc2.md` - Documentation

Deployment
Before deploying, run validation:
Deploy:
cd cloud/aws-functions
zip removeUntaggedEc2.zip removeUntaggedEc2.py
aws lambda update-function-code \
  --function-name removeUntaggedEc2 \
  --zip-file fileb://removeUntaggedEc2.zip \
  --region eu-west-1 \
  --profile percona-dev-admin