Unexpected periodical istiod scale down #915
Comments
Hi @vitalii-buchyn-exa. Is there something interesting in the operator's logs during these events? |
@LuciferInLove Hi, nothing that I can spot, at least. Here are the logs from a couple of minutes before and after the issue:
For the istiod deployment we see events like:
|
Thanks for the information. I'm trying to reproduce it, but with no success yet. Did you notice any changes in resources? For example, was the HPA resource changed? |
If it's possible, please try to run the operator with |
Sure, |
here is the log from 5 minutes before to 5 minutes after the issue (log timestamps are in GMT). |
The most interesting part:
Does the gateway address change every 12 hours? |
I can see a lot of such events in the log.
The events look like:
|
There shouldn't be any connection between the istiod deployment and the reconciling of istiomeshgateway deployments, should there? |
|
Here is an example of the IstioControlPlane CR: |
These settings can't affect such behavior. Otherwise, it would be reproducible everywhere, including our test envs. Unfortunately, there are no clues in the IstioControlPlane. I tried this ICP with some modifications according to my env, and there were no downscales for 16 hours. Another interesting string in the logs:
2023-04-18T00:00:20.039Z DEBUG controllers.IstioControlPlane.discovery resource diff {"name": "istiod-istio2", "namespace": "istio-system", "apiVersion": "apps/v1", "kind": "Deployment", "patch": "{\"spec\":{\"template\":{\"metadata\":{\"labels\":{\"exa_service\":\"ecp-inf-istio\"}}}}}"}
Could something be causing these periodic label changes? |
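To see at a glance which fields a "resource diff" entry touches, the `patch` value can be flattened into dotted paths. This is only a minimal Python sketch, assuming the JSON payload has been copied out of the log line quoted above; the helper name `changed_paths` is hypothetical and not part of the operator.

```python
import json

# JSON payload copied from the "resource diff" log entry quoted above.
LOG_PAYLOAD = r'''{"name": "istiod-istio2", "namespace": "istio-system", "apiVersion": "apps/v1", "kind": "Deployment", "patch": "{\"spec\":{\"template\":{\"metadata\":{\"labels\":{\"exa_service\":\"ecp-inf-istio\"}}}}}"}'''


def changed_paths(patch, prefix=""):
    """Flatten a nested patch dict into (dotted_path, value) pairs."""
    paths = []
    for key, value in patch.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects until a leaf value is reached.
            paths.extend(changed_paths(value, path))
        else:
            paths.append((path, value))
    return paths


entry = json.loads(LOG_PAYLOAD)
# The "patch" field is itself a JSON document encoded as a string.
patch = json.loads(entry["patch"])
for path, value in changed_paths(patch):
    print(f"{entry['kind']}/{entry['name']}: {path} = {value!r}")
```

For this entry the only differing field is `spec.template.metadata.labels.exa_service`, a label on the pod template of the `istiod-istio2` Deployment.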
@LuciferInLove you are a genius, Sir. There is indeed a label injector cronjob with schedule
Thank you for your time and help! |
@vitalii-buchyn-exa no problem. Feel free to reach out. |
Describe the bug
Hello community,
We are experiencing periodic, unexpected scale-downs of the istiod deployment, far below the HPA recommendation and below the minimum replica count.
It happens every 12 hours in our case.
This behaviour increases control plane traffic and therefore our outbound traffic costs.
If we stop the operator (scale its deployment down to 0 replicas), we don't see this behaviour.
Example of replicas spec
Please also find an example screenshot attached.
Please let me know if any additional info is required.
Thank you!
Expected behavior
Not to have such a drastic scale-down. A rollout restart can be used instead.
Screenshots
![image](https://user-images.githubusercontent.com/52450278/231436024-9f56316d-f989-4d0f-8aea-d2338cc15159.png)
Additional context
Platform 1.23.16-gke.1400
Operator version v2.16.1
Pilot version 1.15.3