BUG: dapr-control-plane OOMKilled when DaprInstance provisioned #135
Comments
To change the resource requests and limits, the only option is to tweak the subscription #77 (comment). Unfortunately the memory cannot be made configurable, but I will dig into the memory consumption. Do you have a way to reproduce it? I never experienced such behavior.
All we did was execute the steps above, and that reproduced it. I don't think the dapr-control-plane would be affected by any existing pods that had Dapr annotations for sidecar injection, but maybe you can correct me if I am wrong. We did have a number of pods with the annotations running during the initialization of the DaprInstance. Do you have an example of how we could use the subscription to tweak the requests and limits for the dapr-control-plane? Or am I mistaken about what you mean?
It should not, as what gets affected is the dapr-operator and the other Dapr resources; the dapr-control-plane only generates the manifests. Maybe the watcher watches too many objects. I'll have a look.
No, I don't, but there are a number of examples in the documentation mentioned in the linked comment.
I've tried to reproduce the issue but I've failed.
But the operator works as expected and does not get OOMKilled:
I don't have any Dapr application running, so it is not 100% the same test, but as far as the dapr-kubernetes-operator is concerned, it should not matter.
OK, we are going to look into OLM and see if we can adjust the resources of the dapr-control-plane. While we are doing that, I am curious to know whether the dapr-control-plane being killed will cause any issues. In our case, so far we see the components in place and the CRDs were deployed (permission issues still exist #136), and we are using the Dapr components without issues so far. What are your thoughts on this?
It should not cause any issue, as the role of the operator is just to set up Dapr and make sure the setup stays in sync with the DaprInstance spec.
Hi @lburgazzoli. Using subscriptions in OLM, we were able to stabilize the dapr-control-plane pod. Here is the subscription we used, for future reference if others run into this issue.
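The manifest referenced above was not preserved in this export. As a sketch only: OLM exposes resource overrides through `spec.config.resources` on a Subscription, so the fix likely looked roughly like the following (the channel, source, namespace, and resource values here are assumptions, not the reporter's actual values):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: dapr-kubernetes-operator        # name assumed
  namespace: openshift-operators        # namespace assumed
spec:
  channel: alpha                        # channel assumed
  name: dapr-kubernetes-operator
  source: community-operators           # catalog source assumed
  sourceNamespace: openshift-marketplace
  config:
    # OLM propagates these to the operator Deployment's containers,
    # overriding the defaults shipped in the CSV.
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi
```

OLM applies the `config.resources` stanza to the operator's Deployment, which is why this works even though the CSV itself does not make memory configurable.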
As a side note, this did not resolve the propagation of the roles. We still need an admin to manually create roles for us to use these CRDs.
@ryorke1 I would really love to be able to reproduce it so I can fix the real problem (which may just be about increasing the memory), so if at any point you have a sort of reproducer, please let me know.
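For anyone hitting the same role-propagation gap: manually granting a team access to the Dapr CRDs is typically done with a namespaced Role and RoleBinding along these lines (the names, namespace, group, and exact resource list below are illustrative assumptions):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dapr-crd-editor          # name assumed
  namespace: my-app-namespace    # namespace assumed
rules:
  # Grant CRUD access to the Dapr custom resources in this namespace.
  - apiGroups: ["dapr.io"]
    resources: ["components", "subscriptions", "configurations", "resiliencies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dapr-crd-editor
  namespace: my-app-namespace
subjects:
  - kind: Group
    name: my-dev-team            # group assumed
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dapr-crd-editor
  apiGroup: rbac.authorization.k8s.io
```

A cluster admin has to apply this once per namespace; it is a workaround for, not a fix of, the propagation issue tracked in #136.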
Expected Behavior
dapr-control-plane pod should remain stable and have configurable resource limits and requests.
Current Behavior
The dapr-control-plane pod is continuously OOMKilled as long as a DaprInstance exists; if we remove the DaprInstance, the pod stabilizes. The dapr-control-plane pod does survive long enough to deploy the DaprInstance pods and CRDs, but it takes a few OOMKills to complete. The pod continues to crash afterwards, though this does not seem to affect the Dapr components.
Possible Solution
Steps to Reproduce
Environment
OpenShift: Red Hat OpenShift Container Platform 4.12
Dapr Operator: 0.0.8 with 1.13.2 Dapr components