[v2]: How to deal with node deletion/eviction #519
I can't delete a linstorsatellite at all, even if I shrink the satelliteset... Were you able to delete one?
Deleting the finalizer will remove them from the operator's memory. You would then need to manually run `linstor node lost`.
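For anyone following along, a minimal sketch of those two steps (the node name is a placeholder, and the `linstor-controller` Deployment name and `piraeus-datastore` namespace are assumptions based on a default install):

```bash
# Remove the finalizer so the operator stops tracking the satellite
# (merge-patching finalizers to null clears them).
kubectl patch linstorsatellites.piraeus.io node-1 \
  --type merge -p '{"metadata":{"finalizers":null}}'

# Then tell LINSTOR the node is permanently gone.
kubectl exec -n piraeus-datastore deploy/linstor-controller -- \
  linstor node lost node-1
```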
They keep reappearing even after deleting the satellite and running "node lost". What's the normal way to delete a node/satellite/pod/etc.?
It will appear again, as long as:
Are these "and" or "or"?
Currently also being affected by this issue.
+1 on this issue.
What does your LinstorCluster resource look like? Then, can you show the labels on one of the affected nodes?
I did not have the nodeSelector in linstorcluster, only in satellitesets. Adding it to the cluster and restarting all controllers seemed to help, thanks!
@dimm0 can you share your config, please?
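For reference, adding a nodeSelector to the LinstorCluster looks roughly like this (a minimal sketch; the label key and value are illustrative, not taken from this thread):

```yaml
apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  name: linstorcluster
spec:
  # Satellites are only created on nodes carrying this label;
  # the key/value here are just an example.
  nodeSelector:
    example.com/linstor-node: "true"
```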
The nodes have the `node-role.kubernetes.io/control-plane` label.
@dimm0 can you do it with nodes that do not have the "node-role.kubernetes.io/control-plane" label?
Do you want to exclude the master?
Yes, that's what I want. I want to exclude all masters, but I don't understand how to do it.
Not possible with the current selector. Please open a feature request; it should not be too hard to implement :)
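For context, in plain Kubernetes terms that exclusion would be a nodeAffinity with a `DoesNotExist` operator, something like the following (a sketch of the standard Kubernetes construct, not of the operator's selector API at the time):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            # Match every node that does NOT carry the
            # control-plane role label.
            - key: node-role.kubernetes.io/control-plane
              operator: DoesNotExist
```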
Ok, so now you do not have any unexpected satellites anymore?
Will do!
Finally!!! :)
BTW, I don't fully understand how it currently creates the nodes. I'm just thinking that if you add nodeAffinity, you'll have to rely on k8s anyway.
Yes, it's currently a (bad) reimplementation of the Kubernetes scheduler. The reason is: we need to support every node having a slightly different Pod spec for the satellite. That is because the Pod spec for a satellite might be different based on:
All of these are features we think are useful, but they make it hard to use a normal DaemonSet, so we need to use raw Pods instead. Perhaps we can somehow reuse the logic of the kube-controller-manager when creating the DaemonSet Pods; I need to look into that.
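As an illustration of that per-node variation (a sketch based on my reading of the v2 LinstorSatelliteConfiguration CRD; the label and the override are made up):

```yaml
apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: fast-storage-nodes
spec:
  # Only satellites on matching nodes receive this override,
  # so two nodes can end up with different Pod specs.
  nodeSelector:
    example.com/storage-tier: fast
  podTemplate:
    spec:
      hostNetwork: true
```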
Currently, the implicit behaviour of the Operator is:
- If a LinstorSatellite resource should no longer exist, because it no longer matches the node labels, etc., delete it.
- If a LinstorSatellite resource is deleted, it is "finalized" by triggering node evacuation and then either waiting for the node to go offline and be declared "lost", or, if the node stays online, waiting for all resources to be moved to other nodes.
This has already caused some pain for users who were not expecting this behaviour. In particular, it makes it hard to "undelete" a satellite should that be desired.
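In LINSTOR CLI terms, the finalizer roughly corresponds to this manual sequence (a sketch; the node name is a placeholder):

```bash
# Move all resources off the node while it is still online.
linstor node evacuate node-1

# If the node is permanently offline instead, declare it lost.
linstor node lost node-1
```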