[v2]: How to deal with node deletion/eviction #519

Open
WanzenBug opened this issue Aug 10, 2023 · 21 comments
Labels: enhancement (New feature or request), v2 (This affects only Operator v2)

Comments

@WanzenBug
Member

Currently, the implicit behaviour of the Operator is:

  • If a LinstorSatellite resource should no longer exist, for example because its node no longer matches the configured node labels, it is deleted.
  • If a LinstorSatellite resource is deleted, it is "finalized" by triggering node evacuation: either the node goes offline and is then marked "lost", or, if the node stays online, the Operator waits for all resources to be moved to other nodes.

This has already caused pain for users who were not expecting this behaviour. In particular, it makes it hard to "undelete" a satellite should that be desired.

@WanzenBug WanzenBug added the enhancement (New feature or request) and v2 (This affects only Operator v2) labels on Aug 10, 2023
@dimm0

dimm0 commented Aug 11, 2023

I can't delete a linstorsatellite at all, even if I shrink the satelliteset... Were you able to delete one?
They have a piraeus.io/satellite-protection finalizer. I could remove them manually, but what's the best way to delete the unused ones?

@WanzenBug
Member Author

Deleting the finalizer will remove them from the Operator's memory. You would then need to manually run linstor node lost <nodename> if the node still exists in LINSTOR to get it removed.
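
A minimal sketch of those two steps, assuming a satellite/node named node-1 and a linstor-controller Deployment in the piraeus-datastore namespace (adjust the names to your cluster):

  # Remove the protection finalizer so the Operator forgets the satellite
  kubectl patch linstorsatellite node-1 --type merge -p '{"metadata":{"finalizers":null}}'

  # If the node still exists in LINSTOR, remove it there as well
  kubectl exec -n piraeus-datastore deploy/linstor-controller -- linstor node lost node-1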

@dimm0

dimm0 commented Aug 17, 2023

They keep reappearing even after deleting the satellite and "node lost"

What's the normal way to delete a node/satellite/pod/etc?

@WanzenBug
Member Author

It will appear again, as long as:

  • The node exists in Kubernetes (kubectl get nodes)
  • The node is not excluded by the spec.nodeSelector on the LinstorCluster resource.
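
For reference, a minimal sketch of such a selector (the label key here is only an example); only nodes carrying the label get a LinstorSatellite:

  apiVersion: piraeus.io/v1
  kind: LinstorCluster
  metadata:
    name: linstorcluster
  spec:
    nodeSelector:
      example.com/linstor: "true"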

@dimm0

dimm0 commented Aug 18, 2023

Are these "and" or "or"?
The nodes are now excluded, but they still exist in the cluster.

@kkrick-sdsu

Currently also being affected by this issue.

@mfarley-sdsu

+1 on this issue.

@WanzenBug
Member Author

What does the LinstorCluster resource look like? Does it have a spec.nodeSelector set?

Also, can you show the labels on one of the affected nodes?
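
For example, something like this (the LinstorCluster is assumed to be named linstorcluster; substitute your own node name):

  kubectl get linstorcluster linstorcluster -o yaml
  kubectl get node <nodename> --show-labels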

@dimm0

dimm0 commented Aug 30, 2023

I did not have the nodeSelector in linstorcluster, only in satellitesets. Adding it to cluster and restarting all controllers seemed to help, thanks!
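
In case it helps others, a sketch of the restart step; the Deployment names and namespace are assumptions, adjust to your install:

  kubectl rollout restart -n piraeus-datastore deployment/piraeus-operator-controller-manager
  kubectl rollout restart -n piraeus-datastore deployment/linstor-controller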

@online01993

I did not have the nodeSelector in linstorcluster, only in satellitesets. Adding it to cluster and restarting all controllers seemed to help, thanks!

@dimm0 can you share your config, please?

@dimm0

dimm0 commented Aug 30, 2023

apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
  creationTimestamp: "2023-05-31T23:46:16Z"
  generation: 6
  name: linstorcluster
  resourceVersion: "6323433601"
  uid: d1eac6b0-150c-4068-95fc-f21dbff1dabd
spec:
  nodeSelector:
    nautilus.io/linstor: "true"
  patches:
  - patch: |-
      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        name: linstor-csi-node
      spec:
        template:
          spec:
            nodeSelector:
              nautilus.io/linstor: "true"
            tolerations:
            - effect: PreferNoSchedule
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/gpu-bad
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/large-gpu
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/testing
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/sdsu-fix
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/stashcache
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/sdsu-fix
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/nrp-testing
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/haosu
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/ceph
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/science-dmz
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/noceph
              operator: Exists
            - effect: NoSchedule
              key: drbd.linbit.com/lost-quorum
            - effect: NoSchedule
              key: drbd.linbit.com/force-io-error
    target:
      kind: DaemonSet
      name: linstor-csi-node
  - patch: |-
      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        name: ha-controller
      spec:
        template:
          spec:
            nodeSelector:
              nautilus.io/linstor: "true"
            tolerations:
            - effect: PreferNoSchedule
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/gpu-bad
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/large-gpu
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/testing
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/sdsu-fix
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/stashcache
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/nrp-testing
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/haosu
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/ceph
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/science-dmz
              operator: Exists
            - effect: NoSchedule
              key: nautilus.io/noceph
              operator: Exists
            - effect: NoSchedule
              key: drbd.linbit.com/lost-quorum
            - effect: NoSchedule
              key: drbd.linbit.com/force-io-error
    target:
      kind: DaemonSet
      name: ha-controller
status:
  conditions:
  - lastTransitionTime: "2023-08-30T04:57:04Z"
    message: Resources applied
    observedGeneration: 6
    reason: AsExpected
    status: "True"
    type: Applied
  - lastTransitionTime: "2023-08-30T04:40:58Z"
    message: 'Controller 1.24.1 (API: 1.20.1, Git: f7d71e7c5416c84d6cdb05f9b490d9b1e81622e2)
      reachable at ''http://linstor-controller.piraeus-datastore.svc:3370'''
    observedGeneration: 6
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2023-08-30T05:42:29Z"
    message: Properties applied
    observedGeneration: 6
    reason: AsExpected
    status: "True"
    type: Configured

@dimm0

dimm0 commented Aug 30, 2023

The nodes have nautilus.io/linstor: "true" label set

@online01993

The nodes have nautilus.io/linstor: "true" label set

@dimm0 can you do it with nodes that do not have the "node-role.kubernetes.io/control-plane" label?

@dimm0

dimm0 commented Aug 30, 2023

Do you want to exclude the master?
I don't think you can do it... nodeSelector doesn't have negative selection.
Basically you have to label all nodes except that one. Not too convenient...
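
A sketch of that labelling workaround, reusing the nautilus.io/linstor key from the config above: label every node that does not carry the control-plane role, and keep selecting on that label in spec.nodeSelector.

  kubectl label nodes -l '!node-role.kubernetes.io/control-plane' nautilus.io/linstor=true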

@online01993

Do you want to exclude the master? I don't think you can do it... nodeSelector doesn't have negative selection. Basically you have to label all nodes except that one. Not too convenient...

Yes, that's what I want. I want to exclude all masters, but I don't understand how to do it.

@dimm0

dimm0 commented Aug 30, 2023

Not possible with the current selector.
If it supported nodeAffinity, you'd be able to exclude nodes...
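
For comparison, this is roughly what such an exclusion looks like in standard Kubernetes nodeAffinity terms; whether and how the Operator would expose something like it is what the feature request below is about:

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.kubernetes.io/control-plane
            operator: DoesNotExist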

@WanzenBug
Member Author

If it supported nodeAffinity, you'd be able to exclude nodes...

Please open a feature request, should not be too hard to implement :)

I did not have the nodeSelector in linstorcluster, only in satellitesets. Adding it to cluster and restarting all controllers seemed to help, thanks!

Ok, so now you do not have any unexpected satellites anymore?

@dimm0

dimm0 commented Aug 30, 2023

Please open a feature request, should not be too hard to implement :)

Will do!

Ok, so now you do not have any unexpected satellites anymore?

Finally!!! :)

@online01993

Please open a feature request, should not be too hard to implement :)

Will do!
It's done!

@dimm0

dimm0 commented Aug 30, 2023

BTW, I don't fully understand how it currently creates the nodes.
If a pod can't be scheduled for some reason (it stays Pending) but the node matches the nodeSelector, it still becomes a node in LINSTOR.
Does that mean you implement the label selection yourselves?
Wouldn't it be better to get the list of running pods from Kubernetes and use those?

I'm just thinking that if you add nodeAffinity, you'll have to rely on k8s anyway.

@WanzenBug
Member Author

Yes, it's currently a (bad) reimplementation of the Kubernetes scheduler.

The reason is: we need to support every node having a slightly different Pod spec for the satellite. That is because the Pod spec for a satellite might be different based on:

  • The host OS (different DRBD loader).
  • Different storage pool configuration (The file pools may need different host path volumes).
  • Perhaps different overrides, such as user-applied patches.

All of these are features we think are useful, but they make it hard to use a normal DaemonSet, so we need to use raw Pods instead. Perhaps we can somehow reuse the logic of the kube-controller-manager when it creates DaemonSet Pods; I need to look into that.
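
To see the effect of this, one way is to compare the per-node satellite resources with the satellite Pods that actually got scheduled; the namespace and label selector here are assumptions, adjust to your deployment:

  kubectl get linstorsatellites
  kubectl get pods -n piraeus-datastore -l app.kubernetes.io/component=linstor-satellite -o wide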
