
deploy: document NFD usage and test with it
We should encourage the use of NFD for labeling nodes because it is
simpler and more scalable. To ensure that it works, we set up the test
cluster accordingly.

The default node selector is the same as before, to avoid breaking
installations during an upgrade.
pohly committed Nov 13, 2020
1 parent 1ae1775 commit 3d63bf1
Showing 9 changed files with 69 additions and 12 deletions.
2 changes: 0 additions & 2 deletions deploy/common/pmem-app-block-volume.yaml
@@ -30,8 +30,6 @@ spec:
     volumeMounts:
     - name: data
       mountPath: /data
-  nodeSelector:
-    storage: pmem
   volumes:
   - name: my-csi-device
     persistentVolumeClaim:
5 changes: 4 additions & 1 deletion deploy/common/pmem-csi.intel.com_v1alpha1_deployment_cr.yaml
@@ -5,5 +5,8 @@ metadata:
 spec:
   deviceMode: "lvm"
   nodeSelector:
-    storage: "pmem"
+    # When using Node Feature Discovery (NFD):
+    feature.node.kubernetes.io/memory-nv.dax: "true"
+    # When using manual node labeling with that label:
+    # storage: pmem
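
Once the operator is running, this example CR can be applied as-is; a minimal sketch (using the file path changed above and assuming the CRD and operator are already installed):

``` console
$ kubectl create -f deploy/common/pmem-csi.intel.com_v1alpha1_deployment_cr.yaml
```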

7 changes: 5 additions & 2 deletions docs/autotest.md
@@ -56,8 +56,11 @@ virtual machines.
 The first node is the Kubernetes master without
 persistent memory.
 The other three nodes are worker nodes with one emulated 32GB NVDIMM each.
-After the cluster has been formed, `make start` adds `storage=pmem` label
-to the worker nodes and deploys the PMEM-CSI driver.
+After the cluster has been formed, `make start` installs [NFD](https://kubernetes-sigs.github.io/node-feature-discovery/stable/get-started/index.html) to label
+the worker nodes. The PMEM-CSI driver can be installed with
+`test/setup-deployment.sh`, but will also be installed as needed by
+the E2E test suite.
+
 Once `make start` completes, the cluster is ready for interactive use via
 `kubectl` inside the virtual machine. Alternatively, you can also
 set `KUBECONFIG` as shown at the end of the `make start` output
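Taken together, the resulting workflow might look like this; a sketch, assuming the make target and script behave as described above:

``` console
$ make start                # brings up the test cluster and installs NFD
$ test/setup-deployment.sh  # deploys the PMEM-CSI driver using the NFD label
```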
31 changes: 24 additions & 7 deletions docs/install.md
@@ -209,10 +209,25 @@ on the Kubernetes version.
 
 - **Label the cluster nodes that provide persistent memory device(s)**
 
+  PMEM-CSI manages PMEM on those nodes that have a certain label. For
+  historic reasons, the default in the YAML files and the operator
+  settings is to use a label `storage` with the value `pmem`.
+
+  Such a label can be set for each node manually with:
+
   ``` console
   $ kubectl label node <your node> storage=pmem
   ```
 
+  Alternatively, the [Node Feature
+  Discovery (NFD)](https://kubernetes-sigs.github.io/node-feature-discovery/stable/get-started/index.html)
+  add-on can be used to label nodes automatically. In that case, the
+  default PMEM-CSI node selector has to be changed to
+  `"feature.node.kubernetes.io/memory-nv.dax": "true"`. The operator has
+  the [`nodeSelector`
+  field](https://kubernetes-sigs.github.io/node-feature-discovery/stable/get-started/index.html)
+  for that. For the YAML files, a kustomize patch can be used.
+
 ### Install PMEM-CSI driver
 
 PMEM-CSI driver can be deployed to a Kubernetes cluster either using the
@@ -267,12 +282,12 @@ metadata:
 spec:
   deviceMode: lvm
   nodeSelector:
-    storage: pmem
+    feature.node.kubernetes.io/memory-nv.dax: "true"
 EOF
 ```
 
 This uses the same `pmem-csi.intel.com` driver name as the YAML files
-in [`deploy`](/deploy) and the node label from the [hardware
+in [`deploy`](/deploy) and the node label created by NFD (see the [hardware
 installation and setup section](#installation-and-setup).
 
 Once the above deployment installation is successful, we can see all the driver
@@ -448,13 +463,16 @@ verify that the node labels have been configured correctly
 $ kubectl get nodes --show-labels
 ```
 
-The command output must indicate that every node with PMEM has these two labels:
+The command output must indicate that every node with PMEM has at least two labels:
 ``` console
 pmem-csi.intel.com/node=<NODE-NAME>,storage=pmem
 ```
 
-If **storage=pmem** is missing, label manually as described above. If
-**pmem-csi.intel.com/node** is missing, then double-check that the
+**storage=pmem** is the label that has to be added manually as
+described above. When using NFD, the node should have the
+`feature.node.kubernetes.io/memory-nv.dax=true` label.
+
+If **pmem-csi.intel.com/node** is missing, then double-check that the
 alpha feature gates are enabled, that the CSI driver is running on the node,
 and that the driver's log output doesn't contain errors.
 
@@ -486,8 +504,7 @@ pmem-csi-pvc-xfs   Bound   pvc-f7101fd2-6b36-11e9-bf09-deadbeef0100   4Gi
 $ kubectl create -f deploy/common/pmem-app.yaml
 ```
 
-These applications use **storage: pmem** in the <i>nodeSelector</i>
-list to ensure scheduling to a node supporting pmem device, and each requests a mount of a volume,
+These applications each request a mount of a volume,
 one with ext4-format and another with xfs-format file system.
 
 - **Verify two application pods reach 'Running' status**
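Such a kustomize patch could look like the following; a minimal sketch, assuming a `my-deployment` overlay directory and a deploy variant under `deploy/` (the `pmem-csi-node` DaemonSet target and the JSON patch mirror `test/setup-deployment.sh` below):

``` console
$ mkdir my-deployment
$ cat >my-deployment/kustomization.yaml <<EOF
resources:
# Assumption: adjust to the deploy variant matching the cluster.
- ../deploy/kubernetes-1.19
patchesJson6902:
- target:
    group: apps
    version: v1
    kind: DaemonSet
    name: pmem-csi-node
  path: node-label-patch.yaml
EOF
$ cat >my-deployment/node-label-patch.yaml <<EOF
- op: add
  path: /spec/template/spec/nodeSelector
  value:
    feature.node.kubernetes.io/memory-nv.dax: "true"
EOF
$ kubectl apply -k my-deployment
```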
4 changes: 4 additions & 0 deletions test/e2e/deploy/deploy.go
@@ -932,6 +932,10 @@ func (d *Deployment) GetDriverDeployment() api.Deployment {
 			// PMEM must be used for LVM, otherwise other tests cannot
 			// run after the LVM driver was deployed once.
 			PMEMPercentage: 50,
+			NodeSelector: map[string]string{
+				// Provided by NFD.
+				"feature.node.kubernetes.io/memory-nv.dax": "true",
+			},
 		},
 	}
 }
4 changes: 4 additions & 0 deletions test/e2e/operator/deployment_api.go
@@ -385,6 +385,10 @@ var _ = deploy.DescribeForSome("API", func(d *deploy.Deployment) bool {
 		Spec: api.DeploymentSpec{
 			DeviceMode:     from,
 			PMEMPercentage: 50,
+			NodeSelector: map[string]string{
+				// Provided by NFD.
+				"feature.node.kubernetes.io/memory-nv.dax": "true",
+			},
 		},
 	}

16 changes: 16 additions & 0 deletions test/setup-deployment.sh
@@ -157,6 +157,22 @@ EOF
           value: "--pmemPercentage=50"
 EOF
     fi
+
+    # Always use the configured label for selecting nodes.
+    ${SSH} "cat >>'$tmpdir/my-deployment/kustomization.yaml'" <<EOF
+- target:
+    group: apps
+    version: v1
+    kind: DaemonSet
+    name: pmem-csi-node
+  path: node-label-patch.yaml
+EOF
+    ${SSH} "cat >>'$tmpdir/my-deployment/node-label-patch.yaml'" <<EOF
+- op: add
+  path: /spec/template/spec/nodeSelector
+  value:
+    {$(echo "${TEST_PMEM_NODE_LABEL}" | sed -e 's/\(.*\)=\(.*\)/\1: "\2"/')}
+EOF
     ;;
     scheduler)
         # Change port number via JSON patch.
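The `sed` expression in the node-label patch converts the `label=value` string from `TEST_PMEM_NODE_LABEL` into the `label: "value"` form needed inside the YAML flow mapping, for example:

``` console
$ echo "feature.node.kubernetes.io/memory-nv.dax=true" | sed -e 's/\(.*\)=\(.*\)/\1: "\2"/'
feature.node.kubernetes.io/memory-nv.dax: "true"
```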
8 changes: 8 additions & 0 deletions test/start-kubernetes.sh
@@ -404,11 +404,19 @@ function init_kubernetes_cluster() (
     scp $SSH_ARGS ${CLOUD_USER}@${master_ip}:.kube/config $KUBECONFIG || die "failed to copy Kubernetes config file"
     export KUBECONFIG=${KUBECONFIG}
 
+    # Install NFD and let it label all nodes with "feature.node.kubernetes.io/memory-nv.dax: true".
+    NFD_VERSION=v0.6.0
+    ssh $SSH_ARGS ${CLOUD_USER}@${master_ip} kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/$NFD_VERSION/nfd-master.yaml.template
+    ssh $SSH_ARGS ${CLOUD_USER}@${master_ip} kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/$NFD_VERSION/nfd-worker-daemonset.yaml.template
+
     #get kubernetes join token
     join_token=$(ssh $SSH_ARGS ${CLOUD_USER}@${master_ip} "$ENV_VARS kubeadm token create --print-join-command") || die "could not get kubeadm join token"
     pids=""
     for ip in ${workers_ip}; do
 
+        # storage=pmem is set *only* for version skew testing and PMEM-CSI deployments < 0.9.0.
+        # Those still need that label. "kubectl label" can be removed once we stop testing
+        # against such old releases.
         vm_name=$(govm list -f '{{select (filterRegexp . "IP" "'${ip}'") "Name"}}') || die "could not find VM name for $ip"
         log_name=${CLUSTER_DIRECTORY}/${vm_name}.log
         ( ssh $SSH_ARGS ${CLOUD_USER}@${ip} "set -x; $ENV_VARS sudo ${join_token/kubeadm/kubeadm --ignore-preflight-errors=SystemVerification}" &&
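After `make start`, whether NFD labeled the workers can be verified from outside the cluster; a sketch, not part of this commit:

``` console
$ kubectl get nodes -l feature.node.kubernetes.io/memory-nv.dax=true
```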
4 changes: 4 additions & 0 deletions test/test-config.sh
@@ -147,6 +147,10 @@ fi
 # match the directory suffix, i.e. start with a hyphen.
 : ${TEST_KUBERNETES_FLAVOR:=}
 
+# The label and its value that identify the nodes with PMEM.
+# The default is the label set by NFD.
+: ${TEST_PMEM_NODE_LABEL:=feature.node.kubernetes.io/memory-nv.dax=true}
+
 # Kubernetes feature gates to enable/disable.
 # EndpointSlice is disabled because of https://github.com/kubernetes/kubernetes/issues/91287 (Kubernetes
 # < 1.19) and because there were random connection failures to node ports during sanity
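Because the variable holds the complete `label=value` pair, a test run against a manually labeled cluster only needs an override; a sketch:

``` console
$ TEST_PMEM_NODE_LABEL=storage=pmem test/setup-deployment.sh
```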
