An operator that exposes the per-NUMA-zone compute resources using the Resource Topology Exporter (RTE). The operator also takes care of deploying the Node Resource Topology API, on which the RTE depends to provide its data. The operator provides minimal support for deploying secondary schedulers.
The currently recommended way of deploying the operator in your cluster is using OLM. OLM greatly simplifies webhook management, which the operator requires. Assuming you can push container images to a container registry and you are in the root directory of this project, a deployment flow can look like:
- set the environment variables as needed. You will most likely need to override `VERSION`, `REPO`, and `CONTAINER_ENGINE`
- build and upload the operator container image: `make container-build container-push`
- build and upload the manifest bundle container image: `make bundle bundle-build bundle-push`
- leverage `operator-sdk` to deploy the container: `operator-sdk run bundle ${REPO}/numaresources-operator-bundle:${VERSION}`. Note the build procedure typically downloads a local copy of `operator-sdk` in `bin/`, which you can reuse
For further details, please refer to the operator-sdk documentation
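The deployment flow above can be collected into a small script. This is only a sketch: the default values for `VERSION`, `REPO`, and `CONTAINER_ENGINE` are placeholders, and the `run` and `deploy` helpers and the `DRY_RUN` switch are hypothetical names introduced for illustration, not part of this project. It assumes the Makefile picks up these variables from the environment.

```shell
# Placeholder values; override them from the environment as needed.
VERSION="${VERSION:-0.0.1-dev}"                 # hypothetical version tag
REPO="${REPO:-quay.io/example}"                 # hypothetical registry/namespace
CONTAINER_ENGINE="${CONTAINER_ENGINE:-podman}"  # assumption: podman is available
export VERSION REPO CONTAINER_ENGINE

# run: print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# deploy: the three steps from the list above, stopping at the first failure.
deploy() {
  run make container-build container-push &&
  run make bundle bundle-build bundle-push &&
  run operator-sdk run bundle "${REPO}/numaresources-operator-bundle:${VERSION}"
}
```

By default `deploy` only prints the commands it would run; calling `DRY_RUN=0 deploy` from the root directory of the project executes them. If `operator-sdk` is not in your `PATH`, the copy downloaded into `bin/` by the build can be used instead.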
Please check the issues section for the known issues and limitations of the NUMA resources operator.
NRT objects account only for exclusively allocated CPUs. For a pod to be allocated exclusive CPUs, it must belong to the Guaranteed QoS class (request equal to limit) and its CPU request must be an integer. Therefore, CPUs in the shared pool, which serve pods in the BestEffort or Burstable QoS classes as well as Guaranteed pods with a non-integral CPU request, are not accounted for in the NRT objects. Please refer to the CPU Manager documentation for more detail.
In addition, the PodResources API is used to extract resource information from the kubelet for resource accounting. The CPUs exposed by the List endpoint of the PodResources API are the exclusive CPUs allocated to a particular container. CPUs that belong to the shared pool are therefore not exposed by this API.
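The accounting rule described above can be illustrated with a small check. This is a simplified sketch, not the operator's code: the helper names `qualifies_for_exclusive_cpus` and `describe_cpu_accounting` are hypothetical, only the CPU request and limit of a single container are considered (expressed in millicores), and the full Guaranteed QoS rules, which also cover memory and every container in the pod, are not modelled.

```shell
# Succeeds when a container's CPU request/limit (in millicores) meets the
# conditions stated above: request equal to limit, and an integral number of CPUs.
qualifies_for_exclusive_cpus() {
  qe_request_m="$1"
  qe_limit_m="$2"
  [ "$qe_request_m" -gt 0 ] &&
  [ "$qe_request_m" -eq "$qe_limit_m" ] &&
  [ $((qe_request_m % 1000)) -eq 0 ]
}

# Prints whether the CPUs of such a container would show up in NRT accounting.
describe_cpu_accounting() {
  if qualifies_for_exclusive_cpus "$1" "$2"; then
    echo "exclusive"
  else
    echo "shared"
  fi
}
```

For example, `describe_cpu_accounting 2000 2000` reports `exclusive`, while `describe_cpu_accounting 1500 1500` (non-integral request) and `describe_cpu_accounting 1000 2000` (request below limit) both report `shared`, meaning those CPUs are not reflected in the NRT objects.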
The NUMA resources operator comes with a growing e2e suite to validate the components of the stack (the operator proper, RTE) as well as NUMA-aware scheduling as a whole. Pre-built container images including the suites are available. These e2e test images are not supported, and they are recommended only for development/CI purposes.
See `README.tests.md` for detailed instructions about how to run the suite.
See `tests/e2e/serial/README.md` for fine details about the suite and developer instructions.