Commit 00913cb

kquinn1204 authored and aireilly committed
TELCODOCS-1506 Telco CORE Reference Design Specification
ref config updates Adding generated YAML + modules from the RDS Reorganizing TOC Adding latest RDS updates RDS doc updates RDS YAML updates Shanes review latest updates + new components overview diagram David J's review comments CCS attributes update Ian's review comments adding Hari's deviations update updating 4.14 link URLs Generalizes scope and deviation topics for RAN and Core updates for RDS prod version update deviations wording Updating deviation and scope topics consolidate dev and scope topics + intro Adding link to ztp-site-generate procedure typo final changes for RDS typos Ian's comments update for RDS terminology remove core CRs note add GH core CRs link
1 parent 8c545d6 commit 00913cb

165 files changed

+4374
-46
lines changed

.vale/styles/Vocab/OpenShiftDocs/accept.txt

Lines changed: 5 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -7,21 +7,24 @@
77
[Mm]idhaul
88
[Pp]assthrough
99
[Pp]ostinstall
10+
[Pp]recaching
1011
[Pp]reinstall
1112
[Rr]ealtime
1213
[Tt]elco
1314
Assisted Installer
1415
Control Plane Machine Set Operator
1516
custom resources?
17+
GHz
1618
gpsd
1719
gpspipe
20+
hyperthreads?
21+
KPIs?
1822
linuxptp
1923
Mbps
24+
MBps
2025
Mellanox
2126
MetalLB
2227
NICs?
23-
Operator
24-
Operators
2528
Operators?
2629
pmc
2730
ubxtool

_attributes/common-attributes.adoc

Lines changed: 10 additions & 0 deletions
@@ -128,6 +128,16 @@ endif::[]
128128
:TempoShortName: distributed tracing platform (Tempo)
129129
:TempoOperator: Tempo Operator
130130
:TempoVersion: 2.3.0
131+
//telco
132+
ifdef::telco-ran[]
133+
:rds: telco RAN DU
134+
:rds-caps: Telco RAN DU
135+
:rds-first: Telco RAN distributed unit (DU)
136+
endif::[]
137+
ifdef::telco-core[]
138+
:rds: telco core
139+
:rds-caps: Telco core
140+
endif::[]
131141
//logging
132142
:logging-title: logging subsystem for Red Hat OpenShift
133143
:logging-title-uc: Logging subsystem for Red Hat OpenShift

_distro_map.yml

Lines changed: 1 addition & 1 deletion
@@ -231,7 +231,7 @@ openshift-dpu:
231231
name: '4.10'
232232
dir: container-platform-dpu/4.10
233233
openshift-telco:
234-
name: OpenShift Container Platform for Telco
234+
name: OpenShift Container Platform
235235
author: OpenShift Documentation Project <[email protected]>
236236
site: commercial
237237
site_name: Documentation

_topic_maps/_topic_map.yml

Lines changed: 25 additions & 3 deletions
@@ -2875,10 +2875,32 @@ Name: Reference design specifications
28752875
Dir: telco_ref_design_specs
28762876
Distros: openshift-telco
28772877
Topics:
2878-
- Name: Telco RAN reference design specification
2879-
File: ztp-ran-reference-design
2878+
- Name: Telco reference design specifications
2879+
File: telco-ref-design-specs-overview
2880+
- Name: Telco RAN DU reference design specification
2881+
Dir: ran
2882+
Topics:
2883+
- Name: Telco RAN DU reference design overview
2884+
File: telco-ran-ref-design-spec
2885+
- Name: Telco RAN DU use model overview
2886+
File: telco-ran-du-overview
2887+
- Name: RAN DU reference design components
2888+
File: telco-ran-ref-du-components
2889+
- Name: RAN DU reference design configuration CRs
2890+
File: telco-ran-ref-du-crs
2891+
- Name: Telco RAN DU software specifications
2892+
File: telco-ran-ref-software-artifacts
28802893
- Name: Telco core reference design specification
2881-
File: cnf-core-reference-design
2894+
Dir: core
2895+
Topics:
2896+
- Name: Telco core reference design overview
2897+
File: telco-core-rds-overview
2898+
- Name: Telco core use model overview
2899+
File: telco-core-rds-use-cases
2900+
- Name: Core reference design components
2901+
File: telco-core-ref-design-components
2902+
- Name: Core reference design configuration CRs
2903+
File: telco-core-ref-crs
28822904
---
28832905
Name: Specialized hardware and driver enablement
28842906
Dir: hardware_enablement

modules/cnf-deploying-the-numa-aware-scheduler-with-manual-performance-settings.adoc

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ metadata:
7878
name: numaresourcesscheduler
7979
spec:
8080
imageSpec: "registry.redhat.io/openshift4/noderesourcetopology-scheduler-container-rhel8:v{product-version}"
81-
cacheResyncPeriod: "5s" <1>
81+
cacheResyncPeriod: "5s" <1>
8282
----
8383
<1> Enter an interval value in seconds for synchronization of the scheduler cache. A value of `5s` is typical for most implementations.
8484
+

modules/cnf-performing-end-to-end-tests-running-cyclictest.adoc

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ registry.redhat.io/openshift4/cnf-tests-rhel8:v{product-version} \
3939
/usr/bin/test-run.sh -ginkgo.v -ginkgo.focus="cyclictest"
4040
----
4141
+
42-
The command runs the `cyclictest` tool for 10 minutes (600 seconds). The test runs successfully when the maximum observed latency is lower than `MAXIMUM_LATENCY` (in this example, 20 μs). Latency spikes of 20 μs and above are generally not acceptable for telco RAN workloads.
42+
The command runs the `cyclictest` tool for 10 minutes (600 seconds). The test runs successfully when the maximum observed latency is lower than `MAXIMUM_LATENCY` (in this example, 20 μs). Latency spikes of 20 μs and above are generally not acceptable for {rds} workloads.
4343
+
4444
If the results exceed the latency threshold, the test fails.
4545
+
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * telco_ref_design_specs/ran/telco-ran-ref-design-spec.adoc
4+
5+
:_mod-docs-content-type: CONCEPT
6+
[id="telco-core-whats-new-ref-design_{context}"]
7+
= {product-title} {product-version} features for {rds}
8+
9+
The following features in {product-title} {product-version} that are used by the {rds} reference design specification (RDS) have been added or updated.
10+
11+
.New features for {rds} in {product-title} {product-version}
12+
[cols="1,3", options="header"]
13+
|====
14+
|Feature
15+
|Description
16+
17+
//CNF-7349 Rootless DPDK pods
18+
|Support for running rootless Data Plane Development Kit (DPDK) workloads with kernel access by using the TAP CNI plugin
19+
a|DPDK applications that inject traffic into the kernel can run in non-privileged pods with the help of the TAP CNI plugin.
20+
21+
* link:https://docs.openshift.com/container-platform/4.14/networking/hardware_networks/using-dpdk-and-rdma.html#nw-running-dpdk-rootless-tap_using-dpdk-and-rdma[Using the TAP CNI to run a rootless DPDK workload with kernel access]
22+
23+
//CNF-5977 Better pinning of the networking stack
24+
|Dynamic use of non-reserved CPUs for OVS
25+
a|With this release, the Open vSwitch (OVS) networking stack can dynamically use non-reserved CPUs.
26+
The dynamic use of non-reserved CPUs occurs by default in performance-tuned clusters with a CPU manager policy set to `static`.
27+
The dynamic use of available, non-reserved CPUs maximizes compute resources for OVS and minimizes network latency for workloads during periods of high demand.
28+
OVS cannot use isolated CPUs assigned to containers in `Guaranteed` QoS pods. This separation avoids disruption to critical application workloads.
29+
30+
//CNF-7760
31+
|Enabling more control over the C-states for each pod
32+
a|The `PerformanceProfile` CR supports the `perPodPowerManagement` workload hint, which provides more control over the C-states for pods. Instead of disabling C-states completely, you can now specify a maximum latency in microseconds for C-states. You configure this option in the `cpu-c-states.crio.io` annotation. Enabling some of the shallower C-states, rather than disabling all of them, helps to optimize power savings for high-priority applications.
33+
34+
* link:https://docs.openshift.com/container-platform/4.14/scalability_and_performance/cnf-low-latency-tuning.html#node-tuning-operator-pod-power-saving-config_cnf-master[Optional: Power saving configurations]
35+
36+
//CNF-7741 Permit to disable NUMA Aware scheduling hints based on SR-IOV VFs
37+
|Exclude SR-IOV network topology for NUMA-aware scheduling
38+
a|You can exclude advertising Non-Uniform Memory Access (NUMA) nodes for the SR-IOV network to the Topology Manager. By not advertising NUMA nodes for the SR-IOV network, you can permit more flexible SR-IOV network deployments during NUMA-aware pod scheduling.
39+
40+
For example, in some scenarios, you want flexibility for how a pod is deployed. By not providing a NUMA node hint to the Topology Manager for the pod's SR-IOV network resource, the Topology Manager can deploy the SR-IOV network resource and the pod CPU and memory resources to different NUMA nodes. In previous {product-title} releases, the Topology Manager attempted to place all resources on the same NUMA node.
41+
42+
* link:https://docs.openshift.com/container-platform/4.14/networking/hardware_networks/configuring-sriov-device.html#nw-sriov-exclude-topology-manager_configuring-sriov-device[Exclude the SR-IOV network topology for NUMA-aware scheduling]
43+
44+
//CNF-8035 MetalLB VRF Egress interface selection with VRFs (Tech Preview)
45+
|Egress service resource to manage egress traffic for pods behind a load balancer (Technology Preview)
46+
a|With this update, you can use an `EgressService` custom resource (CR) to manage egress traffic for pods behind a load balancer service.
47+
48+
You can use the `EgressService` CR to manage egress traffic in the following ways:
49+
50+
* Assign the load balancer service's IP address as the source IP address of egress traffic for pods behind the load balancer service.
51+
52+
* Route the egress traffic for pods behind a load balancer to a network other than the default node network.
53+
54+
* link:https://docs.openshift.com/container-platform/4.14/networking/ovn_kubernetes_network_provider/configuring-egress-traffic-for-vrf-loadbalancer-services.html#configuring-egress-traffic-loadbalancer-services[Configuring an egress service]
55+
56+
|====
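
The following sketch illustrates how three of the features in the preceding table might be configured. It is a minimal example under stated assumptions, not a validated reference configuration. All resource names, namespaces, the NIC name `ens5f0`, the container image, the `max_latency:10` value, and the routing table value `"2"` are hypothetical and chosen only for illustration. Verify each field against the linked procedures before use.

[source,yaml]
----
# Pod that caps C-state exit latency instead of disabling C-states.
# Assumes a PerformanceProfile named "example-profile" with
# perPodPowerManagement enabled (hypothetical).
apiVersion: v1
kind: Pod
metadata:
  name: example-power-aware-pod
  annotations:
    cpu-c-states.crio.io: "max_latency:10" # illustrative value in microseconds
spec:
  runtimeClassName: performance-example-profile
  containers:
  - name: app
    image: registry.example.com/example/app:latest
    resources: # equal integer CPU requests and limits give Guaranteed QoS
      requests: {cpu: "2", memory: 1Gi}
      limits: {cpu: "2", memory: 1Gi}
---
# SR-IOV policy that does not advertise its NUMA node to the Topology Manager.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: example-policy
  namespace: openshift-sriov-network-operator
spec:
  resourceName: examplenics
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 4
  nicSelector:
    pfNames: ["ens5f0"]
  deviceType: netdevice
  excludeTopology: true
---
# Egress service (Technology Preview). The name and namespace are assumed
# to match an existing LoadBalancer service.
apiVersion: k8s.ovn.org/v1
kind: EgressService
metadata:
  name: example-lb-service
  namespace: example-ns
spec:
  sourceIPBy: "LoadBalancerIP" # use the load balancer IP as the egress source IP
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""
  network: "2" # hypothetical routing table; omit to use the default node network
----

The `sourceIPBy` and `network` fields correspond to the two uses of the `EgressService` CR that are listed in the table.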
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * telco_ref_design_specs/core/telco-core-ref-design-components.adoc
4+
5+
:_mod-docs-content-type: REFERENCE
6+
[id="telco-core-cluster-network-operator_{context}"]
7+
= Cluster Network Operator (CNO)
8+
9+
New in this release::
10+
11+
Not applicable.
12+
13+
Description::
14+
15+
The CNO deploys and manages the cluster network components including the default OVN-Kubernetes network plugin during {product-title} cluster installation. It allows configuring primary interface MTU settings, OVN gateway modes to use node routing tables for pod egress, and additional secondary networks such as MACVLAN.
16+
+
17+
To support network traffic segregation, multiple network interfaces are configured through the CNO. Traffic steering to these interfaces is configured through static routes applied by using the NMState Operator. To ensure that pod traffic is properly routed, OVN-Kubernetes is configured with the `routingViaHost` option enabled. With this setting, pod egress traffic uses the kernel routing table and the applied static routes rather than OVN.
18+
+
19+
The Whereabouts CNI plugin is used to provide dynamic IPv4 and IPv6 addressing for additional pod network interfaces without the use of a DHCP server.
20+
21+
Limits and requirements::
22+
23+
* OVN-Kubernetes is required for IPv6 support.
24+
* Large MTU cluster support requires connected network equipment to be set to the same or larger value.
25+
26+
Engineering considerations::
27+
* Pod egress traffic is handled by the kernel routing table when the `routingViaHost` option is enabled. Appropriate static routes must be configured on the host.
28+
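
The routing behavior described above can be sketched with two CRs: the cluster `Network` operator configuration with `routingViaHost` enabled, and a `NodeNetworkConfigurationPolicy` that applies a static route through the NMState Operator. This is a minimal illustration, not the reference configuration. The policy name, the interface name `ens5f0`, the node selector, and the documentation-range IP addresses are assumptions.

[source,yaml]
----
# Route pod egress traffic through the host kernel routing table.
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  defaultNetwork:
    type: OVNKubernetes
    ovnKubernetesConfig:
      gatewayConfig:
        routingViaHost: true
---
# Static route on the host, applied by the NMState Operator.
# Interface and addresses are hypothetical.
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: example-static-route
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  desiredState:
    routes:
      config:
      - destination: 198.51.100.0/24
        next-hop-address: 192.0.2.1
        next-hop-interface: ens5f0
----

With `routingViaHost: true`, the static route in the second CR steers matching pod egress traffic, which is why the route must exist on every node that runs the affected pods.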
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * telco_ref_design_specs/core/telco-core-ref-design-components.adoc
4+
5+
:_mod-docs-content-type: REFERENCE
6+
[id="telco-core-cpu-partitioning-performance-tune_{context}"]
7+
= CPU partitioning and performance tuning
8+
9+
New in this release::
10+
11+
Open vSwitch (OVS) is removed from CPU partitioning. OVS manages its cpuset dynamically to automatically adapt to network traffic needs. Users no longer need to reserve additional CPUs for handling high network throughput on the primary container network interface (CNI). No configuration changes are required to benefit from this change.
12+
13+
Description::
14+
15+
CPU partitioning separates sensitive workloads from general-purpose and auxiliary processes, interrupts, and driver work queues to improve performance and latency. The CPUs allocated to those auxiliary processes are referred to as `reserved` in the following sections. In hyperthreaded systems, a CPU is one hyperthread.
16+
+
17+
For more information, see link:https://docs.openshift.com/container-platform/latest/scalability_and_performance/cnf-low-latency-tuning.html#cnf-cpu-infra-container_cnf-master[Restricting CPUs for infra and application containers].
18+
+
19+
Configure system-level performance.
20+
For recommended settings, see link:https://docs.openshift.com/container-platform/latest/scalability_and_performance/ztp_far_edge/ztp-reference-cluster-configuration-for-vdu.html#ztp-du-configuring-host-firmware-requirements_sno-configure-for-vdu[Configuring host firmware for low latency and high performance].
21+
22+
Limits and requirements::
23+
* The operating system needs a certain amount of CPU to perform all the support tasks including kernel networking.
24+
** A system with just user plane networking applications (DPDK) needs at least one core (2 hyperthreads when enabled) reserved for the operating system and the infrastructure components.
25+
* A system with Hyper-Threading enabled must always put all core sibling threads in the same pool of CPUs.
26+
* The set of reserved and isolated cores must include all CPU cores.
27+
* Core 0 of each NUMA node must be included in the reserved CPU set.
28+
* Isolated cores might be impacted by interrupts. The following annotations must be attached to the pod if guaranteed QoS pods require full use of the CPU:
29+
+
30+
----
31+
cpu-load-balancing.crio.io: "disable"
32+
cpu-quota.crio.io: "disable"
33+
irq-load-balancing.crio.io: "disable"
34+
----
35+
* When per-pod power management is enabled with `PerformanceProfile.workloadHints.perPodPowerManagement`, the following annotations must also be attached to the pod if guaranteed QoS pods require full use of the CPU:
36+
+
37+
----
38+
cpu-c-states.crio.io: "disable"
39+
cpu-freq-governor.crio.io: "performance"
40+
----
41+
42+
Engineering considerations::
43+
* To find the minimum required reserved capacity (`systemReserved`), follow the guidance in link:https://access.redhat.com/solutions/5843241["Which amount of CPU and memory are recommended to reserve for the system in OCP 4 nodes?"].
44+
* The actual required reserved CPU capacity depends on the cluster configuration and workload attributes.
45+
* The reserved CPU value must be rounded up to a full core (2 hyperthreads).
46+
* Changes to the CPU partitioning drain and reboot the nodes in the machine config pool (MCP).
47+
* The reserved CPUs reduce the pod density, as the reserved CPUs are removed from the allocatable capacity of the OpenShift node.
48+
* The real-time workload hint should be enabled if the workload is real-time capable.
49+
* Hardware without Interrupt Request (IRQ) affinity support will impact isolated CPUs. To ensure that pods with guaranteed CPU QoS have full use of allocated CPU, all hardware in the server must support IRQ affinity.
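
The partitioning rules and pod annotations above can be sketched as follows. This is an illustrative example only, assuming a hypothetical 2-socket host with 64 CPUs where NUMA node 0 holds CPUs 0-15 with hyperthread siblings 32-47, and NUMA node 1 holds CPUs 16-31 with siblings 48-63. The profile name, node selector, image, and resource sizes are also assumptions. Check the real topology of your hardware before choosing CPU sets.

[source,yaml]
----
# Reserved set: the first core of each NUMA node together with its sibling
# thread (0+32 and 16+48). Reserved and isolated sets together cover all
# 64 CPUs, and sibling threads always stay in the same pool.
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: example-profile
spec:
  cpu:
    reserved: "0,16,32,48"
    isolated: "1-15,17-31,33-47,49-63"
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  workloadHints:
    realTime: true
    highPowerConsumption: false
    perPodPowerManagement: true
---
# Guaranteed QoS pod that requires full use of its CPUs. The last two
# annotations apply because perPodPowerManagement is enabled in the profile.
apiVersion: v1
kind: Pod
metadata:
  name: example-latency-sensitive-pod
  annotations:
    cpu-load-balancing.crio.io: "disable"
    cpu-quota.crio.io: "disable"
    irq-load-balancing.crio.io: "disable"
    cpu-c-states.crio.io: "disable"
    cpu-freq-governor.crio.io: "performance"
spec:
  runtimeClassName: performance-example-profile
  containers:
  - name: app
    image: registry.example.com/example/app:latest
    resources: # equal integer CPU requests and limits give Guaranteed QoS
      requests: {cpu: "4", memory: 1Gi}
      limits: {cpu: "4", memory: 1Gi}
----

In this sketch, the reserved set is 4 CPUs, which is 2 full cores, so it satisfies the full-core rounding requirement. Changing either CPU set triggers a drain and reboot of the affected nodes.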
