@qiliRedHat qiliRedHat commented Dec 9, 2025

https://issues.redhat.com/browse/OCPNODE-3939

dynamic-system-reserved-cpu.sh calculates the SYSTEM_RESERVED_CPU value for AutoSizingReserved.
Test

./dynamic-system-reserved-cpu.sh 96
SYSTEM_RESERVED_CPU=1.20
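
For comparison with the millicore figures that the stress-test log prints (for example "0.50 cores (500m)"), the output line can be converted to millicores. This is only a sketch: `to_millicores` is a hypothetical helper, not part of this PR, and it assumes the script prints exactly one `SYSTEM_RESERVED_CPU=<cores>` line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): convert a line of the form
# "SYSTEM_RESERVED_CPU=<cores>" into a millicore string such as "1200m".
to_millicores() {
  local line="$1"
  # Strip the variable name, leaving the decimal core count.
  local cores="${line#SYSTEM_RESERVED_CPU=}"
  # awk does the fractional arithmetic; adding 0.5 rounds to the nearest millicore.
  awk -v c="$cores" 'BEGIN { printf "%dm\n", c * 1000 + 0.5 }'
}

to_millicores "SYSTEM_RESERVED_CPU=1.20"
```

With the value from the test above, this prints 1200m.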

stress-slice-single-node.sh stresses the given systemd slices on a given node with a specified number of CPU cores for a set period of time.
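
As a rough illustration of the idea, the sketch below expands a `slice:cores` spec (the format used in the test invocation) into one `systemd-run` command per process. It is a dry run that only prints the commands. The function name `plan_stress`, the busy-loop workload, and the use of `RuntimeMaxSec` to bound the run are assumptions for illustration; the script in this PR may work differently. The unit names follow the pattern visible in the test output.

```shell
#!/usr/bin/env bash
# Dry-run sketch (assumed, not the PR's implementation): print the systemd-run
# commands that would start <cores> busy loops in each requested slice.
plan_stress() {
  local spec="$1" duration="$2"
  local entry slice cores i unit
  local -a entries
  # Split "system.slice:4,kubepods.slice:7" on commas.
  IFS=',' read -r -a entries <<< "$spec"
  for entry in "${entries[@]}"; do
    slice="${entry%%:*}"   # text before the colon
    cores="${entry##*:}"   # text after the colon
    for (( i = 1; i <= cores; i++ )); do
      # e.g. stress-test-kubepods-slice-1, matching the units in the log
      unit="stress-test-${slice//./-}-${i}"
      echo "systemd-run --unit=${unit} --slice=${slice} --property=RuntimeMaxSec=${duration} sh -c 'while :; do :; done'"
    done
  done
}

plan_stress "system.slice:4,kubepods.slice:7" 600
```

For the spec used in the test, this prints 11 commands: four for system.slice and seven for kubepods.slice.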
Test

./stress-slice-single-node.sh ip-xx-xx-xx-xx.us-east-2.compute.internal 600 system.slice:4,kubepods.slice:7                                    
[INFO] =========================================
[INFO] Systemd Slice CPU Stress Test (Host-level)
[INFO] =========================================
[INFO] Target Node: ip-xx-xx-xx-xx.us-east-2.compute.internal
[INFO] Duration: 600s
[INFO] Target Slices:
[INFO]   - system.slice: 4 cores
[INFO]   - kubepods.slice: 7 cores
[INFO] =========================================
[INFO] Gathering node information...
[INFO] Node CPU Info:
  Total Capacity: 4 cores
  Allocatable: 3500m cores
  System Reserved: 0.50 cores (500m)
[INFO] Starting stress test on node: ip-xx-xx-xx-xx.us-east-2.compute.internal
[WARN] Using 'oc debug node' to run systemd-run commands directly on the host
=========================================
Starting multi-slice stress test

Launching stress processes...

Starting 4 processes in system.slice...
  Started process 1 in system.slice
  Started process 2 in system.slice
  Started process 3 in system.slice
  Started process 4 in system.slice

Starting 7 processes in kubepods.slice...
  Started process 1 in kubepods.slice
  Started process 2 in kubepods.slice
  Started process 3 in kubepods.slice
  Started process 4 in kubepods.slice
  Started process 5 in kubepods.slice
  Started process 6 in kubepods.slice
  Started process 7 in kubepods.slice

Removing debug pod ...
Running as unit: stress-test-kubepods-slice-1.service
  stress-test-kubepods-slice-1.service loaded active running CPU Stress Test for kubepods.slice Process 1
  stress-test-kubepods-slice-2.service loaded active running CPU Stress Test for kubepods.slice Process 2
  stress-test-kubepods-slice-3.service loaded active running CPU Stress Test for kubepods.slice Process 3
  stress-test-kubepods-slice-4.service loaded active running CPU Stress Test for kubepods.slice Process 4
  stress-test-kubepods-slice-5.service loaded active running CPU Stress Test for kubepods.slice Process 5
  stress-test-kubepods-slice-6.service loaded active running CPU Stress Test for kubepods.slice Process 6
  stress-test-kubepods-slice-7.service loaded active running CPU Stress Test for kubepods.slice Process 7
  stress-test-system-slice-1.service   loaded active running CPU Stress Test for system.slice Process 1
  stress-test-system-slice-2.service   loaded active running CPU Stress Test for system.slice Process 2
  stress-test-system-slice-3.service   loaded active running CPU Stress Test for system.slice Process 3
  stress-test-system-slice-4.service   loaded active running CPU Stress Test for system.slice Process 4

CPU-intensive processes running. Will run for 600 seconds...

[INFO] 
[INFO] =========================================
[INFO] Stress test running for 600 seconds
[INFO] Monitor the output above for node behavior
[INFO] Stress processes are running in the following slices on the HOST:
[INFO]   - system.slice: 4 cores
[INFO]   - kubepods.slice: 7 cores
[INFO] =========================================
[INFO] 
[INFO] Starting background monitoring...
[INFO] Monitoring output will be saved to: stress_test_log_ip-xx-xx-xx-xx.us-east-2.compute.internal_20251209_140745.log
[Time remaining]: 600s

==================== Tue Dec  9 14:07:45 CST 2025 ====================

--- Node Resource Usage ---
NAME                                        CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   
ip-xx-xx-xx-xx.us-east-2.compute.internal   3993m        114%     1948Mi          15%         

--- Pods on Node (Non-Running, excluding Completed) ---
All pods running or none found

--- Stress test processes are running ---
[Time remaining]: 590s

  stress-test-kubepods-slice-1.service loaded active running CPU Stress Test for kubepods.slice Process 1
  stress-test-kubepods-slice-2.service loaded active running CPU Stress Test for kubepods.slice Process 2
  stress-test-kubepods-slice-3.service loaded active running CPU Stress Test for kubepods.slice Process 3
  stress-test-kubepods-slice-4.service loaded active running CPU Stress Test for kubepods.slice Process 4
  stress-test-kubepods-slice-5.service loaded active running CPU Stress Test for kubepods.slice Process 5
  stress-test-kubepods-slice-6.service loaded active running CPU Stress Test for kubepods.slice Process 6
  stress-test-kubepods-slice-7.service loaded active running CPU Stress Test for kubepods.slice Process 7
  stress-test-system-slice-1.service   loaded active running CPU Stress Test for system.slice Process 1
  stress-test-system-slice-2.service   loaded active running CPU Stress Test for system.slice Process 2
  stress-test-system-slice-3.service   loaded active running CPU Stress Test for system.slice Process 3
  stress-test-system-slice-4.service   loaded active running CPU Stress Test for system.slice Process 4

[Time remaining]: 580s

==================== Tue Dec  9 14:08:09 CST 2025 ====================
....
=========================================
Stress test completed!
=========================================
[INFO] 
[INFO] =========================================
[INFO] Stress test completed!
[INFO] =========================================
[INFO] Checking final node status...
NAME                                        STATUS   ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                KERNEL-VERSION                 CONTAINER-RUNTIME
ip-xx-xx-xx-xx.us-east-2.compute.internal  Ready    worker   4h45m   v1.34.2   10.0.64.194   <none>        Red Hat Enterprise Linux CoreOS 9.6.20251205-0 (Plow)   5.14.0-570.73.1.el9_6.x86_64   cri-o://1.34.2-2.rhaos4.21.gitc8e8b46.el9
[INFO] 
[INFO] Checking for any evicted or failed pods...
[INFO] No evicted/failed pods found
[INFO] 
[INFO] Recent events on node:
....
[INFO] 
[INFO] =========================================
[INFO] Test Summary
[INFO] =========================================
[INFO] Node: ip-xx-xx-xx-xx.us-east-2.compute.internal
[INFO] Stress Duration: 600s
[INFO] Slices stressed:
[INFO]   - system.slice: 4 cores
[INFO]   - kubepods.slice: 7 cores
[INFO] Total CPU cores stressed: 11
[INFO] Log file: stress_test_log_ip-xx-xx-xx-xx.us-east-2.compute.internal_*.log
[INFO] 
[INFO]   chroot /host journalctl -u stress-test-* --since '10 minutes ago'
[INFO] 
[INFO] Prometheus queries to check (example for first slice):
[INFO]   rate(container_cpu_usage_seconds_total{id="/system.slice", node="ip-xx-xx-xx-xx.us-east-2.compute.internal"}[1m])* 1000
[INFO] =========================================
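
The figures in the log above can be cross-checked with a little arithmetic. System Reserved is the gap between capacity and allocatable, and the 114% CPU reading is consistent with usage being reported relative to allocatable rather than capacity (an interpretation, not something the log states). The 11 stress processes oversubscribe the 4-core node, so usage saturates just below 4000m. The function name `check_node_cpu` below is hypothetical; the inputs are copied from the log.

```shell
#!/usr/bin/env bash
# Sketch: recompute the reserved CPU and the usage percentage from the log values.
# Arguments are in millicores: capacity, allocatable, observed usage.
check_node_cpu() {
  local capacity_m="$1" allocatable_m="$2" usage_m="$3"
  local reserved_m=$(( capacity_m - allocatable_m ))
  # Integer percentage of allocatable, rounded to the nearest whole percent.
  local pct=$(( (usage_m * 100 + allocatable_m / 2) / allocatable_m ))
  echo "reserved=${reserved_m}m usage=${pct}% of allocatable"
}

# Total Capacity: 4 cores, Allocatable: 3500m, CPU(cores): 3993m
check_node_cpu 4000 3500 3993
```

This prints reserved=500m usage=114% of allocatable, matching the "System Reserved: 0.50 cores (500m)" and "114%" values in the log.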

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci
Contributor

openshift-ci bot commented Dec 9, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: qiliRedHat
Once this PR has been reviewed and has the lgtm label, please assign stbenjam for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot requested review from deads2k and sjenning December 9, 2025 07:16
@openshift-ci-robot

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@qiliRedHat
Author

/test e2e-gcp-ovn

@openshift-ci
Contributor

openshift-ci bot commented Dec 9, 2025

@qiliRedHat: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
