netdev CI testing #6666

kuba-moo · 2024-03-27T20:02:33Z

Reusable PR for hooking netdev CI to BPF testing.

Instead of translating to/from driver specific flags for packet time stamp control use the common flags directly. This simplifies the driver as the translating code can be removed while at the same time making it clear the flags are not flags written to hardware registers. One thing to note is that the bit-wise and check in rswitch_rx() of RCAR_GEN4_RXTSTAMP_TYPE_V2_L2_EVENT is replaced with a not set check of HWTSTAMP_FILTER_NONE. This is okay as the bit of device specific event replaced was set for all modes except HWTSTAMP_FILTER_NONE. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Instead of translating to/from driver specific flags for packet time stamp control use the common flags directly. This simplifies the driver as the translating code can be removed while at the same time making it clear the flags are not flags written to hardware registers. One thing to note is that the bit-wise and check in rtsn_rx() of RCAR_GEN4_RXTSTAMP_TYPE_V2_L2_EVENT is replaced with a not set check of HWTSTAMP_FILTER_NONE. This is okay as the bit of device specific event replaced was set for all modes except HWTSTAMP_FILTER_NONE. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>

The driver specific flags to control packet time stamps have all been replaced by values from enum hwtstamp_tx_types and enum hwtstamp_rx_filters. Remove the driver specific flags as there are no more users. Signed-off-by: Niklas Söderlund <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Prepare for moving away from device specific bit-fields to track how to do hardware Rx timestamping to using net common enums by breaking out the timestamping to a helper function. This is done to create cleaner code and prepare for easier changes improving the hardware timestapming. There is no functional change. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Instead of translating to/from driver specific flags for packet time stamp control use the common flags directly. This simplifies the driver as the translating code can be removed while at the same time making it clear the flags are not flags written to hardware registers. The change from a device specific bit-field track variable to the common enum datatypes forces us to touch the ravb_rx_rcar_hwstamp() in a non trivial way. To make this cleaner and easier to understand expand the nested conditions. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>

ptp_clock should never be registered unless it stubs one of gettimex64() or gettime64() and settime64(). WARN_ON_ONCE and error out if either set of function pointers is null. For consistency, n_alarm validation is also folded into the WARN_ON_ONCE. Suggested-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Vadim Fedorenko <[email protected]> Signed-off-by: Tim Hostetler <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Add missing entries in C attribute list. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Based on past discussions it seems like integration of YNL into iproute2 is unlikely. YNL itself is not great as a C library, since it has no backward compat (we routinely change types). Most of the operations can be performed with the generic Python CLI directly. There is, however, a handful of operations where summarization of kernel output is very useful (mostly related to stats: page-pool, qstat). Create a command (inspired by bpftool, I think it stood the test of time reasonably well) to be able to plug the subcommands into. Link: https://lore.kernel.org/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Replace the page-pool sample with page pool support in ynltool. # ynltool page-pool stats eth0[2] page pools: 18 (zombies: 0) refs: 171456 bytes: 702283776 (refs: 0 bytes: 0) recycling: 97.3% (alloc: 2679:6134966 recycle: 1250981:4719386) # ynltool -j page-pool stats | jq [ { "ifname": "eth0", "ifindex": 2, "page_pools": 18, "zombies": 0, "live": { "refs": 171456, "bytes": 702283776 }, "zombie": { "refs": 0, "bytes": 0 }, "recycling_pct": 97.2746, "alloc": { "slow": 2679, "fast": 6135029 }, "recycle": { "ring": 1250997, "cache": 4719432 } } ] # ynltool page-pool stats group-by pp pool id: 108 dev: eth0[2] napi: 530 inflight: 9472 pages 38797312 bytes recycling: 95.5% (alloc: 148:208379 recycle: 45386:153842) pool id: 107 dev: eth0[2] napi: 529 inflight: 9408 pages 38535168 bytes recycling: 94.9% (alloc: 147:180178 recycle: 42251:128808) Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

$ ynltool qstat eth0 rx-packets: 493192163 rx-bytes: 1442544543997 tx-packets: 745999838 tx-bytes: 4574215826482 tx-stop: 7033 tx-wake: 7033 $ ynltool qstat show group-by queue eth0 rx-0 packets: 70196880 bytes: 178633973750 eth0 rx-1 packets: 63623419 bytes: 197274745250 ... eth0 tx-1 packets: 98645810 bytes: 631247647938 stop: 1048 wake: 1048 eth0 tx-2 packets: 86775824 bytes: 563930471952 stop: 1126 wake: 1126 ... $ ynltool -j qstat | jq [ { "ifname": "eth0", "ifindex": 2, "rx": { "packets": 493396439, "bytes": 1443608198921 }, "tx": { "packets": 746239978, "bytes": 4574333772645, "stop": 7072, "wake": 7072 } } ] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

The main if not only use case for per-queue stats today is checking for traffic imbalance. Add simple traffic balance analysis to qstats. $ ynltool qstat balance eth0 rx 44 queues: rx-packets : cv=6.9% ns=24.2% stddev=512006493 min=6278921110 max=8011570575 mean=7437054644 rx-bytes : cv=6.9% ns=24.1% stddev=759670503060 min=9326315769440 max=11884393670786 mean=11035439201354 ... $ ynltool -j qstat balance | jq [ { "ifname": "eth0", "ifindex": 2, "queue-type": "rx", "rx-packets": { "queue-count": 44, "min": 6278301665, "max": 8010780185, "mean": 7.43635E+9, "stddev": 5.12012E+8, "coefficient-of-variation": 6.88525, "normalized-spread": 24.249 }, ... Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

A soft lockup issue was observed with following log: watchdog: BUG: soft lockup - CPU#1 stuck for 104s! [tc:94] CPU: 1 UID: 0 PID: 94 Comm: tc Tainted: G L 6.18.0-rc4-00007-gc9cfc122f037 kernel-patches#425 PREEMPT(voluntary) RIP: 0010:qdisc_match_from_root+0x0/0x70 Call Trace: <TASK> qdisc_tree_reduce_backlog+0xec/0x110 fq_change+0x2e0/0x6a0 qdisc_create+0x138/0x4e0 tc_modify_qdisc+0x5b9/0x9d0 rtnetlink_rcv_msg+0x15a/0x400 netlink_rcv_skb+0x55/0x100 netlink_unicast+0x257/0x380 netlink_sendmsg+0x1e2/0x420 ____sys_sendmsg+0x313/0x340 ___sys_sendmsg+0x82/0xd0 __sys_sendmsg+0x6c/0xd0 </TASK> The issue can be reproduced by: tc qdisc add dev eth0 root noqueue tc qdisc add dev eth0 handle ffff:0 ingress tc qdisc add dev eth0 handle ffe0:0 parent ffff:a fq A fq qdisc was created in __tc_modify_qdisc(), when the input parent major 'ffff' is equal to the ingress qdisc handle major and the complete parent handle is not TC_H_ROOT or TC_H_INGRESS, which leads to a infinite loop in qdisc_tree_reduce_backlog(). The infinite loop scenario: qdisc_tree_reduce_backlog // init sch is fq qdisc, parent is ffff000a while ((parentid = sch->parent)) { // query qdisc by handle ffff0000 sch = qdisc_lookup_rcu(qdisc_dev(sch), TC_H_MAJ(parentid)); // return ingress qdisc, sch->parent is fffffff1 if (sch == NULL) { ... } Commit 2e95c43 ("net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT") break the loop only when parent TC_H_ROOT is reached. However, qdisc_lookup_rcu() will return the same qdisc when input an ingress qdisc with major 'ffff'. If the parent TC_H_INGRESS is reached, also break the loop. Fixes: 2e95c43 ("net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT") Signed-off-by: Wang Liang <[email protected]> Signed-off-by: NipaLocal <nipa@local>

The functions txgbe_up() and txgbe_down() are called in pairs to reset hardware configurations. PTP stop function is not called in txgbe_down(), so there is no need to call PTP init function in txgbe_up(). Fixes: 06e7516 ("net: wangxun: Add support for PTP clock") Cc: [email protected] Signed-off-by: Jiawen Wu <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Alex will send phylink patches soon which will make us link up on QEMU again, but for now let's hack up the link. Gives us a chance to add another QEMU NIC test to "HW" runners in the CI. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Let's see if this increases stability of timing-related results.. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

These are unlikely to matter for CI testing and they slow things down. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: NipaLocal <nipa@local>

We exclusively use headless VMs today, don't waste time compiling sound and GPU drivers. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

kmemleak auto scan could be a source of latency for the tests. We run a full scan after the tests manually, we don't need the autoscan thread to be enabled. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

… HEAD

kuba-moo force-pushed the to-test branch from 6bd5e75 to bdd05e2 Compare March 27, 2024 21:49

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 4f22ee0 to 8a9a8e0 Compare March 28, 2024 04:46

kuba-moo force-pushed the to-test branch 11 times, most recently from 64c403f to 8da1f58 Compare March 29, 2024 00:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 78ebb17 to 9325308 Compare March 29, 2024 02:14

kuba-moo force-pushed the to-test branch 6 times, most recently from c8c7b2f to a71aae6 Compare March 29, 2024 18:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 9325308 to 7940ae1 Compare March 29, 2024 18:12

kuba-moo force-pushed the to-test branch 2 times, most recently from d8feb00 to b16a6b9 Compare March 30, 2024 00:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 7940ae1 to 8f1ff3c Compare March 30, 2024 00:21

kuba-moo force-pushed the to-test branch 2 times, most recently from 4164329 to c5cecb3 Compare March 30, 2024 06:00

Niklas Söderlund and others added 27 commits November 4, 2025 19:01

netlink: specs: netdev add missing stats to qstat-get

6c43705

Add missing entries in C attribute list. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: disable random kunit tests

2b88ae4

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: disable 6.17's merge window kunit tests

5c78803

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: config: x86: use periodic HZ tick

5a3dba7

Let's see if this increases stability of timing-related results.. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: profile (time) test output

7c48481

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: timestamp - try waking

1e46f8b

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: dbg: tests: bonding: print info on failure

3196707

Signed-off-by: NipaLocal <nipa@local>

nipa: selftests: net: enable profiling

b887e42

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: tc_action dbg

9fecb7e

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: config: disable CPU_MITIGATIONS

ffbf60a

These are unlikely to matter for CI testing and they slow things down. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: forwarding: set timeout to 3 hours

8826826

tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>

nipa: drv: net: add timeout

0969596

Signed-off-by: NipaLocal <nipa@local>

nipa: config: x86: disable GPUs and sound

3e5f542

We exclusively use headless VMs today, don't waste time compiling sound and GPU drivers. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: config: disable kmemleak auto scan

1dcba89

kmemleak auto scan could be a source of latency for the tests. We run a full scan after the tests manually, we don't need the autoscan thread to be enabled. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

kuba-moo force-pushed the to-test branch from 57d4a09 to fb1cfd2 Compare November 5, 2025 03:02

Merge remote-tracking branch 'origin/net-next-2025-11-05--03-00' into…

2e15a74

… HEAD

kuba-moo force-pushed the to-test branch from fb1cfd2 to 2e15a74 Compare November 5, 2025 03:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

netdev CI testing #6666

netdev CI testing #6666

kuba-moo commented Mar 27, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

44 participants

netdev CI testing #6666

Are you sure you want to change the base?

netdev CI testing #6666

Conversation

kuba-moo commented Mar 27, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

44 participants