-
Notifications
You must be signed in to change notification settings - Fork 148
netdev CI testing #6666
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
kuba-moo
wants to merge
682
commits into
kernel-patches:bpf-next_base
Choose a base branch
from
linux-netdev:to-test
base: bpf-next_base
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
netdev CI testing #6666
+27,007
−9,180
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
4f22ee0 to
8a9a8e0
Compare
64c403f to
8da1f58
Compare
78ebb17 to
9325308
Compare
c8c7b2f to
a71aae6
Compare
9325308 to
7940ae1
Compare
d8feb00 to
b16a6b9
Compare
7940ae1 to
8f1ff3c
Compare
4164329 to
c5cecb3
Compare
Instead of translating to/from driver specific flags for packet time stamp control use the common flags directly. This simplifies the driver as the translating code can be removed while at the same time making it clear the flags are not flags written to hardware registers. One thing to note is that the bit-wise and check in rswitch_rx() of RCAR_GEN4_RXTSTAMP_TYPE_V2_L2_EVENT is replaced with a not set check of HWTSTAMP_FILTER_NONE. This is okay as the bit of device specific event replaced was set for all modes except HWTSTAMP_FILTER_NONE. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Instead of translating to/from driver specific flags for packet time stamp control use the common flags directly. This simplifies the driver as the translating code can be removed while at the same time making it clear the flags are not flags written to hardware registers. One thing to note is that the bit-wise and check in rtsn_rx() of RCAR_GEN4_RXTSTAMP_TYPE_V2_L2_EVENT is replaced with a not set check of HWTSTAMP_FILTER_NONE. This is okay as the bit of device specific event replaced was set for all modes except HWTSTAMP_FILTER_NONE. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>
The driver specific flags to control packet time stamps have all been replaced by values from enum hwtstamp_tx_types and enum hwtstamp_rx_filters. Remove the driver specific flags as there are no more users. Signed-off-by: Niklas Söderlund <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Prepare for moving away from device specific bit-fields to track how to do hardware Rx timestamping to using net common enums by breaking out the timestamping to a helper function. This is done to create cleaner code and prepare for easier changes improving the hardware timestapming. There is no functional change. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Instead of translating to/from driver specific flags for packet time stamp control use the common flags directly. This simplifies the driver as the translating code can be removed while at the same time making it clear the flags are not flags written to hardware registers. The change from a device specific bit-field track variable to the common enum datatypes forces us to touch the ravb_rx_rcar_hwstamp() in a non trivial way. To make this cleaner and easier to understand expand the nested conditions. Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: NipaLocal <nipa@local>
ptp_clock should never be registered unless it stubs one of gettimex64() or gettime64() and settime64(). WARN_ON_ONCE and error out if either set of function pointers is null. For consistency, n_alarm validation is also folded into the WARN_ON_ONCE. Suggested-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Vadim Fedorenko <[email protected]> Signed-off-by: Tim Hostetler <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Add missing entries in C attribute list. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Based on past discussions it seems like integration of YNL into iproute2 is unlikely. YNL itself is not great as a C library, since it has no backward compat (we routinely change types). Most of the operations can be performed with the generic Python CLI directly. There is, however, a handful of operations where summarization of kernel output is very useful (mostly related to stats: page-pool, qstat). Create a command (inspired by bpftool, I think it stood the test of time reasonably well) to be able to plug the subcommands into. Link: https://lore.kernel.org/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Replace the page-pool sample with page pool support in ynltool.
# ynltool page-pool stats
eth0[2] page pools: 18 (zombies: 0)
refs: 171456 bytes: 702283776 (refs: 0 bytes: 0)
recycling: 97.3% (alloc: 2679:6134966 recycle: 1250981:4719386)
# ynltool -j page-pool stats | jq
[
{
"ifname": "eth0",
"ifindex": 2,
"page_pools": 18,
"zombies": 0,
"live": {
"refs": 171456,
"bytes": 702283776
},
"zombie": {
"refs": 0,
"bytes": 0
},
"recycling_pct": 97.2746,
"alloc": {
"slow": 2679,
"fast": 6135029
},
"recycle": {
"ring": 1250997,
"cache": 4719432
}
}
]
# ynltool page-pool stats group-by pp
pool id: 108 dev: eth0[2] napi: 530
inflight: 9472 pages 38797312 bytes
recycling: 95.5% (alloc: 148:208379 recycle: 45386:153842)
pool id: 107 dev: eth0[2] napi: 529
inflight: 9408 pages 38535168 bytes
recycling: 94.9% (alloc: 147:180178 recycle: 42251:128808)
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
$ ynltool qstat
eth0 rx-packets: 493192163 rx-bytes: 1442544543997
tx-packets: 745999838 tx-bytes: 4574215826482
tx-stop: 7033 tx-wake: 7033
$ ynltool qstat show group-by queue
eth0 rx-0 packets: 70196880 bytes: 178633973750
eth0 rx-1 packets: 63623419 bytes: 197274745250
...
eth0 tx-1 packets: 98645810 bytes: 631247647938
stop: 1048 wake: 1048
eth0 tx-2 packets: 86775824 bytes: 563930471952
stop: 1126 wake: 1126
...
$ ynltool -j qstat | jq
[
{
"ifname": "eth0",
"ifindex": 2,
"rx": {
"packets": 493396439,
"bytes": 1443608198921
},
"tx": {
"packets": 746239978,
"bytes": 4574333772645,
"stop": 7072,
"wake": 7072
}
}
]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
The main if not only use case for per-queue stats today is checking
for traffic imbalance. Add simple traffic balance analysis to qstats.
$ ynltool qstat balance
eth0 rx 44 queues:
rx-packets : cv=6.9% ns=24.2% stddev=512006493
min=6278921110 max=8011570575 mean=7437054644
rx-bytes : cv=6.9% ns=24.1% stddev=759670503060
min=9326315769440 max=11884393670786 mean=11035439201354
...
$ ynltool -j qstat balance | jq
[
{
"ifname": "eth0",
"ifindex": 2,
"queue-type": "rx",
"rx-packets": {
"queue-count": 44,
"min": 6278301665,
"max": 8010780185,
"mean": 7.43635E+9,
"stddev": 5.12012E+8,
"coefficient-of-variation": 6.88525,
"normalized-spread": 24.249
},
...
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
A soft lockup issue was observed with following log: watchdog: BUG: soft lockup - CPU#1 stuck for 104s! [tc:94] CPU: 1 UID: 0 PID: 94 Comm: tc Tainted: G L 6.18.0-rc4-00007-gc9cfc122f037 kernel-patches#425 PREEMPT(voluntary) RIP: 0010:qdisc_match_from_root+0x0/0x70 Call Trace: <TASK> qdisc_tree_reduce_backlog+0xec/0x110 fq_change+0x2e0/0x6a0 qdisc_create+0x138/0x4e0 tc_modify_qdisc+0x5b9/0x9d0 rtnetlink_rcv_msg+0x15a/0x400 netlink_rcv_skb+0x55/0x100 netlink_unicast+0x257/0x380 netlink_sendmsg+0x1e2/0x420 ____sys_sendmsg+0x313/0x340 ___sys_sendmsg+0x82/0xd0 __sys_sendmsg+0x6c/0xd0 </TASK> The issue can be reproduced by: tc qdisc add dev eth0 root noqueue tc qdisc add dev eth0 handle ffff:0 ingress tc qdisc add dev eth0 handle ffe0:0 parent ffff:a fq A fq qdisc was created in __tc_modify_qdisc(), when the input parent major 'ffff' is equal to the ingress qdisc handle major and the complete parent handle is not TC_H_ROOT or TC_H_INGRESS, which leads to a infinite loop in qdisc_tree_reduce_backlog(). The infinite loop scenario: qdisc_tree_reduce_backlog // init sch is fq qdisc, parent is ffff000a while ((parentid = sch->parent)) { // query qdisc by handle ffff0000 sch = qdisc_lookup_rcu(qdisc_dev(sch), TC_H_MAJ(parentid)); // return ingress qdisc, sch->parent is fffffff1 if (sch == NULL) { ... } Commit 2e95c43 ("net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT") break the loop only when parent TC_H_ROOT is reached. However, qdisc_lookup_rcu() will return the same qdisc when input an ingress qdisc with major 'ffff'. If the parent TC_H_INGRESS is reached, also break the loop. Fixes: 2e95c43 ("net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT") Signed-off-by: Wang Liang <[email protected]> Signed-off-by: NipaLocal <nipa@local>
The functions txgbe_up() and txgbe_down() are called in pairs to reset hardware configurations. PTP stop function is not called in txgbe_down(), so there is no need to call PTP init function in txgbe_up(). Fixes: 06e7516 ("net: wangxun: Add support for PTP clock") Cc: [email protected] Signed-off-by: Jiawen Wu <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Alex will send phylink patches soon which will make us link up on QEMU again, but for now let's hack up the link. Gives us a chance to add another QEMU NIC test to "HW" runners in the CI. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Let's see if this increases stability of timing-related results.. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
These are unlikely to matter for CI testing and they slow things down. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: NipaLocal <nipa@local>
We exclusively use headless VMs today, don't waste time compiling sound and GPU drivers. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
kmemleak auto scan could be a source of latency for the tests. We run a full scan after the tests manually, we don't need the autoscan thread to be enabled. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Reusable PR for hooking netdev CI to BPF testing.