net: ip: Fix the warning in the data path #93282

Open
wants to merge 1 commit into
base: main
1 change: 1 addition & 0 deletions include/zephyr/net/net_core.h
@@ -59,6 +59,7 @@ extern "C" {
#endif /* CONFIG_THREAD_NAME */
#define NET_ERR(fmt, ...) LOG_ERR(fmt, ##__VA_ARGS__)
#define NET_WARN(fmt, ...) LOG_WRN(fmt, ##__VA_ARGS__)
+#define NET_WARN_ONCE(fmt, ...) LOG_WRN_ONCE(fmt, ##__VA_ARGS__)
#define NET_INFO(fmt, ...) LOG_INF(fmt, ##__VA_ARGS__)

#define NET_HEXDUMP_DBG(_data, _length, _str) LOG_HEXDUMP_DBG(_data, _length, _str)
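
For readers unfamiliar with "warn once" semantics, the following is a minimal standalone sketch of the general technique. It is not Zephyr's `LOG_WRN_ONCE` implementation; `DEMO_WARN_ONCE`, `demo_log_warn`, and `demo_send_burst` are hypothetical names invented for illustration, and the sketch makes no attempt at thread safety.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a logging backend; counts emitted lines. */
static int emitted_lines;

static void demo_log_warn(const char *msg)
{
	emitted_lines++;
	fprintf(stderr, "<wrn> %s\n", msg);
}

/*
 * Emit the warning only the first time this call site is reached.
 * Each expansion gets its own static flag, so "once" means once per
 * call site, not once per program. Not thread-safe (assumption: a
 * real implementation would need an atomic flag).
 */
#define DEMO_WARN_ONCE(msg)             \
	do {                                \
		static bool warned_;            \
		if (!warned_) {                 \
			warned_ = true;             \
			demo_log_warn(msg);         \
		}                               \
	} while (0)

/* Simulates a data path in which every send fails. */
static int demo_send_burst(int packets)
{
	for (int i = 0; i < packets; i++) {
		DEMO_WARN_ONCE("send failure");
	}
	return emitted_lines;
}
```

However many packets fail, the call site logs a single line, which is the behavior the review thread on `net_if.c` debates.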
4 changes: 2 additions & 2 deletions subsys/net/ip/net_if.c
@@ -262,8 +262,8 @@ static bool net_if_tx(struct net_if *iface, struct net_pkt *pkt)
status = net_if_l2(iface)->send(iface, pkt);
net_if_tx_unlock(iface);
if (status < 0) {
-NET_WARN("iface %d pkt %p send failure status %d",
-net_if_get_by_iface(iface), pkt, status);
+NET_WARN_ONCE("iface %d pkt %p send failure status %d",
Contributor

From the linked issue I can see how warning on every packet is a usability problem (as mentioned in the LOG_WRN_ONCE PR), but I also don't think that only ever outputting a single warning is great either.
The two annoyances I see are:

  1. You have no idea whether it's just a transient failure or whether all packets are failing
  2. If you only attach to the logs after the first occurrence, you have no idea there is a problem at all

Couldn't the initial issue be resolved at the zperf level by handling packet send errors?

Contributor Author

> The two annoyances I see are:
>
>   1. You have no idea whether it's just a transient failure or whether all packets are failing
>   2. If you only attach to the logs after the first occurrence, you have no idea there is a problem at all

I understand that a single print might not help, but do we really want to debug data path issues using prints? IMHO, we should be using statistics to convey the seriousness of the issue. If that is still not acceptable, then I propose we pull in another Linux feature, printk_ratelimited. I am still not keen on it (I prefer printk_once), but this way at least we don't bombard the log and the user can control the rate. WDYT?

Contributor

I can only speak for myself, but if a deployed device is not getting data through to the cloud, I'm much more likely to be looking at serial logs than to sit there polling a stats object (somehow?) and checking to see if an error counter is going up. Even if it is going up, it doesn't really explain why it's going up.

A rate limited output would be fine from my perspective, but is obviously more work.

Contributor Author

> likely to be looking at serial logs than to sit there polling a stats object (somehow?) and checking to see if an error counter is going up.

Well, I almost always use those shell commands to look at drops :). Running traffic asynchronously and using the shell to keep dumping stats is my go-to way of debugging data path issues, rather than looking at a flood of prints.

> A rate limited output would be fine from my perspective, but is obviously more work.

Yes, it's a proper feature that needs to be implemented.
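
To make the rate-limited option discussed in this thread concrete, here is a minimal standalone sketch of interval-based rate limiting that also reports how many messages were suppressed. It is an illustration under stated assumptions, not an existing Zephyr API and not a port of Linux's `printk_ratelimited`: `struct demo_ratelimit`, `demo_ratelimit_allow`, and `demo_report_send_failure` are hypothetical names, and the caller is assumed to pass a millisecond timestamp from some monotonic clock (in Zephyr this could plausibly be `k_uptime_get_32()`).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-call-site rate limiter state. */
struct demo_ratelimit {
	uint32_t interval_ms;  /* minimum gap between emitted messages */
	uint32_t last_emit_ms; /* timestamp of the last emitted message */
	uint32_t suppressed;   /* messages dropped since the last emit */
	bool emitted_before;   /* false until the first message is emitted */
};

/*
 * Returns true if the caller should log now. On true, *suppressed_out
 * receives the number of messages dropped since the previous emit.
 * Unsigned subtraction keeps the comparison valid across timer wrap.
 */
static bool demo_ratelimit_allow(struct demo_ratelimit *rl, uint32_t now_ms,
				 uint32_t *suppressed_out)
{
	if (rl->emitted_before &&
	    (uint32_t)(now_ms - rl->last_emit_ms) < rl->interval_ms) {
		rl->suppressed++;
		return false;
	}

	*suppressed_out = rl->suppressed;
	rl->suppressed = 0;
	rl->last_emit_ms = now_ms;
	rl->emitted_before = true;
	return true;
}

/* Example call site: warn about a send failure at most once per interval. */
static void demo_report_send_failure(struct demo_ratelimit *rl,
				     uint32_t now_ms, int status)
{
	uint32_t suppressed;

	if (demo_ratelimit_allow(rl, now_ms, &suppressed)) {
		fprintf(stderr,
			"<wrn> send failure status %d (%u similar suppressed)\n",
			status, (unsigned int)suppressed);
	}
}
```

Printing the suppressed count is a design choice aimed at the two annoyances raised above: a log that is attached late still sees a warning within one interval, and the count hints at whether the failure is transient or affects every packet.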

+net_if_get_by_iface(iface), pkt, status);
}

if (IS_ENABLED(CONFIG_NET_PKT_TXTIME_STATS) ||