net: ip: Fix the warning in the data path #93282

Open
wants to merge 1 commit into
base: main

krish2718
Contributor

Instead of warning for every packet, warn only once and let the user debug the underlying cause.

Partially fixes #49845.

Instead of warning for every packet, warn only once and let the user debug
the underlying cause.

Partially fixes zephyrproject-rtos#49845.

Signed-off-by: Chaitanya Tata <[email protected]>
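
For readers unfamiliar with the pattern, "warn only once" can be sketched in plain C roughly as below. This is an illustration only, not the definition of NET_WARN_ONCE: `first_time` and `report_send_failure` are hypothetical names, and `fprintf` stands in for the logging backend.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: returns true only on the first call for a given
 * flag, so the caller emits its message a single time. */
static bool first_time(bool *already_warned)
{
	if (*already_warned) {
		return false;
	}
	*already_warned = true;
	return true;
}

/* Hypothetical send-failure handler. Returns true if a warning was
 * printed, false if it was suppressed because one was already emitted. */
static bool report_send_failure(int iface_index, int status)
{
	static bool warned; /* one flag per call site */

	if (!first_time(&warned)) {
		return false;
	}
	fprintf(stderr, "iface %d send failure status %d\n",
		iface_index, status);
	return true;
}
```

Note that the flag never resets, so every later failure at this call site is silent, whatever its status code. That is the trade-off debated in the review below.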

@@ -262,8 +262,8 @@ static bool net_if_tx(struct net_if *iface, struct net_pkt *pkt)
 		status = net_if_l2(iface)->send(iface, pkt);
 		net_if_tx_unlock(iface);
 		if (status < 0) {
-			NET_WARN("iface %d pkt %p send failure status %d",
-				 net_if_get_by_iface(iface), pkt, status);
+			NET_WARN_ONCE("iface %d pkt %p send failure status %d",
Contributor

From the linked issue I can see how warning on every packet is a usability problem (as mentioned in the LOG_WRN_ONCE PR), but I don't think only ever outputting a single warning is great either.
The two annoyances I see are:

  1. You have no idea whether it's just a transient failure or whether all packets are failing.
  2. If you only attach to the logs after the first occurrence, you have no idea there is a problem at all.

Couldn't the initial issue be resolved at the zperf level by handling packet send errors?

Contributor Author

> The two annoyances I see are:
>
>   1. You have no idea whether it's just a transient failure or whether all packets are failing.
>   2. If you only attach to the logs after the first occurrence, you have no idea there is a problem at all.

I understand that a single print might not help, but do we really want to debug data path issues using prints? IMHO, we should be using statistics to convey the seriousness of the issue. If that is still not acceptable, then I propose we pull in another Linux feature, printk_ratelimited, which I am still not keen on (I favour printk_once). That way at least we don't bombard the log, and the user can control the rate. WDYT?

Contributor

I can only speak for myself, but if a deployed device is not getting data through to the cloud, I'm much more likely to be looking at serial logs than to sit there polling a stats object (somehow?) and checking to see if an error counter is going up. Even if it is going up, it doesn't really provide any reasoning as to why it's going up.

A rate limited output would be fine from my perspective, but is obviously more work.

Contributor Author

> likely to be looking at serial logs than to sit there polling a stats object (somehow?) and checking to see if an error counter is going up.

Well, I almost always use those shell commands to look at drops :). Running traffic asynchronously and using the shell to keep dumping stats is my go-to way of debugging data path issues, rather than looking at a flood of prints.
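
As a rough illustration of the statistics approach, the sketch below counts outcomes on the data path and prints them only on demand. The `tx_stats` struct and both functions are hypothetical and do not reflect Zephyr's actual statistics layout. Recording the last error code is one assumed way to address the concern that a bare counter gives no reason for the failures.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-interface counters, for illustration only. */
struct tx_stats {
	uint32_t sent;    /* packets handed to the driver successfully */
	uint32_t errors;  /* packets whose send returned a negative status */
	int last_error;   /* most recent negative status, 0 if none yet */
};

/* Called on the data path: no printing, just bookkeeping. */
static void tx_stats_update(struct tx_stats *s, int status)
{
	if (status < 0) {
		s->errors++;
		s->last_error = status;
	} else {
		s->sent++;
	}
}

/* Called on demand, e.g. from a shell command handler. */
static void tx_stats_dump(const struct tx_stats *s)
{
	printf("tx sent %u errors %u last error %d\n",
	       (unsigned int)s->sent, (unsigned int)s->errors,
	       s->last_error);
}
```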

> A rate limited output would be fine from my perspective, but is obviously more work.

Yes, it's a proper feature that needs to be implemented.

Contributor

JordanYates left a comment

Put more succinctly, shouldn't the problems at the driver layer (which causes the failures) and at the application layer (which keeps trying to send) be fixed, rather than making the log output less useful?

@krish2718
Contributor Author

> Put more succinctly, shouldn't the problems at the driver layer (which causes the failures) and at the application layer (which keeps trying to send) be fixed, rather than making the log output less useful?

Absolutely, the entire pipeline is responsible, as you say (and IIRC we had the same discussion about the lack of a stop/start data path in Zephyr). But the specific problem this PR addresses is that bombarding the console with prints (zperf pumping at 50M) doesn't help. In particular, you lose control of the shell and cannot even type in wifi statistics or net stats to debug.

@JordanYates
Contributor

> Absolutely, the entire pipeline is responsible, as you say (and IIRC we had the same discussion about the lack of a stop/start data path in Zephyr). But the specific problem this PR addresses is that bombarding the console with prints (zperf pumping at 50M) doesn't help. In particular, you lose control of the shell and cannot even type in wifi statistics or net stats to debug.

Can we do something like only printing a warning if at least 1 second has passed since the last warning?
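
A minimal sketch of that idea, under stated assumptions: `warn_limiter`, `warn_allowed`, `report_failure` and `WARN_INTERVAL_MS` are hypothetical names, the timestamp is passed in by the caller (on Zephyr it could plausibly come from `k_uptime_get()`), and `fprintf` stands in for the logger. Reporting how many warnings were suppressed is an addition meant to show whether failures are transient or continuous.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define WARN_INTERVAL_MS 1000 /* assumed minimum gap between warnings */

/* Hypothetical limiter state; one instance per log call site. */
struct warn_limiter {
	int64_t last_ms;     /* time of the last emitted warning */
	uint32_t suppressed; /* warnings dropped since that time */
	bool armed;          /* false until the first warning is emitted */
};

/* Returns true if the caller should log now. On true, *dropped receives
 * the number of warnings suppressed since the previous emitted one. */
static bool warn_allowed(struct warn_limiter *l, int64_t now_ms,
			 uint32_t *dropped)
{
	if (l->armed && (now_ms - l->last_ms) < WARN_INTERVAL_MS) {
		l->suppressed++;
		return false;
	}
	*dropped = l->suppressed;
	l->suppressed = 0;
	l->last_ms = now_ms;
	l->armed = true;
	return true;
}

/* Example call site: at most one line per interval, with a drop count. */
static void report_failure(struct warn_limiter *l, int64_t now_ms,
			   int status)
{
	uint32_t dropped;

	if (warn_allowed(l, now_ms, &dropped)) {
		fprintf(stderr, "send failure status %d (%u suppressed)\n",
			status, (unsigned int)dropped);
	}
}
```

With a zero-initialised limiter the first failure is always logged. At a sustained failure rate the console then sees roughly one line per second, which should leave the shell usable.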

@krish2718
Contributor Author

>> Absolutely, the entire pipeline is responsible, as you say (and IIRC we had the same discussion about the lack of a stop/start data path in Zephyr). But the specific problem this PR addresses is that bombarding the console with prints (zperf pumping at 50M) doesn't help. In particular, you lose control of the shell and cannot even type in wifi statistics or net stats to debug.

> Can we do something like only printing a warning if at least 1 second has passed since the last warning?

Yeah, the rate limiting discussion is in the above comment.

Development

Successfully merging this pull request may close these issues.

net: Disable data path prints
5 participants