HLD document for configurable drop counter monitoring #1912

Open: arista-hpandya wants to merge 2 commits into base: master

@arista-hpandya (Contributor) commented Feb 7, 2025

What we did:
Added a persistent drop counter monitoring feature to identify persistent packet drops based on user-defined thresholds.

Why we did it:
The current implementation of drop counters in SONiC only provides visibility into the number of packets dropped. This enhancement adds a way to identify persistent packet drops based on a user-defined threshold, which helps with troubleshooting.

Support added:
Configurable drop counter monitoring is now supported on platforms that support both the SAI drop counter API and the query APIs.

CPU Overhead:
Minimal. In our testing, mean CPU utilization increased by 0.03 percentage points.

Memory Overhead:
Negligible. No changes in memory were observed.

Inspiration
The idea was presented by the Arista team at the SONiC 2023 Hackathon.

Issues Tracked
Fixes #1542

- Add a section for persistent drops
- Add details on how to configure monitoring of persistent drops
- Add a detailed diagram explaining the concept of persistent drops
- Add CLI commands to show and configure drop counter monitors
@mssonicbld (Collaborator)

/azp run

No pipelines are associated with this pull request.

@zhangyanzhao (Collaborator)

@arista-hpandya, can you please add the code PRs to this HLD by referring to #806? Thanks.

@zhangyanzhao (Collaborator)

- Argument: -ict / --incident-count-threshold
- Default: 2

When enabled, the persistent drop counter monitor tracks all configured drop counters. These configurations apply globally to all drop counters.

This feature supports monitoring only port-level drop counters; this information is not captured in the HLD.
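
For readers following the thread, here is a minimal Python sketch of how an incident count threshold could be used to flag persistent drops on a port. It is not taken from the HLD or the code PRs. Only the incident count threshold and its default of 2 come from the excerpt quoted above; the `PersistentDropMonitor` class, the `drop_count_threshold` and `window` parameters, and the rule that an "incident" is a poll whose counter delta exceeds the drop count threshold are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PersistentDropMonitor:
    """Hypothetical per-port monitor; not the SONiC implementation."""
    drop_count_threshold: int = 10     # assumed: drops per poll that count as an incident
    incident_count_threshold: int = 2  # default taken from the quoted HLD excerpt
    window: int = 5                    # assumed: number of recent polls considered
    _last: dict = field(default_factory=dict, init=False, repr=False)
    _incidents: dict = field(default_factory=dict, init=False, repr=False)

    def observe(self, port: str, counter_value: int) -> bool:
        """Feed one polled counter value; return True if drops look persistent."""
        prev = self._last.get(port)
        self._last[port] = counter_value
        history = self._incidents.setdefault(port, deque(maxlen=self.window))
        if prev is None:
            return False  # first sample: no delta to evaluate yet
        # Record whether this poll was an incident (delta above the threshold).
        history.append(counter_value - prev > self.drop_count_threshold)
        # Persistent when enough incidents fall inside the sliding window.
        return sum(history) >= self.incident_count_threshold


monitor = PersistentDropMonitor(drop_count_threshold=10,
                                incident_count_threshold=2,
                                window=3)
samples = [0, 50, 55, 120, 121, 122, 123]
results = [monitor.observe("Ethernet0", value) for value in samples]
print(results)
```

With these sample values, only the fourth poll is flagged: it is the second incident within the three-poll window. State is keyed by port name, in line with the review comment that only port-level drop counters are monitored. The sketch ignores counter resets and wraparound.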

Successfully merging this pull request may close these issues.

Internal drop counter monitoring
4 participants