Skip to content

Commit

Permalink
Optimize counter polling interval by making it more accurate (#1457) (#…
Browse files Browse the repository at this point in the history
…1534)

What I did

Optimize the counter-polling performance in terms of polling interval accuracy

Enable bulk counter-polling to run at a smaller chunk size
There is one counter-polling thread for each counter group. All such threads can compete for the critical sections at the vendor SAI level, which means a counter-polling thread can wait for a critical section if another thread has been in it, which introduces latency for the waiting counter group.
An example is the competition between the PFC watchdog and the port counter groups.
The port counter group contains many counters and is polled in a bulk mode which takes a relatively longer time. The PFC watchdog counter group contains only a few counters but is polled at a short interval. Sometimes, PFC watchdog counters need to wait before polling, which makes the polling interval inaccurate and prevents the PFC storm from being detected in time.
To resolve this issue, we can reduce the chunk size of the port counter group. The port counter group polls the counters of all ports in a single bulk operation by default. By using a smaller chunk size, it polls the counters in several bulk operations with each polling counter of a subset (whose size <= chunk size) of all ports.
By doing so, the port counter group stays in the critical section for a shorter time and the PFC watchdog is more likely to be scheduled to poll counters and detect the PFC storm in time.

Collect the time stamp immediately after vendor SAI API returns.
Currently, many counter groups require a Lua plugin to execute based on polling interval, to calculate rates, detect certain events, etc.
Eg. For PFC watchdog counter group to PFC storm. In this case, the polling interval is calculated based on the difference of time stamps between the current and last poll to avoid deviation due to scheduling latency. However, the timestamp is collected in the Lua plugin which is several steps after the SAI API returns and is executed in a different context (redis-server). Both introduce even larger deviations. To overcome this, we collect the timestamp immediately after the SAI API returns.
  • Loading branch information
stephenxs authored Feb 26, 2025
1 parent d884ff9 commit 3df03e1
Show file tree
Hide file tree
Showing 5 changed files with 664 additions and 50 deletions.
2 changes: 2 additions & 0 deletions lib/RedisRemoteSaiInterface.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -568,6 +568,8 @@ sai_status_t RedisRemoteSaiInterface::notifyCounterGroupOperations(
std::string key((const char*)flexCounterGroupParam->counter_group_name.list, flexCounterGroupParam->counter_group_name.count);

emplaceStrings(POLL_INTERVAL_FIELD, flexCounterGroupParam->poll_interval, entries);
emplaceStrings(BULK_CHUNK_SIZE_FIELD, flexCounterGroupParam->bulk_chunk_size, entries);
emplaceStrings(BULK_CHUNK_SIZE_PER_PREFIX_FIELD, flexCounterGroupParam->bulk_chunk_size_per_prefix, entries);
emplaceStrings(STATS_MODE_FIELD, flexCounterGroupParam->stats_mode, entries);
emplaceStrings(flexCounterGroupParam->plugin_name, flexCounterGroupParam->plugins, entries);
emplaceStrings(FLEX_COUNTER_STATUS_FIELD, flexCounterGroupParam->operation, entries);
Expand Down
Loading

0 comments on commit 3df03e1

Please sign in to comment.