Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT(server, client): Add rolling average for packet statistics #6683

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Hartmnt
Copy link
Member

@Hartmnt Hartmnt commented Jan 6, 2025

  • Make the rolling time window configurable on the server side
  • Proper commit messages
  • More refactoring where possible
  • Translation commit

image

Closes #5872

@Hartmnt Hartmnt added client server feature-request This issue or PR deals with a new feature labels Jan 6, 2025
src/crypto/CryptState.h Outdated Show resolved Hide resolved
@Hartmnt Hartmnt force-pushed the feat_avg_5min branch 17 times, most recently from d6d30ad to fca9e6f Compare January 9, 2025 23:21
@Hartmnt Hartmnt force-pushed the feat_avg_5min branch 8 times, most recently from 9e1859c to 1ffbd13 Compare January 10, 2025 17:13
@Hartmnt Hartmnt force-pushed the feat_avg_5min branch 5 times, most recently from 6f4b997 to 38c9e27 Compare January 11, 2025 00:07
An AccessibleQGroupBox will try to compose an accessible description
based on the layout of labels within.
If a grid-layout is used, we read every row+column combination and
the corresponding grid cell content.

Previously, the AccessibleQGroup box would read row+column even if
there was not label in the actual cell, leading to unnecessary
text. (e.g. when using a sub headline)

This commit adds a check to skip a row+colum combination when
there is no label in the cell.
@Hartmnt Hartmnt force-pushed the feat_avg_5min branch 2 times, most recently from 545982a to 79b222f Compare January 11, 2025 12:15
Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment: I believe the term "moving average" is more common than "rolling statistics", so maybe we should use that instead?

auxiliary_files/mumble-server.ini Outdated Show resolved Hide resolved
src/Mumble.proto Outdated Show resolved Hide resolved
src/Mumble.proto Outdated
Comment on lines 558 to 562
optional uint32 rolling_time = 20;
// Rolling packet statistics for packets received from the client.
optional Stats rolling_from_client = 21;
// Rolling packet statistics for packets sent by the server.
optional Stats rolling_from_server = 22;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether creating a dedicated sub-message that encapsulates all rolling stats related fields might make things a bit clearer (as we wouold no longer have to use the rolling prefix for all fields 🤔

src/crypto/CryptState.h Outdated Show resolved Hide resolved
src/crypto/CryptState.cpp Outdated Show resolved Hide resolved
src/crypto/CryptState.h Show resolved Hide resolved
src/crypto/CryptState.h Outdated Show resolved Hide resolved
src/murmur/Server.cpp Outdated Show resolved Hide resolved
src/murmur/Server.h Outdated Show resolved Hide resolved
src/crypto/CryptState.h Outdated Show resolved Hide resolved
@Krzmbrzl
Copy link
Member

I believe the term "moving average" is more common than "rolling statistics"

What we are computing, isn't really an average though, is it? 🤔
Maybe we don't need a new term for this at all and this should just be how the connection statistic is determined? I mean, is there any use in seeing the total number of packets sent/received? 👀

@Hartmnt
Copy link
Member Author

Hartmnt commented Jan 13, 2025

I mean, is there any use in seeing the total number of packets sent/received? 👀

Yes, to compare the two. On it's own both wouldn't be useful (arguably if I had to choose one I'd probably take total).

I want to see if "now" is different than "usual" for some definition of timeframe

@Krzmbrzl
Copy link
Member

I don't think you need to compare to usual. All you need to know is the current fraction of lost packets, which can be approximated as the fraction of lost packets by total sent packets within the last time (if we were sending packets at an infinite frequency then that time window should be infinitesimally small. In our scenario that means that the time window should be reasonably small).

And this packet loss should ideally be zero. That's the baseline and everything >> 0 is a problem.

@Hartmnt
Copy link
Member Author

Hartmnt commented Jan 13, 2025

I don't think you need to compare to usual. All you need to know is the current fraction of lost packets, which can be approximated as the fraction of lost packets by total sent packets within the last time (if we were sending packets at an infinite frequency then that time window should be infinitesimally small. In our scenario that means that the time window should be reasonably small).

And this packet loss should ideally be zero. That's the baseline and everything >> 0 is a problem.

I can give you first hand experience on this as both the servers I am usually on as well as a few client struggle with connection quality issues:

The reason I want to be able to compare "now" and "usual" is because that can help me determine, if the client has an unstable connection anyway (baseline), or is just now experiencing a problem (burst). The same is somewhat true for the server's connection. The question would be something like "Is the server having a bad day?" if you will :D

This information becomes more useful with more users and the longer users stay on the server. In my case many hours. This wouldn't be possible with just the rolling average, unless I keep an eye on the statistics myself for multiples of the rolling window size.

Also I don't see anything bad in keeping the total except for like 3 rows of space in that table. We basically have to keep some memory for the total stats or something of equal size in server memory anyway to compute the rolling window.

@Krzmbrzl
Copy link
Member

Alright then. You may have your cake 🍰 :D

@Hartmnt
Copy link
Member Author

Hartmnt commented Jan 13, 2025

Alright then. You may have your cake 🍰 :D

If you like, I can reorder the table and put the recent stats first?

Also, what do you think about the default of 5 minutes? I picked that rather arbitrarily. Note: I would not set it too too low, because there could be some time between noticing some audio problems and checking this dialog. So I'd say no shorter than 60 seconds by default. However, I think 5 minutes is probably also reasonable for the average server admin.

@Hartmnt Hartmnt force-pushed the feat_avg_5min branch 2 times, most recently from b846b26 to aab5b55 Compare January 13, 2025 20:08
Previously, only the total packet statistics since the user
had connected were tracked. While a good start, these stats
are not really helping to understand sudden connection problems.

This commit adds additional server-side rolling packet stats
which are updated every few seconds/minutes. These new stats
are sent using additional fields in the UserStats protocol
message and rendered in the UserInformation dialog in the
client.

Closes mumble-voip#5872
This commit adds a new option to mumble_server.ini which
allows the server operator to specify the time window in
which rolling packet stats are collected for each client.
@Krzmbrzl
Copy link
Member

If you like, I can reorder the table and put the recent stats first?

Sounds good. Imo the recent stats are usually more interesting to users.

Also, what do you think about the default of 5 minutes? 

5 minutes seems fine to me. The only thing you might want to check is that if you open the stats after <5 minutes from server connect, we'd want to show the total stats as the recent ones instead of leaving them blank/invisible 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client feature-request This issue or PR deals with a new feature server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Show a rolling average in Connection Information dialog
3 participants