Skip to content

Deduplicate group messages across connections #61810

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

datwelk
Copy link

@datwelk datwelk commented May 6, 2025

Deduplicate SignalR group messages across connections

  • You've read the Contributor Guide and Code of Conduct.
  • You've included unit or integration tests for your change, where applicable.
  • You've included inline docs for your change, where applicable.
  • There's an open issue for the PR that you are making. If you'd like to propose a new feature or change, please open an issue to discuss the change or find an existing issue.

Description

This change modifies DefaultHubLifetimeManager and RedisHubLifetimeManager to deduplicate SignalR group messages across connections. Previously, when a connection A is part of N-groups, and a SignalR message is sent to those N-groups, connection A will receive those messages N times. This changes instead checks which connections need to be addressed for a list of groups, and uniques them.

Fixes #61809

@ghost ghost added the area-signalr Includes: SignalR clients and servers label May 6, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 6, 2025
Copy link
Contributor

Thanks for your PR, @@datwelk. Someone from the team will get assigned to your PR shortly and we'll get it reviewed.

@datwelk
Copy link
Author

datwelk commented May 6, 2025

@dotnet-policy-service agree company="LIVEOP B.V."

@datwelk datwelk marked this pull request as ready for review May 6, 2025 09:29
Comment on lines 228 to 230
List<Task>? tasks = new List<Task>();

var connections = new ConcurrentDictionary<string, HubConnectionContext>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know too little about the code in order to tell if there's an alternative approch for the fix.
But I'm bothered about these allocation. So I guess they should be cached an re-used. In a quite simple manner that could be done as something like:

public class Foo
{
    private static List<Task>? s_tasks;
    private static ConcurrentDictionary<string, HubConnectionContext>? s_connections;

    public Task SendAsync()
    {
        List<Task> tasks = Interlocked.Exchange(ref s_tasks, null) ?? [];
        ConcurrentDictionary<string, HubConnectionContext> connections = Interlocked.Exchange(ref s_connections, null) ?? [];

        try
        {
            // Use task and connections

            return Task.WhenAll(tasks);
        }
        finally
        {
            tasks.Clear();
            connections.Clear();

            Interlocked.CompareExchange(ref s_tasks, tasks, null);
            Interlocked.CompareExchange(ref s_connections, connections, null);
        }
    }
}

@datwelk datwelk requested a review from captainsafia as a code owner May 6, 2025 16:50
@datwelk datwelk force-pushed the feature/deduplicate-group-messages-across-connections branch from 29e1aa1 to c393463 Compare May 7, 2025 07:08
Comment on lines +243 to +247
if (connections == null)
{
connections = new HashSet<string>();
}
connections.Add(connection.Key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this could be written also as 1-liner.

Suggested change
if (connections == null)
{
connections = new HashSet<string>();
}
connections.Add(connection.Key);
(connections ?? []).Add(connection.Key);

For me it's readability is the same.


In L239 can you also check if the group has items?
If so, the check for connections == null can be removed altogehter, and initialize the HashSet just before the foreach-loop in L241, because if the group has items, then Add is called unconditionally.

The HashSet could then be also initialized with the proper capacity, to avoid potential resizing and copying of the internal structures of HashSet.

@@ -226,7 +226,7 @@ public override Task SendGroupsAsync(IReadOnlyList<string> groupNames, string me
{
// Each task represents the list of tasks for each of the writes within a group
List<Task>? tasks = null;
SerializedHubMessage? message = null;
HashSet<string>? connections = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how much we want to put into perf / allocations a few thoughts:

  • cache the HashSet? See the previous comment above for an simple approach -- but this is also no silver bullet, as it has (in the long term) an Gen2 -> Gen0 reference
  • HashSet is O(1), but still a bit costly due to hash computation, bucket lookup, etc. So for a small amount of connections a simple array with a linear scan (O(n)) could be faster -- the logic could be encapsulated in a new type "FastHashSet" which starts with the array-approach and switches to the HashSet when more items are present

Well, if that's a goal this can be done in a separate PR for sure.

Comment on lines +256 to +259
if (tasks == null)
{
tasks = new List<Task>();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HashSet has a Count property, so initialize the tasks list to proper size before the loop, thus avoiding this if.

{
foreach (var connection in connectionStore)
{
if (connections == null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can similar optimizations as above be applied here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to the ToList() call below. I wonder if we shouldn't be using a pooled Span instead.

Copy link
Member

@halter73 halter73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain more details about the design of your app where you're broadcasting a message to multiple groups that a single client might be subscribed to? Might it be better to have these clients subscribe to a separate group meant solely for the previously multi-group broadcasts? Even after this change, that would be more efficient.

This does seem like a nicer behavior than before. I can't think of a reason you'd want a client to receive duplicate messages just because they're in multiple groups the original message was sent to, but I'm not sure it's worth the extra complexity and performance hit if no one has complained since yesterday. However, I'm fine taking it if the perf impact is small.

@BrennanConroy @sebastienros How hard would it be to measure the perf impact of this change?

{
tasks = new List<Task>();
}
tasks.Add(SendConnectionAsync(connectionId, methodName, args, cancellationToken));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call to SendConnectionAsync reserializes the method invocation payload for each connection.

// We're sending to a single connection
// Write message directly to connection without caching it in memory
var message = CreateInvocationMessage(methodName, args);

Prior to this change, SendToGroupConnections would call CreateSerializedInvocationMessage once, send the same serialized invocation payload to each client. I suspect that will have a far bigger performance impact than the HashSet, although I'd also like to pool that if we move forward with this change.

@@ -238,7 +238,26 @@ public override Task SendGroupsAsync(IReadOnlyList<string> groupNames, string me
var group = _groups[groupName];
if (group != null)
{
DefaultHubLifetimeManager<THub>.SendToGroupConnections(methodName, args, group, null, null, ref tasks, ref message, cancellationToken);
foreach (var connection in group)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
foreach (var connection in group)
foreach (var (connectionId, _) in group)

{
foreach (var connection in connectionStore)
{
if (connections == null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to the ToList() call below. I wonder if we shouldn't be using a pooled Span instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-signalr Includes: SignalR clients and servers community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deduplicate SignalR group messages across connections
3 participants