Ack Frequency Draft-11 #613

MarcoPolo · 2025-08-31T05:33:36Z

This change updates the existing ack frequency code to be up to date with draft-11 of the spec.

The commits are meaningful, and I recommend reviewing this PR commit by commit.

The last commit contains some next steps/questions as TODOs. I wanted to sync up on that first before continuing. We may not want to include this commit in the final merge.

A couple other questions:

Do we want to limit the max reordering threshold we accept?

We have a fixed size of ack ranges we’ll track, and very large reordering thresholds can introduce many gaps. The issue with many gaps is that we could drop a range before acking it. However, I don’t think that’s actually an issue as we have the QUICLY_NUM_ACK_BLOCKS_TO_INDUCE_ACKACK logic that should report our acks when we have too many gaps.

Is there anything else we should do when handling an immediate ack besides setting send_ack_at?

I’m not sure there’s a security issue here since a peer could get the same high rate of ack behavior by setting the Ack-Eliciting threshold to 0 or by introducing gaps in packet numbers.

What should we advertise for QUICLY_TRANSPORT_PARAMETER_ID_MIN_ACK_DELAY?

It must be at least our timer’s resolution (1ms?). But should it be higher? Do we need to be mindful of server load in what we decide here?

The first two commits fix some warnings. Happy to pull them out into a separate PR, just let me know.

kazuho

Thank you for the PR. Looks great!

lib/quicly.c

kazuho · 2025-09-08T05:05:39Z

lib/quicly.c

+
+                    /* Question: Do we want to move scheduling/sending of ack frequency frames out of this function? */
+                    /* TODO: register an ack callback for the ack frequency frame.
+                     * We don't have a way to handle the ack of the ack_frequency frame */


Yeah, considering that ACK_FREQUENCY is an optimization and that it is used only when the connection is in a stable state, maybe the current logic of sending an updated value every 4 PTO is fine?

It might be possible to detect the loss of an ACK_FREQUENCY frame and reset conn->egress.ack_frequency.update_at, but the question is if it is worth doing.

Agreed it's probably not worth doing just to reset conn->egress.ack_frequency.update_at.

I'm more concerned about the logic around the Probe Timeout Period https://datatracker.ietf.org/doc/html/draft-ietf-quic-ack-frequency-11#section-7. Specifically this part:

Until the packet carrying the ACK_FREQUENCY frame is acknowledged, the endpoint MUST use the greater of the current max_ack_delay and the value that is in flight when computing the PTO period. Doing so avoids spurious PTOs that can be caused by an update that decreases the peer's max_ack_delay.

The calculated max_ack_delay is essentially the max of the current max_ack_delay and all in flight max_ack_delays. If we could handle the ack of this frame we would be able to update the current state and reduce the list of in flight states.

For now it's not necessary as we don't change this value.

lib/quicly.c

kazuho · 2025-09-08T06:12:24Z

lib/quicly.c


+    // TODO: use received frame.max_ack_delay. We currently use a constant (25 ms) and
+    // ignore the value set our our transport parameter (see max_ack_delay field comment).
+    // Is there a reason for this?


It is because I am lazy; we do not care much about performance acting as a receiver compared to being a sender.

That said, it is better to have both sides implemented efficiently, so improvements here would be appreciated, either in this PR or elsewhere.

Happy to tackle this in a separate PR

lib/quicly.c

lib/ranges.c

kazuho

I think this will do?

t/test.c

Lets the caller find the next missing packet number in quicly ranges

MarcoPolo · 2025-09-16T03:00:23Z

I think this will do?

I think so! I just pushed two changes (dropping the "fix openssl warning" commit and adding the "always maintain smallest_unreported_missing"). Easier to view with git range-diff as well.

git range-diff master 205f7be6b192 288cc27c2ae6

MarcoPolo · 2025-09-16T03:04:22Z

and I removed the questions from the TODOs commit, but feel free to drop that commit as well.

kazuho

Thank you for all the efforts (including answering my stupid questions)! I think this is getting really good.

I could be missing something, but how do we follow the following statement in Section 6.2: When an ack-eliciting packet is received with a packet number less than Largest Acked, this still triggers an immediate acknowledgement in an effort to avoid the packet being spuriously declared lost?

I wonder if we need to track largest_acked. And assuming that is the case we might not need to retain smallest_unreported_missing as a state, because we can calculate smallest_unreported_missing using other states?

lib/quicly.c

MarcoPolo · 2025-09-16T21:03:07Z

I could be missing something, but how do we follow the following statement in Section 6.2: When an ack-eliciting packet is received with a packet number less than Largest Acked, this still triggers an immediate acknowledgement in an effort to avoid the packet being spuriously declared lost?

I didn’t follow that text. I instead followed the text from https://github.com/quicwg/ack-frequency/pull/322/files. But with a semantically equivalent change. To inline the relevant parts:

An endpoint that receives an ACK_FREQUENCY frame with a non-zero Reordering Threshold value SHOULD send an immediate ACK whenever it receives an ack-eliciting, out-of order packet whose packet number is outside the reordering window of the peer, i.e. when

the difference between the smallest Unreported Missing packet and the Largest Unacked packet is greater than or equal to the Reordering Threshold value; or

the received packet number is less than or equal to Largest Acked - Reordering Threshold.

And to quote the relevant parts of the code:

uint64_t largest_pn_outside_reorder_window = largest_unacked - (uint64_t)reordering_threshold;
// ...
received_pn <= largest_pn_outside_reorder_window

Also note our definition of largest_unacked:

/**
 * largest unacked ack-eliciting packet number.
 * After sending an ack this remains the same and represents the last
 * largest unacked ack-eliciting packet number.
 */
uint64_t largest_unacked;

Instead of comparing Largest Acked - Reordering Threshold, I’m comparing Largest Unacked - Reordering Threshold. But this is equivalent. The rough proof is:

largest_acked <= largest_unacked (by definition)
if received_pn <= largest_acked-reordering_threshold, then received_pn <= largest_unacked-reordering_threshold.

Therefore we will always send an immediate ack if the received pn is outside the reordering window relative to largest ack.

There is still the question if comparing against largest_unacked introduces an immediate ack that wouldn’t have happened if we compared against largest_acked. This would only happen if received_pn > largest_acked-reordering_threshold and received_pn <= largest_unacked-reordering_threshold. However, note that if the latter case is true, then it implies that the received_pn was missing when we received largest_unacked, which means that we would have sent an immediate ack (as largest_unacked - received_pn >= reordering_threshold) due to the first condition in the quoted spec (no extra immediate acks).

Does that make sense?

I wonder if we need to track largest_acked. And assuming that is the case we might not need to retain smallest_unreported_missing as a state, because we can calculate smallest_unreported_missing using other states?

We could do instead track largest_acked instead of smallest_unreported_missing, but I think it would lead to calling quicly_ranges_next_missing (and thus traversing the ranges) more often, as you’d need to call that method anytime you want to find the smallest_unreported_missing packet number.

The reason I went with the current design is that we can avoid calling quicly_ranges_next_missing in the following cases:

Receiving packets in order.
The received packet is not the smallest_unreported_missing and the smallest_unreported_missing is still within the reordering window.

kazuho · 2025-09-17T05:48:24Z

@MarcoPolo Thank you always for sharing the insights behind the well-thought design.

I instead followed the text from https://github.com/quicwg/ack-frequency/pull/322/files.

I see, makes perfect sense.

Instead of comparing Largest Acked - Reordering Threshold, I’m comparing Largest Unacked - Reordering Threshold. But this is equivalent.

I think this is mostly true. When the packet(s) with largest PNs being received are non-ack-eliciting, I think we might not send ACKs when the spec requires us to. But senders would not expect ACKs triggered by non-ack-eliciting packets to drive loss recovery, so maybe that's fine?

I know you have added tests to verify the behavior, I will look into them closely and add some to convince myself fully.

We could do instead track largest_acked instead of smallest_unreported_missing, but I think it would lead to calling quicly_ranges_next_missing (and thus traversing the ranges) more often, as you’d need to call that method anytime you want to find the smallest_unreported_missing packet number.

I agree that this is the concern.

…THRESHOLD

max_ack_delay = min_rtt * ctx->ack_frequency / 1024;

Previously we just used the largest seen packet number for calculating the reordering window. Instead, we should only use the largest seen ack-eliciting packet number.

This lets us avoid having to calculate a new value when receiving a ack_frequency_frame.

This happens when the largest seen packet number is non-ack-eliciting, and we have unacked ack-eliciting packet numbers.

MarcoPolo · 2025-09-17T18:14:21Z

@MarcoPolo Thank you always for sharing the insights behind the well-thought design.

My pleasure! Thank you for the patience here and sharing your context. I’ve learned a lot.

Instead of comparing Largest Acked - Reordering Threshold, I’m comparing Largest Unacked - Reordering Threshold. But this is equivalent.

I think this is mostly true. When the packet(s) with largest PNs being received are non-ack-eliciting, I think we might not send ACKs when the spec requires us to. But senders would not expect ACKs triggered by non-ack-eliciting packets to drive loss recovery, so maybe that’s fine?

Right. I think my logic is sound, but my code doesn’t uphold this invariant:

largest_acked <= largest_unacked (by definition)

There are cases where the largest_acked is larger than the largest unacked ack-eliciting packet number.

For example, assume:

we received ack-eliciting packets [0,1,3]
our reordering threshold is 2
ack eliciting threshold is 10.

If we then receive non-ack-eliciting PN 4 and QUICLY_DELAYED_ACK_TIMEOUT time passes, then we’ll send an ack with the largest acked being 4 and 2 can be declared lost.

This can be fixed by storing instead the max of largest acked or largest unacked ack eliciting packet number. I did this in 7608cd7, and added a test case for this.

I know you have added tests to verify the behavior, I will look into them closely and add some to convince myself fully.

I’ve extended the tests a bit to add the ability to represent non-ack-eliciting packets and the time interval between packets.

We could do instead track largest_acked instead of smallest_unreported_missing, but I think it would lead to calling quicly_ranges_next_missing (and thus traversing the ranges) more often, as you’d need to call that method anytime you want to find the smallest_unreported_missing packet number.

I agree that this is the concern.

Thinking about this more with the context of non-ack-eliciting packets, there is another tradeoff. If we store largest_acked and largest_unacked, then we wouldn’t need to calculate smallest_unreported_missing when receiving a non-ack-eliciting packet and thus could skip calling quicly_ranges_next_missing. If the application tends to handle many non-ack-eliciting packets (possibly the case with quicly on the server?), then it may be more efficient to store the largest_acked+largest_unacked. What do you think?

As a reminder, in the current design we need to find the next smallest_unreported_missing if the received PN is the smallest_unreported_missing regardless if the PN is ack-eliciting or not.

… loss detection, and it is uncertain what the optimal value is

…eordering_threshold is zero

MarcoPolo

I've reviewed the recent changes. They look good (sorry for the formatting and missing the conventions).

MarcoPolo · 2025-10-20T18:15:13Z

lib/quicly.c

+                     * passed to quicly_loss_detect_loss too. */
+                    uint64_t max_ack_delay = conn->super.remote.transport_params.max_ack_delay * 1000;
+                    uint64_t reordering_threshold =
+                        conn->egress.loss.thresholds.use_packet_based ? QUICLY_LOSS_DEFAULT_PACKET_THRESHOLD : 0;


Just a callout - prior to this PR, setting use_packet_based = 0 would still trigger an ack on reordering. This changes that behavior, but I think it makes sense because semantically disabling packet based behavior should also mean ignoring reordering. Perhaps the reason we didn’t do this before is because RFC 9000 didn’t allow it (?)

Thank you for taking a close look at the changes.

Perhaps the reason we didn’t do this before is because RFC 9000 didn’t allow it (?)

I checked just now, and this was exactly the case! This call site was introduced by #319, and as the quicly_encode_ack_frequency_frame function in that PR indicates, the ignore_order flag was still missing in the specification. The encoding was updated in #491, but we failed to adjust this call site.

sorry for the formatting and missing the conventions

Not at all! It's a matter of taste of the maintainers.

kazuho reviewed Sep 9, 2025

View reviewed changes

MarcoPolo force-pushed the ack_frequency branch from 4512f0f to d54d3dc Compare September 11, 2025 03:45

fix type warning

642cb33

MarcoPolo force-pushed the ack_frequency branch from d54d3dc to 205f7be Compare September 13, 2025 04:54

kazuho reviewed Sep 16, 2025

View reviewed changes

t/test.c Outdated Show resolved Hide resolved

t/test.c Show resolved Hide resolved

MarcoPolo added 5 commits September 15, 2025 19:43

update Ack Frequency transport parameter

7d85b20

handle updated Ack frequency frame

0284dd6

remove min_ack_delay limit check from older Ack Frequency draft

f1574ae

handle Immediate Ack frame

634c4e4

add quicly_ranges_next_missing

52418cf

Lets the caller find the next missing packet number in quicly ranges

MarcoPolo force-pushed the ack_frequency branch from 205f7be to 288cc27 Compare September 16, 2025 02:55

MarcoPolo force-pushed the ack_frequency branch from 288cc27 to 6775e4e Compare September 16, 2025 03:03

kazuho reviewed Sep 16, 2025

View reviewed changes

lib/quicly.c Outdated Show resolved Hide resolved

MarcoPolo force-pushed the ack_frequency branch from 6775e4e to c9a1951 Compare September 16, 2025 03:30

MarcoPolo added 12 commits September 17, 2025 09:45

ack on changes outside of reordering window

e0b87bb

add tests for ack frequency reordering and threshold

8e62a41

set ack frequency reordering threshold to QUICLY_LOSS_DEFAULT_PACKET_…

eafc90f

…THRESHOLD

use immediate acks for prompt acks

dbc23bc

only bundle ack_frequency if other ack is ack eliciting

5a9c07e

check space before bundling ack_freq frame

3e3c81f

ack on transition from !CE -> CE

aca6d8f

simple ack frequency handling test

b32dee0

dump ack_frequency sent stats

57650d1

accept a 0 ack-eliciting threshold

b95853e

derive max_ack_delay for ack_frequency frame

deaa4e5

max_ack_delay = min_rtt * ctx->ack_frequency / 1024;

only ack-eliciting packets can count for largest_unacked

c133a0d

Previously we just used the largest seen packet number for calculating the reordering window. Instead, we should only use the largest seen ack-eliciting packet number.

MarcoPolo added 3 commits September 17, 2025 09:46

always maintain smallest_unreported_missing

356efdb

This lets us avoid having to calculate a new value when receiving a ack_frequency_frame.

TODOs

dcd1322

handle case where a non-ack-eliciting packet becomes the largest acked.

7608cd7

This happens when the largest seen packet number is non-ack-eliciting, and we have unacked ack-eliciting packet numbers.

MarcoPolo force-pushed the ack_frequency branch from c9a1951 to 7608cd7 Compare September 17, 2025 18:09

kazuho added 8 commits October 16, 2025 15:34

Merge branch 'master' into ack_frequency

8a084ee

use decimal, in the ascending order

95a42d3

define maximum size like other frames

953761c

ACK_FREQNENCY.max_delay should better be aligned to the value used in…

ffa1fa0

… loss detection, and it is uncertain what the optimal value is

when PN-based detection is disabled, correct value of ACK_FREQUENCY.r…

74115e7

…eordering_threshold is zero

maybe these empty lines do not indicate value groups

d2e2264

clang-format

6a884b3

maybe it's easier to read this way

44a94c4

MarcoPolo commented Oct 20, 2025

View reviewed changes

Ack Frequency Draft-11 #613

Are you sure you want to change the base?

Ack Frequency Draft-11 #613

Uh oh!

Conversation

MarcoPolo commented Aug 31, 2025

Uh oh!

kazuho left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kazuho Sep 8, 2025

Choose a reason for hiding this comment

Uh oh!

MarcoPolo Sep 10, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

kazuho Sep 8, 2025

Choose a reason for hiding this comment

Uh oh!

MarcoPolo Sep 11, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

kazuho left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

MarcoPolo commented Sep 16, 2025

Uh oh!

MarcoPolo commented Sep 16, 2025

Uh oh!

kazuho left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

MarcoPolo commented Sep 16, 2025

Uh oh!

kazuho commented Sep 17, 2025

Uh oh!

MarcoPolo commented Sep 17, 2025

Uh oh!

MarcoPolo left a comment

Choose a reason for hiding this comment

Uh oh!

MarcoPolo Oct 20, 2025

Choose a reason for hiding this comment

Uh oh!

kazuho Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants