fix(s2n-quic-transport): Fixes s2n-quic TLS state incongruity when s2n-tls returns pending frequently#2574
Merged
maddeleine merged 18 commits intomainfrom Apr 23, 2025
Merged
fix(s2n-quic-transport): Fixes s2n-quic TLS state incongruity when s2n-tls returns pending frequently#2574maddeleine merged 18 commits intomainfrom
maddeleine merged 18 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This pull request addresses issue #2484 by ensuring that the handshake space is correctly discarded when a TLS handshake completes during unexpected wakeup events.
- Added logic in on_wakeup() to discard the handshake space when confirmed
- Inline documentation referencing RFC 9001 for context
Comments suppressed due to low confidence (1)
quic/s2n-quic-transport/src/connection/connection_impl.rs:1142
- [nitpick] Consider adding a unit test that simulates multiple wakeups with pending TLS responses to verify that the handshake space is discarded as expected.
if self.space_manager.handshake().is_some() && self.space_manager.is_handshake_confirmed() {
camshaft
reviewed
Apr 7, 2025
| .expect("Server should exist") | ||
| .new_server_session(transport_parameters); | ||
| SlowSession { | ||
| defer: 10, |
Contributor
There was a problem hiding this comment.
Why are we picking 10? I think even 1 is probably good enough for most cases. Maybe we should take this as a parameter?
Contributor
Author
There was a problem hiding this comment.
How do you feel about 3. I feel like one is too small. Not sure what the purpose of making this configurable would be.
This was referenced Apr 7, 2025
camshaft
approved these changes
Apr 22, 2025
boquan-fang
pushed a commit
to boquan-fang/s2n-quic
that referenced
this pull request
Apr 23, 2025
…n-tls returns pending frequently (aws#2574)
dougch
pushed a commit
that referenced
this pull request
May 19, 2025
…n-tls returns pending frequently (#2574)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Release Summary:
Resolved issues:
resolves #2484
Description of changes:
Basically #2484 hit a debug assert failure where the TLS implementation had confirmed the handshake, but the s2n-quic space manager had not yet discarded its handshake space. In a normal handshake, the function handle_handshake_packet() calls update_crypto_state(), which eventually completes the handshake. Then the function on_processed packet() discards the handshake space.
However, in a weird handshake with a bunch of wakeups and Pendings returned from the TLS implementation, the handshake completes during the call to update_crypto_state() from the on_wakeup() function. And in that case the handshake space doesn't get discarded.
To fix this, I added code to discard the handshake space in the on_wakeup() function.
Call-outs:
An alternative approach that works is to move the code to discard the handshake space from on_processed_packet() to update_crypto_state(). I don't have strong intuition on which approach is better.
Testing:
Added a new integ test that reproduces the issue. Therefore, we can assert that this fix does solve the issue.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.