No byteswarning #1290

Kriechi · 2024-11-23T12:15:29Z

Supersedes and closes #1286, #1236

@BYK please review. I combined most of your previous commits from #1286, so the only code change should be in my most recent commit on this PR.

Fixes python-hyper#1236. This patch makes all header operations work on `bytes` and converts all header names and values to bytes before operating on them. With a follow-up patch to `hpack` it should also improve efficiency, because `hpack` currently casts everything to `str` first and then converts it back to bytes: https://github.com/python-hyper/hpack/blob/02afcab28ca56eb5259904fd414baa89e9f50266/src/hpack/hpack.py#L150-L151

BYK

Don't fully agree with the .startswith() changes and removal of list comprehension in favor of incremental list building. That said I'm not gonna block them as fixing the bytes warning is more important.

If you agree with my performance concerns, you can revert those changes back to my version before merging.

Finally, thank you so much for pushing this through!

BYK · 2024-11-23T20:49:13Z

src/h2/utilities.py


-        if n and n[0] != SIGIL:
+        if not n.startswith(b':'):


Why this change (and the other `.startswith()` change below) from my `n and n[0] == SIGIL` version? I'm fairly sure these are significantly slower, as they involve a method call and are not optimized for a 1-byte lookup.
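A minimal sketch for measuring this claim locally rather than guessing. It assumes `SIGIL` is the integer value of `b":"` (indexing a `bytes` object yields an `int`); the function names `check_index`, `check_startswith`, and `bench` are hypothetical and not part of `h2`. Results will vary by interpreter version and machine, so no timings are quoted here.

```python
import timeit

# Assumption: SIGIL is the int value of b":" because bytes indexing returns an int.
SIGIL = ord(b":")


def check_index(n: bytes) -> bool:
    # Index-based variant: True for a non-empty name that is not a pseudo-header.
    return bool(n and n[0] != SIGIL)


def check_startswith(n: bytes) -> bool:
    # Method-call variant from the diff above.
    return not n.startswith(b":")


def bench(func, names, number=20_000):
    # Total seconds spent applying func to every name, repeated `number` times.
    return timeit.timeit(lambda: [func(n) for n in names], number=number)


if __name__ == "__main__":
    sample = [b":method", b":path", b"content-type", b"accept"]
    print("index     :", bench(check_index, sample))
    print("startswith:", bench(check_startswith, sample))
```

Running the script on each supported Python version gives a rough per-call comparison; a difference here does not necessarily translate into a measurable difference for whole-connection workloads.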

BYK · 2024-11-23T20:50:05Z

src/h2/utilities.py

+    encoded_headers = []
+    for header in headers:
+        h = (_to_bytes(header[0]), _to_bytes(header[1]))
+        if isinstance(header, HeaderTuple):
+            encoded_headers.append(header.__class__(h[0], h[1]))
+        else:
+            encoded_headers.append(h)
+    return encoded_headers


This is, again, considerably less efficient than using a list comprehension directly.
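For comparison, here is a sketch of the loop from the diff next to one possible list-comprehension form. This is not necessarily the version from #1286. `HeaderTuple` and `NeverIndexedHeaderTuple` are simplified stand-ins for the `hpack` classes, and `_to_bytes` is a hypothetical helper written for this example; the real implementations may differ.

```python
class HeaderTuple(tuple):
    # Simplified stand-in for hpack's HeaderTuple (assumption, not the real class).
    def __new__(cls, *args):
        return tuple.__new__(cls, args)


class NeverIndexedHeaderTuple(HeaderTuple):
    # Stand-in subclass, used to show that the subclass type is preserved.
    pass


def _to_bytes(value):
    # Hypothetical helper: pass bytes through, encode everything else as UTF-8.
    if isinstance(value, bytes):
        return value
    return str(value).encode("utf-8")


def encode_loop(headers):
    # Incremental list building, as in the diff above.
    encoded_headers = []
    for header in headers:
        h = (_to_bytes(header[0]), _to_bytes(header[1]))
        if isinstance(header, HeaderTuple):
            encoded_headers.append(header.__class__(h[0], h[1]))
        else:
            encoded_headers.append(h)
    return encoded_headers


def encode_comprehension(headers):
    # Same result built with a single list comprehension.
    return [
        header.__class__(_to_bytes(header[0]), _to_bytes(header[1]))
        if isinstance(header, HeaderTuple)
        else (_to_bytes(header[0]), _to_bytes(header[1]))
        for header in headers
    ]
```

Both functions return equal lists and keep the `HeaderTuple` subclass of each entry, so the choice between them is about speed and readability, not behavior.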

BYK · 2024-11-23T20:51:33Z

src/h2/utilities.py

-    in the block, and so that it can stop looking when it finds the first
-    header field whose name does not begin with a colon.
+    are well formed and encoded as bytes: that is, that the HTTP/2 special
+    headers are first in the block, and so that it can stop looking when it


Suggested change

-    headers are first in the block, and so that it can stop looking when it
+    headers are first in the list, and so that it can stop looking when it

BYK · 2024-11-23T20:54:16Z

src/h2/utilities.py

@@ -345,30 +323,26 @@ def _reject_pseudo_header_fields(headers, hdr_validation_flags):
    method = None

    for header in headers:
-        if _custom_startswith(header[0], b':', u':'):
+        if header[0][0] == SIGIL:


If we want to go down the .startswith() path, we should align this line with that too. This also probably means the top-level SIGIL and INFORMATIONAL_START become obsolete.
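Beyond consistency, the two styles differ on an empty header name. The sketch below illustrates this, assuming `SIGIL` is `ord(b":")`; the helper names are hypothetical and exist only for this example.

```python
# Assumption: SIGIL is the int value of b":" since bytes indexing returns an int.
SIGIL = ord(b":")


def is_pseudo_by_index(name: bytes) -> bool:
    # Index-based check: raises IndexError when name is empty.
    return name[0] == SIGIL


def is_pseudo_by_startswith(name: bytes) -> bool:
    # Method-based check: returns False for an empty name instead of raising.
    return name.startswith(b":")


if __name__ == "__main__":
    for candidate in (b":path", b"host", b""):
        try:
            by_index = is_pseudo_by_index(candidate)
        except IndexError:
            by_index = "IndexError"
        print(candidate, by_index, is_pseudo_by_startswith(candidate))
```

If empty names can reach this code path, the index form needs a guard such as `name and name[0] == SIGIL`, while the `.startswith()` form does not.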

BYK · 2024-12-04T13:37:14Z

Shall we merge this? I'm happy to follow up with the issues I raised myself :)

Kriechi · 2024-12-18T20:45:12Z

Yes, let's merge this so we can iterate on smaller changes going forward.
Thanks for your contribution!

Kriechi · 2024-12-18T20:46:48Z

For the performance discussion, I would really like to see a proper benchmark comparing v4.1.0 as released against the latest main branch, ideally across all supported Python versions to rule out any weird optimization edge cases.

BYK and others added 3 commits November 23, 2024 11:44

add more upstream tests

3924557

refactor headers bytes encoding

5c6a7f5

BYK approved these changes Nov 23, 2024


Kriechi merged commit 5f129a0 into python-hyper:master Dec 18, 2024
8 checks passed

Kriechi deleted the no-byteswarning branch December 18, 2024 20:45
