Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

THRIFT-5828: reduce over-allocation in Go binary protocol #3057

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

leitzler
Copy link

@leitzler leitzler commented Oct 25, 2024

Due to a subtle caveat in buffer.Bytes the Go binary protocol method ReadBinary() always over-allocated.

We now allocate the asked size directly up to 10 MiB, and then maintain the malformed message protection by using a 10 MiB buffer for larger asks.

The existing benchmarks show that we reduce allocations for small payloads, and keep the edge case on a reasonable level at about 10MiB.

Ran the existing benchmarks in lib/go/thrift with and without this change:

% go test -run X -bench . -benchmem > <file>

% benchcmp -changed old.txt new.txt
[...]
benchmark                                   old allocs     new allocs     delta
BenchmarkSafeReadBytes/normal-20            5              2              -60.00%
BenchmarkSafeReadBytes/max-askedSize-20     8              3              -62.50%
BenchmarkBinaryBinary_0-20                  4              1              -75.00%
BenchmarkBinaryBinary_1-20                  4              1              -75.00%
BenchmarkBinaryBinary_2-20                  5              2              -60.00%
BenchmarkCompactBinary0-20                  4              1              -75.00%
BenchmarkCompactBinary1-20                  4              1              -75.00%
BenchmarkCompactBinary2-20                  5              2              -60.00%
BenchmarkSerializer/baseline-20             8              5              -37.50%
BenchmarkSerializer/plain-20                20             17             -15.00%
BenchmarkSerializer/pool-20                 8              5              -37.50%

benchmark                                   old bytes     new bytes     delta
BenchmarkSafeReadBytes/normal-20            1656          160           -90.34%
BenchmarkSafeReadBytes/max-askedSize-20     15992         10489910      +65494.73%
BenchmarkBinaryBinary_0-20                  1608          160           -90.05%
BenchmarkBinaryBinary_1-20                  1608          160           -90.05%
BenchmarkBinaryBinary_2-20                  1634          184           -88.74%
BenchmarkCompactBinary0-20                  1608          160           -90.05%
BenchmarkCompactBinary1-20                  1608          160           -90.05%
BenchmarkCompactBinary2-20                  1634          184           -88.74%
BenchmarkSerializer/baseline-20             1000          416           -58.40%
BenchmarkSerializer/plain-20                3640          3056          -16.04%
BenchmarkSerializer/pool-20                 1002          417           -58.38%
  • Did you create an Apache Jira ticket? (Request account here, not required for trivial changes)
  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

@Jens-G Jens-G added the golang patches related to go language label Oct 25, 2024
@fishy fishy self-requested a review October 28, 2024 15:52
buf := new(bytes.Buffer)
_, err := io.CopyN(buf, trans, int64(size))
return buf.Bytes(), err
const readLimit = 10 * 1024 * 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the stated goal here is to optimize for ordinary cases (with size < readLimit), why not just do io.CopyN with bytes.Buffer when it's the edge case? that way we don't have to allocate for an 10MiB buffer upfront if the size is malformed, and if it's really a very huge payload we just pay the price for a bit more allocations (same as today).

(I would also consider making this limit configurable, but don't feel too strongly either way here)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just io.CopyN [...]?

I have no strong opinions in either direction, that works fine for me! The reason I changed it was that io.CopyN uses more memory than this approach. Looking at 40MB ask from the issue tracker, io.CopyN allocates:

    main_test.go:36: 134217366 B/op	      21 allocs/op - ask: 41943040, data: 41943040

while this approach allocates:

    main_test.go:44: 83886131 B/op	       5 allocs/op - ask: 41943040, data: 41943040

But the downside, as you said, is that we will allocate more than available when we get a malformed message.
Let me know if I should change it to use io.CopyN instead!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so there are 3 different cases:

  1. ordinary case (reasonable size, not malformed)
  2. legit very large payload (large size, not malformed)
  3. malformed payload with large size

and the choice here is between 2 and 3 (we already agreed to optimize for 1).

between 2 and 3 I would prefer to optimize for 3, based on the consequences of the opposite. if we optimize for 2 (your current approach), then whenever the code get a malformed payload they would need to allocate 10MiB up front, that can be a big risk for code running with tight resources so the consequence can be more severe (e.g. they have more risk to crash due to insufficient memory). with the CopyN approach the consequence is just more allocations for legit cases, which is slightly slower and use more memory, but if the code is intended to handle legit very large payloads then we can already assume that it has more resource and those "wasted" memory will have a smaller consequences.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll update the PR later today, thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, PTAL

Due to a subtle caveat in buffer.Bytes the Go binary protocol method
ReadBinary() always over-allocated.

We now allocate the asked size directly up to 10 MiB, and then
use io.CopyN to maintain the malformed message protection.

The existing benchmarks show that we reduce allocations for small
payloads.

Ran the existing benchmarks in lib/go/thrift with and without this
change:
% go test -run X -bench . -benchmem > <file>

% benchcmp -changed old.txt new.txt
[...]
benchmark                                   old allocs     new allocs     delta
BenchmarkSafeReadBytes/normal-20            5              2              -60.00%
BenchmarkBinaryBinary_0-20                  4              1              -75.00%
BenchmarkBinaryBinary_1-20                  4              1              -75.00%
BenchmarkBinaryBinary_2-20                  5              2              -60.00%
BenchmarkCompactBinary0-20                  4              1              -75.00%
BenchmarkCompactBinary1-20                  4              1              -75.00%
BenchmarkCompactBinary2-20                  5              2              -60.00%
BenchmarkSerializer/baseline-20             8              5              -37.50%
BenchmarkSerializer/plain-20                20             17             -15.00%
BenchmarkSerializer/pool-20                 8              5              -37.50%

benchmark                                   old bytes     new bytes     delta
BenchmarkSafeReadBytes/normal-20            1656          160           -90.34%
BenchmarkBinaryBinary_0-20                  1608          160           -90.05%
BenchmarkBinaryBinary_1-20                  1608          160           -90.05%
BenchmarkBinaryBinary_2-20                  1634          184           -88.74%
BenchmarkCompactBinary0-20                  1608          160           -90.05%
BenchmarkCompactBinary1-20                  1608          160           -90.05%
BenchmarkCompactBinary2-20                  1634          184           -88.74%
BenchmarkSerializer/baseline-20             1000          416           -58.40%
BenchmarkSerializer/plain-20                3640          3056          -16.04%
BenchmarkSerializer/pool-20                 1002          417           -58.38%
@fishy
Copy link
Member

fishy commented Oct 28, 2024

The current state looks good to me, but I'm tagging @dcelasun, @jpkrohling, @yurishkuro who were involved in the discussion and review of the original issue of https://issues.apache.org/jira/browse/THRIFT-5322 and #2292 to take a look.

My one (minor) concern is the hardcoded 10MiB limit. For malformed size > 10MiB we are falling back to the current approach which protects against over allocation, but if we have a malformed size that's < 10MiB but close, we are still doing the allocation up front. Does anyone use thrift go in a tight environment that will be a problem? I don't know.

And if we do think that's a problem, what should we do to make it configurable? Add it as a new API to TConfiguration? or maybe make it something like "always 1/10 of MaxMessageSize configured via TConfiguration"?

@fishy
Copy link
Member

fishy commented Oct 28, 2024

Actually, reading https://nvd.nist.gov/vuln/detail/CVE-2019-11939 again, I think this implementation is sill vulnerable to a very similar attack: bad actor can construct a very small message with a size of 10MiB-1, and the server will still be tricked to do those large allocations, and spamming those malicious messages can still result in denial of service.

To mitigate against this attack, we need to make the read limit configurable. There are 3 ways to do it:

  1. Add a new config to TConfiguration for this
  2. Just reuse the current MaxMessageSize configurable (which would mean that we can mostly revert THRIFT-5322: Guard against large string/binary lengths in Go #2292 I think?)
  3. Define the read limit as something like 1/10 of configured MaxMessageSize, and that 1/10 can be either hardcoded or configurable (but if it's configurable that's mostly just 1).

@leitzler
Copy link
Author

Thanks for investigating. I agree, in the THRIFT-5828 I wrote:

Ideally we would use TConfiguration that was added after PR2292 (in PR #2296), there is already a default max message size at 100MB for example. The max message size is already checked before reading so ReadBinary could be changed back to always allocating the full size.

But to be more specific I'd prefer option 1 over the others. Reason being that the relation to MaxMessageSize looks arbitrary to me (apart from the fact that MaxMessageSize will block it first if that value is lower).

@dcelasun
Copy link
Member

+1 for a new entry in TConfiguration.

@leitzler
Copy link
Author

WDYT @jpkrohling, @yurishkuro?

@yurishkuro
Copy link

+1 to expose a config parameter for (1) max safe limit and (2) max unconditional alloc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
golang patches related to go language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants