-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C++][Parquet] Reading corrupted encrypted Parquet files can cause a segfault #43070
Labels
Component: C++
Component: Parquet
Critical Fix
Bugfixes for security vulnerabilities, crashes, or invalid data.
Type: bug
Milestone
Comments
Wow, this is horrid. Thanks for noticing this @adamreeve . |
pitrou
added
the
Critical Fix
Bugfixes for security vulnerabilities, crashes, or invalid data.
label
Jun 27, 2024
mapleFU
pushed a commit
that referenced
this issue
Jul 4, 2024
… segfault (#43071) ### Rationale for this change See #43070 ### What changes are included in this PR? Checks that the ciphertext length is at least enough to hold the length (if written), nonce and GCM tag for the GCM cipher type. Also enforces that the input ciphertext length parameter is provided (is > 0) and verifies that the ciphertext size read from the file isn't going to cause reads beyond the end of the ciphertext buffer. ### Are these changes tested? Yes I've added new unit tests for this. ### Are there any user-facing changes? No * GitHub Issue: #43070 Authored-by: Adam Reeve <[email protected]> Signed-off-by: mwish <[email protected]>
Issue resolved by pull request 43071 |
raulcd
pushed a commit
that referenced
this issue
Jul 4, 2024
… segfault (#43071) ### Rationale for this change See #43070 ### What changes are included in this PR? Checks that the ciphertext length is at least enough to hold the length (if written), nonce and GCM tag for the GCM cipher type. Also enforces that the input ciphertext length parameter is provided (is > 0) and verifies that the ciphertext size read from the file isn't going to cause reads beyond the end of the ciphertext buffer. ### Are these changes tested? Yes I've added new unit tests for this. ### Are there any user-facing changes? No * GitHub Issue: #43070 Authored-by: Adam Reeve <[email protected]> Signed-off-by: mwish <[email protected]>
zanmato1984
pushed a commit
to zanmato1984/arrow
that referenced
this issue
Jul 9, 2024
…revent segfault (apache#43071) ### Rationale for this change See apache#43070 ### What changes are included in this PR? Checks that the ciphertext length is at least enough to hold the length (if written), nonce and GCM tag for the GCM cipher type. Also enforces that the input ciphertext length parameter is provided (is > 0) and verifies that the ciphertext size read from the file isn't going to cause reads beyond the end of the ciphertext buffer. ### Are these changes tested? Yes I've added new unit tests for this. ### Are there any user-facing changes? No * GitHub Issue: apache#43070 Authored-by: Adam Reeve <[email protected]> Signed-off-by: mwish <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Component: C++
Component: Parquet
Critical Fix
Bugfixes for security vulnerabilities, crashes, or invalid data.
Type: bug
Describe the bug, including details regarding any error messages, version, and platform.
We experienced a segfault reading an encrypted Parquet file and traced this down to
EVP_DecryptUpdate
being called here with a negative value for the last parameter, which is the ciphertext length.By debugging the problem I found that the length was being read from the file as zero, which became negative after subtracting
length_buffer_length_
,kNonceLength
andkGcmTagLength
.This file was written with PyArrow but we suspect it may have become corrupted somehow. I would expect that this should result in a catchable exception rather than a segfault though.
Component(s)
C++, Parquet
The text was updated successfully, but these errors were encountered: