HADOOP-18296. Memory fragmentation in ChecksumFileSystem readVectored() #7732
base: branch-3.4
Conversation
Pull Request Overview
This PR addresses memory fragmentation issues in the vectored IO implementation within ChecksumFileSystem by introducing configurable checksum verification, enhanced logging, and improved buffer management. Key changes include:
- Adding a configurable option (file.verify-checksum) and corresponding accessor in LocalFileSystem and ChecksumFileSystem.
- Introducing a new TrackingByteBufferPool for tracking buffer usage.
- Updating vectored IO logic and related documentation, including buffer slicing and new capability definitions.
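For orientation, the probe pattern these capability changes extend can be sketched in plain JDK code. The `CapabilityProbe` class below is a hypothetical stand-in, not the Hadoop implementation; only the capability strings and the "sliced buffers imply checksum verification" linkage are taken from this PR:

```java
import java.util.Locale;

/** Minimal sketch of a stream-capability probe, assuming the names used in this PR. */
class CapabilityProbe {

  // Capability names quoted from the PR; the constants themselves are illustrative.
  static final String VECTOREDIO = "in:readvectored";
  static final String VECTOREDIO_BUFFERS_SLICED = "fs.capability.vectoredio.sliced";

  private final boolean verifyChecksum;

  CapabilityProbe(boolean verifyChecksum) {
    this.verifyChecksum = verifyChecksum;
  }

  /** Mirrors the hasVectorIOCapability() pattern: lower-case the probe, then switch. */
  boolean hasCapability(String capability) {
    switch (capability.toLowerCase(Locale.ENGLISH)) {
    case VECTOREDIO:
      return true;
    case VECTOREDIO_BUFFERS_SLICED:
      // Buffers may only be sliced when checksums are actually verified.
      return verifyChecksum;
    default:
      return false;
    }
  }
}
```

Callers would probe the capability before a vectored read to decide whether returned buffers may be slices of a larger pooled buffer.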
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
File | Description |
---|---|
TestViewFileSystemDelegation.java | Added an @Override annotation to the getVerifyChecksum() method. |
fsdatainputstream.md | Updated and expanded documentation sections covering direct buffer reads and buffer slicing. |
Sizes.java | Introduced a new constant S_0 for representing 0 bytes. |
TrackingByteBufferPool.java | Added a new buffer pool implementation to track allocations and releases. |
VectoredReadUtils.java | Added a new hasVectorIOCapability() method that supports vectored IO capability lookup. |
StreamCapabilities.java | Added a new constant VECTOREDIO_BUFFERS_SLICED and updated interface documentation. |
RawLocalFileSystem.java | Updated capability handling and logging for vectored read operations. |
LocalFileSystemConfigKeys.java | Introduced the LOCAL_FS_VERIFY_CHECKSUM config key with documentation. |
LocalFileSystem.java | Integrated checksum verification setting during initialization with logging. |
ChecksumFileSystem.java | Added a getVerifyChecksum() method, updated vectored read logic and capability handling. |
Comments suppressed due to low confidence (4)
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md:692
- [nitpick] The sentence in the Buffer Slicing section appears to have an extraneous comma and could be rephrased for clarity. Consider revising it to clearly state the conditions under which buffer slicing is applied.
those with checksums,
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/Sizes.java:35
- [nitpick] The constant name 'S_0' may be unclear at first glance. Consider renaming it to 'ZERO_BYTES' for improved readability.
public static final int S_0 = 0;
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java:1186
- Ensure that returning getVerifyChecksum() for the VECTOREDIO_BUFFERS_SLICED capability correctly reflects the intended semantics for when buffers are sliced. It may be beneficial to explicitly document this behavior or consider whether a separate flag is more appropriate.
case StreamCapabilities.VECTOREDIO_BUFFERS_SLICED:
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/VectoredReadUtils.java:488
- Consider expanding the hasVectorIOCapability() method to either support the 'fs.capability.vectoredio.sliced' capability or document that it intentionally only handles 'in:readvectored'.
switch (capability.toLowerCase(Locale.ENGLISH)) {
Force-pushed from a19a65f to 86d6bac.
💔 -1 overall
🎊 +1 overall
trying to put this together. No attempts to address, just
Wire up TrackingByteBufferPool to the vector read tests, with the test suite AbstractContractVectoredReadTest adding testBufferSlicing() to generate conditions which may trigger slicing. Only streams which declare that they may slice buffers are permitted to return buffers to the pool which aren't known about, or to close the pool with outstanding entries. So: no fix, just a public declaration of the behavior and a test to verify that no other streams are doing it.
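The leak-detection contract described in that commit can be sketched with plain JDK code. `LeakCheckingPool` below is a simplified stand-in for the PR's TrackingByteBufferPool, with invented method names; the key ideas it illustrates — identity-based tracking, rejecting unknown (e.g. sliced) buffers, and failing on close with outstanding entries — are the ones the test suite relies on:

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/**
 * Simplified sketch of a leak-detecting buffer pool. Buffers are tracked by
 * reference identity, so returning a slice or duplicate instead of the
 * original buffer is detected as an error.
 */
class LeakCheckingPool implements AutoCloseable {

  // Identity keying: ByteBuffer.equals() compares contents, which would
  // wrongly merge distinct buffers that happen to hold the same bytes.
  private final Set<ByteBuffer> outstanding =
      Collections.newSetFromMap(new IdentityHashMap<>());

  ByteBuffer getBuffer(int size) {
    ByteBuffer buffer = ByteBuffer.allocate(size);
    outstanding.add(buffer);
    return buffer;
  }

  void putBuffer(ByteBuffer buffer) {
    if (!outstanding.remove(buffer)) {
      // A sliced/duplicated buffer was returned instead of the original.
      throw new IllegalStateException("buffer was not allocated by this pool");
    }
  }

  int outstandingCount() {
    return outstanding.size();
  }

  /** Closing while buffers are still outstanding indicates a leak. */
  @Override
  public void close() {
    if (!outstanding.isEmpty()) {
      throw new IllegalStateException(outstanding.size() + " leaked buffer(s)");
    }
  }
}
```

A stream that slices buffers would trip the `putBuffer` check, which is why only filesystems declaring the sliced-buffers capability are exempted in the tests.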
Force-pushed from 1096112 to 995fcc4.
 * <p>
 * {@value}.
 */
public static final String LOCAL_FS_VERIFY_CHECKSUM = "file.verify-checksum";
rename to fs.file.checksum.verify and declare in core-default.xml to make it more visible
 * The key maps by the object id of the buffer, and refers to either a common stack trace
 * or one dynamically created for each allocation.
 */
private final Map<Key, ByteBufferAllocationStacktraceException> allocated =
I think this can be replaced with `java.util.IdentityHashMap`, which will make the `Key` class redundant:

- private final Map<Key, ByteBufferAllocationStacktraceException> allocated =
+ private final Map<ByteBuffer, ByteBufferAllocationStacktraceException> allocated = new IdentityHashMap<>();
makes sense; I just lifted the parquet code
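The suggestion works because `IdentityHashMap` keys on reference identity (`==`), whereas `ByteBuffer.equals()` compares remaining contents, so two distinct buffers holding identical bytes would collide in a regular `HashMap`. A minimal, self-contained demonstration (class and method names invented for illustration):

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Demonstrates why buffer tracking must key on identity, not equals(). */
class IdentityKeyingDemo {

  static int countTracked(Map<ByteBuffer, String> map) {
    // Two distinct allocations with identical (all-zero) contents.
    ByteBuffer a = ByteBuffer.allocate(8);
    ByteBuffer b = ByteBuffer.allocate(8);
    map.put(a, "first allocation");
    map.put(b, "second allocation");
    // HashMap merges the two (content equality); IdentityHashMap keeps both.
    return map.size();
  }
}
```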
🎊 +1 overall
@@ -400,7 +398,9 @@ private void initiateRead() {
    for(int i = 0; i < ranges.size(); ++i) {
      FileRange range = ranges.get(i);
      buffers[i] = allocateRelease.getBuffer(false, range.getLength());
      channel.read(buffers[i], range.getOffset(), i, this);
      final ByteBuffer buffer = buffers[i];
Any specific reason for this refactoring and making it final?
into the file formats themselves, or the underlying storage system.
Even when verification is enabled, files without associated checksum files
.$FILENAME.crc are never be verified.
When checksum verification is disabled, vector reads of date will always returne |
typo: returne
The local filesystem will not slice buffers if the checksum file
of `filename + ".crc"` is not found. This is not declared in the
filesystem `hasPathCapability(filename, "fs.capability.vectoredio.sliced")`
call, as no checks for the checksum file are made then.ddddddddddddddddd |
typo: then.ddddddddddddddddd
LGTM overall pending comments about typos. The expected message was emitted on AbstractContractVectoredReadTest.
The modified ITestS3AContractVectoredRead passed against Tokyo region without messages mentioning sliced/leaked buffers.
The failure of TestCommonConfigurationFields looks caused by the property added to core-default.xml.
 * Configuring this class to log at DEBUG will trigger their collection.
 * @see ByteBufferAllocationStacktraceException
 * <p>
* Adapted from Parquet class {@code org.apache.parquet.bytes.TrackingByteBufferAllocator}. |
I suggest mentioning which version of Parquet you adapted this from. In the future, that will help us understand if our copy needs to pick up additional bug fixes.
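The trade-off behind that DEBUG-only collection — capturing a stack trace per allocation is expensive, so a single stack-less sentinel is reused when tracing is off — can be sketched with plain JDK exceptions. `AllocationSite` below is an invented illustration of the pattern, not the PR's ByteBufferAllocationStacktraceException:

```java
/**
 * Sketch of cheap-by-default allocation-site tracking: one shared stack-less
 * exception when tracing is off, a fresh one (with a real stack trace) per
 * allocation when tracing is on.
 */
class AllocationSite extends Exception {

  // Shared sentinel; writableStackTrace=false keeps construction cheap.
  private static final AllocationSite DISABLED =
      new AllocationSite("tracing disabled", false);

  private AllocationSite(String message, boolean withStack) {
    super(message, null, false, withStack);
  }

  static AllocationSite capture(boolean traceEnabled) {
    return traceEnabled ? new AllocationSite("allocated here", true) : DISABLED;
  }
}
```

When a buffer leaks, the tracker can then report the recorded site (if any) instead of only the fact of the leak.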
reading data may not be detected during the read operation.
Use with care in production.)
Filesystem instances which spit buffersduring vector read operations |
Typo: "...which split buffers during..."
@@ -42,5 +42,6 @@ public class LocalFileSystemConfigKeys extends CommonConfigurationKeys {
  public static final String LOCAL_FS_CLIENT_WRITE_PACKET_SIZE_KEY =
      "file.client-write-packet-size";
  public static final int LOCAL_FS_CLIENT_WRITE_PACKET_SIZE_DEFAULT = 64*1024;
Unnecessary change?
      return true;
    default:
-     return false;
+     return hasVectorIOCapability(capability);
Does this actually change the logic? It seems like `hasVectorIOCapability` checks `StreamCapabilities.VECTOREDIO`, just like the old code did.
 * <p>
 * Adapted from Parquet class {@code org.apache.parquet.bytes.TrackingByteBufferAllocator}.
 */
public final class TrackingByteBufferPool implements ByteBufferPool, AutoCloseable {
Should we put visibility annotations on this?
Alternatively, if it's only ever intended for use in tests, should it go into the test path instead of main?
Description of PR

- Adds a `file.verify-checksum` option.
- Copies Parquet's `TrackingByteBufferAllocator`, modified for Hadoop APIs and named `TrackingByteBufferPool`; not yet used in tests.

How was this patch tested?

No tests yet.

For code changes:

- If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?