-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Made chunk reading explicit when using read or pread #2772
base: main
Are you sure you want to change the base?
Conversation
30b9cec
to
7baadcc
Compare
case filePointerToEnd | ||
case range(Range<Int64>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's any reason to change these names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't entireFile
misleading because it read from the current file pointer, not from the beginning of the file?
/// Returns an asynchronous sequence of chunks read from the file starting from the current file pointer. | ||
/// | ||
/// - Parameters: | ||
/// - size: The maximum length of the chunk to read as a ``ByteCount``. | ||
/// - Returns: A sequence of chunks read from the file. | ||
func readChunksFromFilePointer(chunkLength size: ByteCount) -> FileChunks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't quite what I had in mind. Rather than adding new API I think we should use the type of the file to determine how to do the read inside FileChunks
. Once we know the type of the file we can determine whether the range passed in is acceptable and then call the appropriate read function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would make the checks that happen inside of .readToEnd
redundant. Should we keep those or remove them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also means that calling readToEnd will stat the file twice.
I’m trying out your recommended solution and it also causes problems because if we check the file type in the FileChunks initializer, then that means the function has to be async throws. But the readChunks function from ReadableFileHandleProtocol
is neither async or throws so that would be an API change.
Resolved an issue where reading a file in chunks using an unbounded range would read from the current file pointer even for regular files.
Motivation:
When a
FileChunks
object is initialized, it checks if the range is set to0..<Int.max
. If it is, then instead of reading using a series of offsets and thepread
sys call, it uses theread
syscall repeatedly. This is fine if the read is the first ever of this type for a file, but if we read this way twice, then the second call will be affected by the side file pointer effect from the first call.For example:
The main issue is that the
read
syscall affects the file pointer while thepread
syscall does not.Modifications:
readChunksFromFilePointer
toReadableFileHandleProtocol
to explicitly read from the current file pointer instead of relying on the magic0..<Int.max
range..readToEnd
.ChunkRange
cases to make what they are doing clearer.Result:
Reading
0..<Int.max
over and over again will have the same result each time.