Skip to content

Conversation

tyaps
Copy link

@tyaps tyaps commented Dec 6, 2024

Currently, countingReader.Seek() uses io.ReadFull() to read data, so memory is allocated for all content of rpm file (and content has no meaning here because only offset of reader must be changed).
So while parsing header of big rpm file, a lot of
memory allocates.

Use reading by small chunks.

@tyaps tyaps force-pushed the fix-memory-allocation branch 3 times, most recently from 3749ace to e197fa8 Compare December 6, 2024 06:47
Currently, countingReader.Seek() uses io.ReadFull()
to read data, so memory is allocated for all content
of rpm file (and content has no meaning here because
only offset of reader must be changed).
So while parsing header of big rpm file, a lot of
memory allocates.

Use reading by small chunks.

Signed-off-by: Sergey Tyapkin <[email protected]>
@tyaps tyaps force-pushed the fix-memory-allocation branch from e197fa8 to 3e4227d Compare December 10, 2024 09:24
@tyaps
Copy link
Author

tyaps commented Feb 24, 2025

Hello. Is there a chance for this PR to be merged?))

Comment on lines +149 to +174
totalRead := int64(0)
const chunkSize = int64(1024 * 1024)
buf := make([]byte, chunkSize)

remaining := offset
for remaining > 0 {
// Define chunk size to read
toRead := chunkSize
if remaining < chunkSize {
toRead = remaining
}

n, err := cr.Read(buf[:toRead])
totalRead += int64(n)
remaining -= int64(n)
if err != nil && err != io.EOF {
// If all was read, skip error
if totalRead >= offset {
err = nil
}
return 0, err
}

if err == io.EOF {
break
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for your contribution! This is definitely worth fixing.

I believe this can be addressed much more simply by using io.Discard:

Suggested change
totalRead := int64(0)
const chunkSize = int64(1024 * 1024)
buf := make([]byte, chunkSize)
remaining := offset
for remaining > 0 {
// Define chunk size to read
toRead := chunkSize
if remaining < chunkSize {
toRead = remaining
}
n, err := cr.Read(buf[:toRead])
totalRead += int64(n)
remaining -= int64(n)
if err != nil && err != io.EOF {
// If all was read, skip error
if totalRead >= offset {
err = nil
}
return 0, err
}
if err == io.EOF {
break
}
return io.CopyN(io.Discard, cr, offset)

It may be a tiny bit less efficient but the simplicity is worth it in my opinion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants