Fix memory allocation in countingReader #38

tyaps · 2024-12-06T06:40:43Z

Currently, countingReader.Seek() uses io.ReadFull() to read the data it skips, so memory is allocated for the entire content of the rpm file, even though that content is irrelevant here: only the reader's offset needs to change.
As a result, parsing the header of a large rpm file allocates a lot of memory.

Read in small chunks instead.

Signed-off-by: Sergey Tyapkin <[email protected]>

tyaps · 2025-02-24T06:50:25Z

Hello. Is there a chance for this PR to be merged?))

mtharp · 2025-03-18T16:14:52Z

cpio/cpio.go

+	totalRead := int64(0)
+	const chunkSize = int64(1024 * 1024)
+	buf := make([]byte, chunkSize)
+
+	remaining := offset
+	for remaining > 0 {
+		// Define chunk size to read
+		toRead := chunkSize
+		if remaining < chunkSize {
+			toRead = remaining
+		}
+
+		n, err := cr.Read(buf[:toRead])
+		totalRead += int64(n)
+		remaining -= int64(n)
+		if err != nil && err != io.EOF {
+			// If all was read, skip error
+			if totalRead >= offset {
+				err = nil
+			}
+			return 0, err
+		}
+
+		if err == io.EOF {
+			break
+		}


Hi, thanks for your contribution! This is definitely worth fixing.

I believe this can be addressed much more simply by using io.Discard:

Suggested change

-	totalRead := int64(0)
-	const chunkSize = int64(1024 * 1024)
-	buf := make([]byte, chunkSize)
-
-	remaining := offset
-	for remaining > 0 {
-		// Define chunk size to read
-		toRead := chunkSize
-		if remaining < chunkSize {
-			toRead = remaining
-		}
-
-		n, err := cr.Read(buf[:toRead])
-		totalRead += int64(n)
-		remaining -= int64(n)
-		if err != nil && err != io.EOF {
-			// If all was read, skip error
-			if totalRead >= offset {
-				err = nil
-			}
-			return 0, err
-		}
-
-		if err == io.EOF {
-			break
-		}
+	return io.CopyN(io.Discard, cr, offset)

It may be a tiny bit less efficient but the simplicity is worth it in my opinion.

tyaps force-pushed the fix-memory-allocation branch 3 times, most recently from 3749ace to e197fa8 on December 6, 2024

tyaps force-pushed the fix-memory-allocation branch from e197fa8 to 3e4227d on December 10, 2024

mtharp reviewed Mar 18, 2025
