Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write a constant value to a file without buffering #2874

Merged
merged 9 commits into from
Jul 22, 2024

Conversation

imaihal
Copy link
Collaborator

@imaihal imaihal commented Jul 12, 2024

This PR partially solves an issue of memory consumption reported in #2821

We found that there is a spike in memory consumption when writing a constant into a file (model.constants.bin). This is because all constants into a buffer once and write it to a file at once. This PR changes to write the constant to the file without the buffering. This removes the spike in memory consumption.

@AlexandreEichenberger
Copy link
Collaborator

It looks good, I was interested to see if using mmap would be faster that the ofstream.write. A quick benchmark run on z16 gave me the following.

Test with size: 128
File write duration: 0.0127008 seconds, BW: 10078.1 b/s
mmap write duration: 0.01229 seconds, BW: 10415 b/s

Test with size: 1024
File write duration: 0.0122701 seconds, BW: 83454.8 b/s
mmap write duration: 0.0121558 seconds, BW: 84239.6 b/s

Test with size: 262144
File write duration: 0.00024563 seconds, BW: 1.06723e+09 b/s
mmap write duration: 0.000119349 seconds, BW: 2.19645e+09 b/s

Test with size: 1048576
File write duration: 0.000903089 seconds, BW: 1.1611e+09 b/s
mmap write duration: 0.000410608 seconds, BW: 2.55372e+09 b/s

Test with size: 10485760
File write duration: 0.00851007 seconds, BW: 1.23216e+09 b/s
mmap write duration: 0.0046617 seconds, BW: 2.24934e+09 b/s

Test with size: 104857600
File write duration: 0.0864244 seconds, BW: 1.21329e+09 b/s
mmap write duration: 0.0501027 seconds, BW: 2.09285e+09 b/s

Where tests were done for 128B, 1KB, 256KB, 1MB, 10MB, and 100MB. For all but the smallest sizes, mmap is 1.8x faster. Now this test wrote the whole buffer in one go, but it is possible to mmap a portion of the file.

We can go with this PR as of now. It would be interesting to see what are the average size of the constant being written (maybe add a printf and run it through the large benchmark), and let see if might be worthwhile to buffer the writes using mmap.

@imaihal
Copy link
Collaborator Author

imaihal commented Jul 17, 2024

@AlexandreEichenberger Thanks for you suggestion! This PR focuses on peak memory reduction. May I use/try mmap in a follow-up PR focused on performance?

@AlexandreEichenberger
Copy link
Collaborator

May I use/try mmap in a follow-up PR focused on performance?

For sure. When I saw your PR, I just got curious.

std::vector<char> packedConst;
llvm::sys::fs::remove(filepath);
std::ofstream outfile(filepath, std::ios::app | std::ios::binary);
int64_t totalConstSize = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint64_t is more suitable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

for (int64_t i = globalOfInterest.size() - 1; i >= 0; --i) {
std::vector<char> packedConst;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name packedConst is obsoleted in the new context since we donnot pack constants anymore. Maybe change it to paddedConstant or it can be totally removed (See the below comment)

((uint64_t)(packedConst.size() / alignment) + 1) * alignment -
packedConst.size();
((uint64_t)(totalConstSize / alignment) + 1) * alignment -
totalConstSize;
SmallVector<char> pads(padSize, (char)0);
packedConst.insert(packedConst.end(), pads.begin(), pads.end());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can reduce memory more by not using a padded buffer but two writes to a file: one for padding and one for rawData. For example.

// Padding the current data for correct alignment.
uint64_t padSize = 0;
if ((alignment > 0) && (totalConstSize % alignment != 0)) {
 padSize = ((uint64_t)(totalConstSize / alignment) + 1) * alignment - totalConstSize;
 SmallVector<char> pads(padSize, (char)0);
 outfile.write(pads.data(), pads.size());
}

// Write the constant to a file.
uint64_t constSize = rawData.size();
outfile.write(rawData.data(), constSize);

// Update the total size and op's offset.
totalConstSize += padSize + constSize;
op.setOffsetAttr(b.getI64IntegerAttr(totalConstSize));
op.removeValueAttr();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smart, please do that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments! I updated.

totalConstSize in op.setOffsetAttr(b.getI64IntegerAttr(totalConstSize)) is offset. So the totalConstSize before adding constSize should be used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Collaborator

@tungld tungld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@imaihal imaihal merged commit faf79bd into onnx:main Jul 22, 2024
7 checks passed
@imaihal imaihal deleted the mem_reduction branch July 22, 2024 02:45
@jenkins-droid
Copy link
Collaborator

Jenkins Linux s390x Build #15178 [push] Write a constant value t... started at 22:46

@jenkins-droid
Copy link
Collaborator

Jenkins Linux ppc64le Build #14203 [push] Write a constant value t... started at 22:58

@jenkins-droid
Copy link
Collaborator

Jenkins Linux amd64 Build #15173 [push] Write a constant value t... started at 21:46

@jenkins-droid
Copy link
Collaborator

Jenkins Linux amd64 Build #15173 [push] Write a constant value t... passed after 1 hr 5 min

@jenkins-droid
Copy link
Collaborator

Jenkins Linux s390x Build #15178 [push] Write a constant value t... passed after 1 hr 28 min

@jenkins-droid
Copy link
Collaborator

Jenkins Linux ppc64le Build #14203 [push] Write a constant value t... passed after 1 hr 42 min

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants