Skip to content

[ntuple] Save a memcpy when reading a compressed Anchor in RMiniFile #18612

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 11 additions & 9 deletions tree/ntuple/src/RMiniFile.cxx
Original file line number Diff line number Diff line change
Expand Up @@ -763,25 +763,27 @@ ROOT::RResult<ROOT::RNTuple> ROOT::Internal::RMiniFileReader::GetNTupleProper(st
if (key.fObjLen < kMinNTupleSize) {
return R__FAIL("invalid anchor size: " + std::to_string(key.fObjLen) + " < " + std::to_string(sizeof(RTFNTuple)));
}

// The object length can be smaller than the size of RTFNTuple if it comes from a past RNTuple class version,
// or larger than it if it comes from a future RNTuple class version.
auto bufAnchor = MakeUninitArray<unsigned char>(std::max<size_t>(key.fObjLen, sizeof(RTFNTuple)));
RTFNTuple *ntuple = new (bufAnchor.get()) RTFNTuple;

auto objNbytes = key.GetSize() - key.fKeyLen;
ReadBuffer(ntuple, objNbytes, offset);
const auto objNbytes = key.GetSize() - key.fKeyLen;
if (objNbytes != key.fObjLen) {
// Decompress into a temporary buffer
auto unzipBuf = MakeUninitArray<unsigned char>(key.fObjLen);
RNTupleDecompressor::Unzip(bufAnchor.get(), objNbytes, key.fObjLen, unzipBuf.get());
// Then copy back to bufAnchor
memcpy(bufAnchor.get(), unzipBuf.get(), key.fObjLen);
// Read into a temporary buffer
auto unzipBuf = MakeUninitArray<unsigned char>(std::max<size_t>(key.fObjLen, sizeof(RTFNTuple)));
ReadBuffer(unzipBuf.get(), objNbytes, offset);
// Unzip into the final buffer
RNTupleDecompressor::Unzip(unzipBuf.get(), objNbytes, key.fObjLen, ntuple);
} else {
ReadBuffer(ntuple, objNbytes, offset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really think this is on the critical path, and now we have two calls to ReadBuffer...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean two calls? They're mutually exclusive so we only have one; we're simply doing one less operation in one of the branches.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two lines of code where we call ReadBuffer, which we will have to make sure updated in sync. That current code is more maintainable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is ReadBuffer harder to maintain than memcpy? That too needs to be synchronized, and both are trivial things to do

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipping a useless mempcy can be a significant performance difference, for example in case of tight looping over a single branch and is worth the slight extra complexity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is about the anchor which is read once when opening the file. There is simply no performance implication in that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which btw also means that the amount of memory copied is 78 bytes...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that ... indeed for such a low size ... even doing the compression is debatable ....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For RMiniFile, this was explicitly implemented in 7b03bcc. I believe TFile always attempts compression and goes with it if beneficial, ie compressed size smaller than uncompressed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not about performance at all, it's about code simplicity and straightforwardness. This code does less work and therefore is simpler to understand. However if I'm the only one who thinks so, this PR can be closed and not much of value would be lost, being such a small change.

}

// We require that future class versions only append members and store the checksum in the last 8 bytes
// Checksum calculation: strip byte count, class version, fChecksum member
auto lenCkData = key.fObjLen - ntuple->GetOffsetCkData() - sizeof(uint64_t);
auto ckCalc = XXH3_64bits(ntuple->GetPtrCkData(), lenCkData);
const auto lenCkData = key.fObjLen - ntuple->GetOffsetCkData() - sizeof(uint64_t);
const auto ckCalc = XXH3_64bits(ntuple->GetPtrCkData(), lenCkData);
uint64_t ckOnDisk;

RUInt64BE *ckOnDiskPtr = reinterpret_cast<RUInt64BE *>(bufAnchor.get() + key.fObjLen - sizeof(uint64_t));
Expand Down
Loading