Description
Describe the bug
FileWriter and StreamWriter should ensure that data is written with appropriate alignment, so that arrays can be used without first copying them to a more-aligned buffer.
In particular, as of Rust 1.77.0 and LLVM 18, i128 has a 16-byte alignment requirement even on x86 (ARM always had this requirement), i.e. std::mem::align_of::<i128>() == 16. So Decimal128Arrays must be aligned to a 16-byte boundary when serialized into an IPC buffer. The pad_to_8 used everywhere in the IPC code pads only to an 8-byte boundary, which is insufficient.
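To make the padding problem concrete, here is a minimal sketch using only the standard library. The helper `padding_for` is hypothetical and written for illustration; it is not the crate's pad_to_8, and the 24-byte preceding buffer is an assumed example size.

```rust
/// Hypothetical helper (not part of the arrow crates): number of padding
/// bytes needed to round `len` up to a multiple of `alignment`.
/// `alignment` must be a power of two.
fn padding_for(len: usize, alignment: usize) -> usize {
    assert!(alignment.is_power_of_two());
    (alignment - (len % alignment)) % alignment
}

fn main() {
    // Assume a 24-byte buffer is written just before a Decimal128 values buffer.
    let preceding_len = 24;

    // Padding to 8 bytes leaves the next buffer at offset 24.
    let offset_pad8 = preceding_len + padding_for(preceding_len, 8);
    // Padding to 16 bytes moves the next buffer to offset 32.
    let offset_pad16 = preceding_len + padding_for(preceding_len, 16);

    // Offset 24 is a multiple of 8 but not of 16, so i128 values starting
    // there are misaligned even if the start of the block is 16-byte aligned.
    println!("pad to 8:  offset {offset_pad8}, 16-byte aligned: {}", offset_pad8 % 16 == 0);
    println!("pad to 16: offset {offset_pad16}, 16-byte aligned: {}", offset_pad16 % 16 == 0);
}
```

The point of the sketch is only that an offset produced by 8-byte padding can land halfway between two 16-byte boundaries.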
This prevents readers of the IPC data generated by this crate from doing true zero-copy reads (e.g. mmapping) since the data may be insufficiently aligned.
On some platforms, SIMD may also be significantly slower if the beginning of the IPC block isn't aligned to a 16-, 32-, or 64-byte boundary (as discussed in the Arrow spec document).
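The zero-copy concern can be illustrated from the reader's side. The sketch below is an assumption-laden illustration, not code from this crate: `Backing` stands in for a page-aligned mmapped region, and `is_aligned_to` is a hypothetical check a reader might run before reinterpreting bytes as i128 values.

```rust
/// Stand-in for an mmapped region: the 64-byte alignment here imitates the
/// (much stronger) page alignment a real mapping would have.
#[repr(align(64))]
struct Backing([u8; 128]);

/// Hypothetical check: does this byte slice start on an `alignment` boundary?
fn is_aligned_to(bytes: &[u8], alignment: usize) -> bool {
    (bytes.as_ptr() as usize) % alignment == 0
}

fn main() {
    let backing = Backing([0u8; 128]);

    // A values buffer that the writer placed at an 8-byte-padded offset.
    let at_offset_8 = &backing.0[8..40];
    // The same buffer placed at a 16-byte-padded offset.
    let at_offset_16 = &backing.0[16..48];

    // The first slice satisfies 8-byte alignment only, so a reader that
    // needs 16-byte-aligned i128 values must copy it before use.
    println!("offset 8:  aligned to 16? {}", is_aligned_to(at_offset_8, 16));
    println!("offset 16: aligned to 16? {}", is_aligned_to(at_offset_16, 16));
}
```

The alignment is passed explicitly rather than taken from std::mem::align_of::<i128>() so that the sketch behaves the same on toolchains older than Rust 1.77.0.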
To Reproduce
See the test test_decimal128_alignment8_is_unaligned in PR #5554 - the fact that this test throws an error shows that alignment is not currently respected.
Expected behavior
See the test test_decimal128_alignment16 in PR #5554 - increasing alignment should allow us to do "true" zero-copy reads.
Additional context
IpcWriteOptions already has an "alignment" field, but it is not being respected throughout the IPC code.
Related PRs and issues: