Calculate CRC32C faster #65

lizhanhui · 2023-11-09T10:41:52Z

This crate uses CRC32C algorithm from crc crate to check and verify data integrity. Given that every byte written to disk and read from disk is verified to detect potential errors, this function is very hot.

Unfortunately, crc crate only employs table-lookup approach, leaving much performance space to optimize. One CPU core may achieve around 200MiB/s with the compiler's implicit optimization. See metrics from crc32fast. (Note, they are comparing CRC32, CRC32C differs in the polynomials only, perf metrics are close enough).

This repo offers good explanations to the optimization space ahead and this blog marches further, achiving performances all the way to some 50GiB/s per CPU core through using haredware accelerated SIMD intristics: _mm_crc32_u64 and _mm_clmulepi64_si128 in particular.

One more thing, zlib, chrome brower and Java hotspot VM uses the three-interleave-crc32x

With that said, let me know if you guys are intested in this issue and the potential improvement

The text was updated successfully, but these errors were encountered:

timvisee · 2023-11-10T11:02:30Z

Hi @lizhanhui. Yes, if a different CRC implementation gives better performance with the same result then such change would be very much appreciated.

I'd prefer to have the implementation in some crate though, versus building a wildly custom implementation inside our WAL codebase. Maybe crc32fast is already fast enough.

In any case, changes like this are always welcome!

lizhanhui · 2023-11-10T11:20:46Z

Maybe crc32fast is already fast enough.

crc32fast is fast enough for use cases of CRC-32 only and I indeed use it in our project. Unfortunately, it does not support CRC-32C polynomial, aka, your use case.

I'd prefer to have the implementation in some crate though

All right, maybe it's a good idea to build a similar crate for CRC32C.

timvisee · 2023-11-10T11:21:47Z

All right, maybe it's a good idea to build a similar crate for CRC32C.

Would it be worth to try and include it in crc32fast?

lizhanhui · 2023-11-10T11:24:31Z

crc32fast targets 'Fast, SIMD-accelerated CRC32 (IEEE) checksum computation', not sure if it would expand its scope or not. I would drop a message there to check it out.

lizhanhui · 2023-11-10T11:36:10Z

There is a 3-year-old thread discussing support of other polynomials, srijs/rust-crc32fast#9, but the maintainer has not yet responded.

timvisee · 2023-11-10T11:39:05Z

There is a 3-year-old thread discussing support of other polynomials, srijs/rust-crc32fast#9, but the maintainer has not yet responded.

Thank you for checking. In that case it might make sense to work on a new crate.

By the way, if the actual implementation of this ends up being small then it's probably fine to implement it inside WAL. But yes, setting up a new crate will allow others to benefit from this as well.

lizhanhui · 2023-11-16T13:18:40Z

While I am building a crate, I found this crate quite popularly adopted. Even if it still leaves much space to improve, I create a pull request for you guys to review: #66

This comment was marked as resolved.

Sign in to view

agourlay closed this as completed Feb 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Calculate CRC32C faster #65

Calculate CRC32C faster #65

lizhanhui commented Nov 9, 2023

This comment was marked as resolved.

timvisee commented Nov 10, 2023

lizhanhui commented Nov 10, 2023 •

edited

Loading

timvisee commented Nov 10, 2023

lizhanhui commented Nov 10, 2023

lizhanhui commented Nov 10, 2023

timvisee commented Nov 10, 2023

lizhanhui commented Nov 16, 2023

Calculate CRC32C faster #65

Calculate CRC32C faster #65

Comments

lizhanhui commented Nov 9, 2023

This comment was marked as resolved.

timvisee commented Nov 10, 2023

lizhanhui commented Nov 10, 2023 • edited Loading

timvisee commented Nov 10, 2023

lizhanhui commented Nov 10, 2023

lizhanhui commented Nov 10, 2023

timvisee commented Nov 10, 2023

lizhanhui commented Nov 16, 2023

lizhanhui commented Nov 10, 2023 •

edited

Loading