Switch to "regular" CRC32 for the log line checksums #8414

@mcpherrinm

I'd like to make it easier to validate the checksums in Boulder's log lines with "outside tooling" as part of our logging pipeline. The biggest obstacle is the encoding of the checksum, since CRC32 itself is widely supported. I have managed to validate checksums without this change, but it seems like a relatively small change that will make downstream work easier.

More specifically, we're evaluating ClickHouse as a place to store Boulder's logs. With a standard CRC32, I can run something like `SELECT Body, CRC32(Body) AS computed, LogAttributes.Checksum AS checksum FROM logs WHERE computed != checksum` to find logs that don't match the expected checksum. It's not pretty otherwise.

This adds a new log checksum format without the variable-width integer encoding currently used in Boulder. It's still CRC32, just encoded directly into base64 instead of with an extra layer of varint-encoding. Note that despite using the varint encoding, Boulder always writes the full buffer out, so it's zero-padded to 7 bytes.
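
To make the difference between the two encodings concrete, here is a rough Python sketch of what an outside tool has to do in each case. It is not Boulder's code. The function names are hypothetical, and several details are assumptions: that the varint is the protobuf-style unsigned varint, that the buffer is 5 bytes (which base64-encodes to 7 characters), that the new format packs the CRC32 big-endian, and that both use unpadded URL-safe base64.

```python
import base64
import struct
import zlib


def put_uvarint(value: int) -> bytes:
    """Encode an unsigned int as a protobuf-style varint (7 bits per byte)."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    out.append(value)
    return bytes(out)


def b64_raw_url(data: bytes) -> str:
    """URL-safe base64 without '=' padding (assumed alphabet)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def checksum_varint(line: str, buf_len: int = 5) -> str:
    """Assumed shape of the current format: varint, zero-padded to a fixed buffer."""
    crc = zlib.crc32(line.encode("utf-8"))
    return b64_raw_url(put_uvarint(crc).ljust(buf_len, b"\x00"))


def checksum_plain(line: str) -> str:
    """Assumed shape of the new format: the 4 CRC32 bytes, big-endian (assumption)."""
    crc = zlib.crc32(line.encode("utf-8"))
    return b64_raw_url(struct.pack(">I", crc))


def decode_plain(checksum: str) -> int:
    """Recover the numeric CRC32 from the plain format for comparison."""
    padded = checksum + "=" * (-len(checksum) % 4)
    return struct.unpack(">I", base64.urlsafe_b64decode(padded))[0]
```

With the plain format, a validator only needs a base64 decode and a stock CRC32, e.g. `decode_plain(checksum) == zlib.crc32(body)`. With the varint format it also needs a varint decoder, which most query engines and log tools do not provide.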

This will be a 3-part change:

Each of these changes should land at least one release and several days apart, to ensure we can continue to validate the log checksums in the logs throughout the process.
