Description
I'd like to make it easier to validate the checksums in Boulder's log lines with outside tooling, as part of our logging pipeline. The biggest obstacle is the encoding of the checksum, since CRC32 itself is widely supported. I have managed to do it without this change, but it's a relatively small change that will make some downstream work easier.
More specifically, we're evaluating ClickHouse as a place to store Boulder's logs. With a standard CRC32, I can do something like
SELECT Body, CRC32(Body) AS computed, LogAttributes.Checksum AS checksum FROM logs WHERE computed != checksum
to find logs that don't match the expected checksum. It's not pretty otherwise.
This adds a new log checksum format without the variable-width integer encoding currently used in Boulder. It's still CRC32, just encoded directly into base64 instead of with an extra layer of varint-encoding. Note that despite using the varint encoding, Boulder always writes the full buffer out, so it's zero-padded to 7 bytes.
This will be a 3-part change:
- Start accepting a new log checksum format (#8413)
- Switch to the new log checksum format (#8415)
- Remove support for the old log line checksum format (#8416)
Each of these changes should land at least one release and several days apart, to ensure we can continue to validate the log checksums in the logs throughout the process.