
refactor: Switch Arrow/Orjson serializers from Blake3 to xxHash3 #1

@27Bslash6

Summary

The Arrow and Orjson serializers attach a checksum to each payload for integrity checking. This issue tracks switching that checksum from Blake3 to xxHash3 so it matches the Rust ByteStorage layer.

Current State (2025-12-11)

Phase 1 Complete: Switched to xxHash3-64 via Python xxhash package

Serializer         | Checksum   | Size    | Implementation
-------------------+------------+---------+------------------------
StandardSerializer | xxHash3-64 | 8 bytes | Rust ByteStorage (FFI)
ArrowSerializer    | xxHash3-64 | 8 bytes | Python xxhash package
OrjsonSerializer   | xxHash3-64 | 8 bytes | Python xxhash package

Files updated:

  • src/cachekit/serializers/arrow_serializer.py
  • src/cachekit/serializers/orjson_serializer.py
  • Tests: test_xxhash_integrity.py (14 new tests), updated existing tests
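
For readers unfamiliar with the pattern, here is a minimal sketch of an 8-byte integrity check around a serialized payload. The envelope layout (checksum first, payload after) and the names `checksum64`, `wrap`, and `unwrap` are assumptions for illustration only and are not taken from the cachekit serializers. The `hashlib` fallback exists only so the sketch runs without the xxhash package installed; it is not xxHash3.

```python
import hashlib

try:
    import xxhash  # third-party package: pip install xxhash

    def checksum64(data: bytes) -> bytes:
        """Return the 8-byte xxHash3-64 digest of data."""
        return xxhash.xxh3_64_digest(data)

except ImportError:
    # Stand-in so the sketch still runs without xxhash. NOT xxHash3.
    def checksum64(data: bytes) -> bytes:
        return hashlib.blake2b(data, digest_size=8).digest()


CHECKSUM_SIZE = 8  # bytes, matching the "Size" column above


def wrap(payload: bytes) -> bytes:
    """Prefix the payload with its checksum (hypothetical layout)."""
    return checksum64(payload) + payload


def unwrap(envelope: bytes) -> bytes:
    """Verify the checksum prefix and return the payload."""
    if len(envelope) < CHECKSUM_SIZE:
        raise ValueError("envelope is shorter than the checksum")
    stored = envelope[:CHECKSUM_SIZE]
    payload = envelope[CHECKSUM_SIZE:]
    if checksum64(payload) != stored:
        raise ValueError("checksum mismatch: payload is corrupted")
    return payload
```

A caller would store `wrap(serialized_bytes)` and run `unwrap` on read, treating `ValueError` as a corrupted cache entry.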

Future Work: FFI Implementation

🔮 Phase 2 (Optional): Use Rust FFI for checksums instead of Python package

Blocked by: cachekit-io/cachekit-core#13 (checksum-only API)

# Current (Python xxhash)
import xxhash
checksum = xxhash.xxh3_64_digest(data)

# Future (Rust FFI) - requires cachekit-core#13
from cachekit._rust_serializer import compute_checksum
checksum = compute_checksum(data)
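
If both paths need to coexist during a transition, one option is to resolve the checksum function once at import time and fall back when the Rust symbol is missing. This is a sketch only: `compute_checksum` does not exist until cachekit-core#13 lands, and `load_checksum` and `DEFAULT_BACKENDS` are hypothetical names, not part of cachekit.

```python
import importlib
from typing import Callable, Iterable, Tuple

# Candidate (module, attribute) pairs, tried in order.
DEFAULT_BACKENDS: Tuple[Tuple[str, str], ...] = (
    # Future Rust FFI path; assumed name, requires cachekit-core#13.
    ("cachekit._rust_serializer", "compute_checksum"),
    # Current path: the Python xxhash package.
    ("xxhash", "xxh3_64_digest"),
)


def load_checksum(
    backends: Iterable[Tuple[str, str]] = DEFAULT_BACKENDS,
) -> Callable:
    """Return the first checksum callable that can be imported."""
    for module_name, attr in backends:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            # Module not installed, or symbol not exported yet.
            continue
    raise ImportError("no checksum backend is available")
```

This only works if both backends return identical bytes for the same input (same seed and byte order), which would need to be verified before mixing them.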

Benefits of FFI approach:

  • Single implementation (no Python xxhash dependency)
  • Consistent with StandardSerializer path
  • Potentially faster for large payloads (hashing can run without holding the Python GIL)

Trade-offs:

  • FFI overhead may negate speed gains for small payloads
  • More complex build (Rust required)
  • Current Python solution works fine
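
The small-versus-large payload trade-off above can be checked with a simple timing harness. The sketch below is an assumption about how such a benchmark might look: `bench_checksum` is a hypothetical helper, the payload sizes are arbitrary, and `zlib.crc32` is only a stand-in callable so the example runs with the standard library. In practice you would pass `xxhash.xxh3_64_digest` and, once available, the Rust function.

```python
import timeit
import zlib
from typing import Callable, Dict, Iterable


def bench_checksum(
    fn: Callable[[bytes], object],
    sizes: Iterable[int] = (64, 4096, 1_048_576),
    repeats: int = 200,
) -> Dict[int, float]:
    """Return the mean seconds per call of fn for each payload size."""
    results: Dict[int, float] = {}
    for size in sizes:
        data = bytes(size)  # zero-filled; swap in realistic payloads
        total = timeit.timeit(lambda: fn(data), number=repeats)
        results[size] = total / repeats
    return results


if __name__ == "__main__":
    # zlib.crc32 is a stand-in, not one of the candidates in this issue.
    for size, secs in bench_checksum(zlib.crc32).items():
        print(f"{size:>9} B  {secs * 1e6:10.2f} us/call")
```

Comparing per-call time at the smallest size isolates call overhead, while the largest size shows raw hashing throughput.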

Decision Log

  • 2025-12-11: Implemented Phase 1 (Python xxhash). Phase 2 deferred pending cachekit-core#13 and benchmarking to determine if FFI overhead is worth it.

Related

  • Upstream: cachekit-core#13 (checksum-only API in Rust)
  • Context: xxHash3 migration in ByteStorage (2025-12-05)

Labels: enhancement
