-
Notifications
You must be signed in to change notification settings - Fork 22
feat: Introduce sst file format for btree global index #49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| size_t seed = 0; | ||
| seed ^= | ||
| std::hash<std::string>{}(file_path_) + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2); | ||
| seed ^= std::hash<int64_t>{}(position_) + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider converting 0x9e3779b97f4a7c15ULL to a static constant
| std::function<Result<MemorySegment>(const std::shared_ptr<CacheKey>&)> reader) { | ||
| auto& cache = key->IsIndex() ? index_cache_ : data_cache_; | ||
| auto supplier = [&reader](const std::shared_ptr<CacheKey>& k) -> std::shared_ptr<CacheValue> { | ||
| auto segment = reader(k).value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capturing &reader by reference may risk memory leaks. May consider capturing by value instead. Also ensure Result is .ok() before calling .value() to avoid undefined behavior.
| } | ||
|
|
||
| return false; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test for BlockReader and BlockIterator.
| std::string BlockTrailer::ToString() const { | ||
| return "BlockTrailer{compression_type=" + std::to_string(compression_type_) + ", crc32c_=0x" + | ||
| std::to_string(crc32c_) + "}"; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May add a conversion to ensure the return value is a hexadecimal string for crc32c.
| if (aligned_) { | ||
| int currentSize = endPosition - startPosition; | ||
| if (aligned_size_ == 0) { | ||
| aligned_size_ = currentSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current_size & end_position & start_position
| BlockWriter* IndexWriter() const { | ||
| return index_block_writer_.get(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test interface can be private.
| if (!handle.ok()) { | ||
| return handle.status(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May use PAIMON_ASSIGN_OR_RAISE() (ps: Ensure that the macro PAIMON_ASSIGN_OR_RAISE always includes the return type not auto ).
|
|
||
| private: | ||
| static constexpr int32_t BYTE_SIZE = 8; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the existing BloomFilter64 and BitSet classes in paimon/common/utils/ meet your requirements for bloom filter file index?
|
|
||
| bool IsIndex() override; | ||
|
|
||
| int64_t Position(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get method can be const.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces the SST (Sorted String Table) file format infrastructure for the btree global index. It implements core read/write functionality for SST files including memory management utilities (MemorySlice, MemorySliceInput/Output), data structures (BloomFilter, BitSet), block-based storage (BlockWriter/Reader, BlockIterator), and a basic cache interface. The implementation focuses on fundamental functionality, deferring compression encoding and LRU caching to future work.
Changes:
- Added memory slice abstractions for efficient I/O operations on memory segments
- Implemented SST file format with block-based storage including writer, reader, and iterator components
- Introduced BloomFilter and BitSet data structures for efficient key lookups
- Added cache interface with placeholder NoCache implementation
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 16 comments.
Show a summary per file
| File | Description |
|---|---|
| src/paimon/common/utils/bloom_filter.{h,cpp} | Segment-based Bloom filter implementation for probabilistic key membership testing |
| src/paimon/common/utils/bloom_filter_test.cpp | Comprehensive tests for Bloom filter including hash functions and multi-segment scenarios |
| src/paimon/common/utils/bit_set.{h,cpp} | Memory segment-backed bit set implementation used by Bloom filter |
| src/paimon/common/utils/bit_set_test.cpp | Tests for BitSet operations including set, get, and clear |
| src/paimon/common/sst/sst_file_writer.{h,cpp} | SST file writer with block and Bloom filter serialization |
| src/paimon/common/sst/sst_file_reader.{h,cpp} | SST file reader with point and range query support using block iterators |
| src/paimon/common/sst/sst_file_io_test.cpp | Integration test for SST file write and read with Bloom filter validation |
| src/paimon/common/sst/block_writer.{h,cpp} | Block writer supporting aligned/unaligned key-value pair storage |
| src/paimon/common/sst/block_reader.{h,cpp} | Block reader with support for both aligned and unaligned block formats |
| src/paimon/common/sst/block_iterator.{h,cpp} | Iterator for sequential and binary search access to block entries |
| src/paimon/common/sst/block_handle.{h,cpp} | Metadata structure for block location and size with serialization |
| src/paimon/common/sst/block_entry.h | Simple key-value pair container for block entries |
| src/paimon/common/sst/block_trailer.{h,cpp} | Block metadata including CRC32 checksum and compression type |
| src/paimon/common/sst/block_cache.{h,cpp} | Block cache for reading SST blocks with position-based caching |
| src/paimon/common/sst/block_aligned_type.h | Enumeration for aligned vs unaligned block storage modes |
| src/paimon/common/sst/bloom_filter_handle.h | Metadata for Bloom filter location in SST files |
| src/paimon/common/memory/memory_slice.{h,cpp} | Slice abstraction over memory segments with comparison operators |
| src/paimon/common/memory/memory_slice_input.{h,cpp} | Input stream over memory slices with variable-length integer support |
| src/paimon/common/memory/memory_slice_output.{h,cpp} | Output stream over memory slices with auto-resize capability |
| src/paimon/common/io/cache/cache.{h,cpp} | Cache interface with NoCache placeholder implementation |
| src/paimon/common/io/cache/cache_key.{h,cpp} | Position-based cache key with hash support |
| src/paimon/common/io/cache/cache_manager.{h,cpp} | Cache manager coordinating data and index caches |
| src/paimon/CMakeLists.txt | Build configuration updates for new SST and memory components |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/paimon/common/io/cache/cache.h
Outdated
| return segment_; | ||
| } | ||
|
|
||
| public: |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Public member variables in CacheValue break encapsulation. The segment_ field should be private with GetSegment() as the only accessor to maintain proper encapsulation.
| public: | |
| private: |
| /// BitSet based on MemorySegment. | ||
| class PAIMON_EXPORT BitSet { | ||
| public: | ||
| explicit BitSet(int64_t byte_length) : byte_length_(byte_length), bit_size_(byte_length << 3) {} |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The offset_ member variable is not initialized in the constructor. If SetMemorySegment is never called before using methods like Set or Get, offset_ will contain garbage values, leading to undefined behavior.
src/paimon/common/sst/block_cache.h
Outdated
| public: | ||
| std::string file_path_; | ||
| std::shared_ptr<InputStream> in_; | ||
| std::shared_ptr<MemoryPool> pool_; | ||
|
|
||
| std::unique_ptr<CacheManager> cache_manager_; | ||
| std::unordered_map<std::shared_ptr<CacheKey>, std::shared_ptr<CacheValue>> blocks_; |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Public member variables file_path_, in_, pool_, cache_manager_, and blocks_ in BlockCache break encapsulation. These should be private with appropriate getter methods if external access is needed.
Purpose
#38
This PR only covers basic reading and writing of SST format files, data compression encoding and caching are not considered. Too much code in the PR will make it difficult to review, so I will continue work for this in the future.
DONE:
TODO:
Tests
./build/debug/paimon-common-sst-file-format-test
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from SstFileIOTest
[ RUN ] SstFileIOTest.TestSimple
[ OK ] SstFileIOTest.TestSimple (2 ms)
[----------] 1 test from SstFileIOTest (2 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (2 ms total)
[ PASSED ] 1 test.
./build/debug/paimon-common-test
[----------] 1 test from BitSetTest
[ RUN ] BitSetTest.TestBitSet
[ OK ] BitSetTest.TestBitSet (0 ms)
[----------] 1 test from BitSetTest (0 ms total)
[----------] 5 tests from BloomFilterTest
[ RUN ] BloomFilterTest.TestOneSegmentBuilder
[ OK ] BloomFilterTest.TestOneSegmentBuilder (0 ms)
[ RUN ] BloomFilterTest.TestEstimatedHashFunctions
[ OK ] BloomFilterTest.TestEstimatedHashFunctions (0 ms)
[ RUN ] BloomFilterTest.TestBloomNumBits
[ OK ] BloomFilterTest.TestBloomNumBits (0 ms)
[ RUN ] BloomFilterTest.TestBloomNumHashFunctions
[ OK ] BloomFilterTest.TestBloomNumHashFunctions (0 ms)
[ RUN ] BloomFilterTest.TestBloomFilter
[ OK ] BloomFilterTest.TestBloomFilter (2 ms)
[----------] 5 tests from BloomFilterTest (2 ms total)