feat: Introduce sst file format for btree global index #49
Merged
33 commits
f01566f feat: introduce sst file format for btree global index (ChaomingZhangCN)
3a8f42d feat: introduce sst file format for btree global index (ChaomingZhangCN)
96ec401 fix bloom filter (ChaomingZhangCN)
69e1434 minor fix (ChaomingZhangCN)
95e2012 fix compile (ChaomingZhangCN)
e0c876e minor fix (ChaomingZhangCN)
73bc0c2 fix compile (ChaomingZhangCN)
0fdc643 Merge branch 'main' into sst-file-format (ChaomingZhangCN)
3ffcf50 add virtual (ChaomingZhangCN)
e08c162 address (ChaomingZhangCN)
592df79 Merge branch 'main' into sst-file-format (ChaomingZhangCN)
573f843 address
a5c8c53 address
a56c96c fix tests
757d212 fix tests
a0cb2a7 fix tests
bfa93f4 fix tests
dfe0527 fix tests
a389fb0 Merge branch 'main' into sst-file-format (lucasfang)
4e43fe4 address (ChaomingZhangCN)
384e1da Merge branch 'sst-file-format' of https://github.com/ChaomingZhangCN/… (ChaomingZhangCN)
672d0ff clang lint (ChaomingZhangCN)
779440c clang lint (ChaomingZhangCN)
0916877 Merge branch 'main' into sst-file-format (ChaomingZhangCN)
9ab4a91 update copyright (ChaomingZhangCN)
447b04f check byte order (ChaomingZhangCN)
3ab2c04 address (ChaomingZhangCN)
7eadb8a Merge branch 'main' into sst-file-format (lxy-9602)
734f886 add tests (ChaomingZhangCN)
46a06a4 minor fix (ChaomingZhangCN)
065be74 minor fix (ChaomingZhangCN)
29ec989 Merge branch 'main' into sst-file-format (ChaomingZhangCN)
53e04ea fix clang test failed (ChaomingZhangCN)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #include "paimon/common/io/cache/cache.h" | ||
|
|
||
| namespace paimon { | ||
| std::shared_ptr<CacheValue> NoCache::Get( | ||
| const std::shared_ptr<CacheKey>& key, | ||
| std::function<std::shared_ptr<CacheValue>(const std::shared_ptr<CacheKey>&)> supplier) { | ||
| return supplier(key); | ||
| } | ||
|
|
||
| void NoCache::Put(const std::shared_ptr<CacheKey>& key, const std::shared_ptr<CacheValue>& value) { | ||
| // do nothing | ||
| } | ||
|
|
||
| void NoCache::Invalidate(const std::shared_ptr<CacheKey>& key) { | ||
| // do nothing | ||
| } | ||
|
|
||
| void NoCache::InvalidateAll() { | ||
| // do nothing | ||
| } | ||
|
|
||
| std::unordered_map<std::shared_ptr<CacheKey>, std::shared_ptr<CacheValue>> NoCache::AsMap() { | ||
| return {}; | ||
| } | ||
|
|
||
| } // namespace paimon | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #pragma once | ||
| #include <cstdint> | ||
| #include <functional> | ||
| #include <memory> | ||
| #include <string> | ||
| #include <unordered_map> | ||
|
|
||
| #include "paimon/common/io/cache/cache_key.h" | ||
| #include "paimon/common/memory/memory_segment.h" | ||
| #include "paimon/status.h" | ||
|
|
||
| namespace paimon { | ||
| class CacheValue; | ||
|
|
||
| class Cache { | ||
| public: | ||
| virtual ~Cache() = default; | ||
| virtual std::shared_ptr<CacheValue> Get( | ||
| const std::shared_ptr<CacheKey>& key, | ||
| std::function<std::shared_ptr<CacheValue>(const std::shared_ptr<CacheKey>&)> supplier) = 0; | ||
|
|
||
| virtual void Put(const std::shared_ptr<CacheKey>& key, | ||
| const std::shared_ptr<CacheValue>& value) = 0; | ||
|
|
||
| virtual void Invalidate(const std::shared_ptr<CacheKey>& key) = 0; | ||
|
|
||
| virtual void InvalidateAll() = 0; | ||
|
|
||
| virtual std::unordered_map<std::shared_ptr<CacheKey>, std::shared_ptr<CacheValue>> AsMap() = 0; | ||
| }; | ||
|
|
||
| class NoCache : public Cache { | ||
| public: | ||
| std::shared_ptr<CacheValue> Get( | ||
| const std::shared_ptr<CacheKey>& key, | ||
| std::function<std::shared_ptr<CacheValue>(const std::shared_ptr<CacheKey>&)> supplier) | ||
| override; | ||
| void Put(const std::shared_ptr<CacheKey>& key, | ||
| const std::shared_ptr<CacheValue>& value) override; | ||
| void Invalidate(const std::shared_ptr<CacheKey>& key) override; | ||
| void InvalidateAll() override; | ||
| std::unordered_map<std::shared_ptr<CacheKey>, std::shared_ptr<CacheValue>> AsMap() override; | ||
| }; | ||
|
|
||
| class CacheValue { | ||
| public: | ||
| explicit CacheValue(const std::shared_ptr<MemorySegment>& segment) : segment_(segment) {} | ||
|
|
||
| std::shared_ptr<MemorySegment> GetSegment() { | ||
| return segment_; | ||
| } | ||
|
|
||
| private: | ||
| std::shared_ptr<MemorySegment> segment_; | ||
| }; | ||
| } // namespace paimon | ||
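
The `Cache::Get(key, supplier)` contract above is a get-or-load call: `NoCache` always runs the supplier, while a caching implementation would run it only on a miss. The sketch below illustrates that difference. It is not the implementation from this PR: `SketchKey`, `SketchValue`, `SketchNoCache` and `SketchMapCache` are hypothetical stand-ins so the example compiles on its own, and the string-based entry id is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; the real CacheKey and
// CacheValue are declared in the headers shown in this diff.
struct SketchKey {
    std::string file_path;
    long long position;
};

struct SketchValue {
    std::string payload;
};

using Supplier =
    std::function<std::shared_ptr<SketchValue>(const std::shared_ptr<SketchKey>&)>;

// Pass-through behaviour, mirroring NoCache::Get: every call runs the supplier.
class SketchNoCache {
 public:
    std::shared_ptr<SketchValue> Get(const std::shared_ptr<SketchKey>& key,
                                     const Supplier& supplier) {
        return supplier(key);
    }
};

// Memoizing variant (an assumption, not part of the PR): the supplier runs
// only when no entry exists for the key's contents.
class SketchMapCache {
 public:
    std::shared_ptr<SketchValue> Get(const std::shared_ptr<SketchKey>& key,
                                     const Supplier& supplier) {
        const std::string id = key->file_path + "@" + std::to_string(key->position);
        auto it = entries_.find(id);
        if (it != entries_.end()) {
            return it->second;  // hit: reuse the stored value
        }
        auto value = supplier(key);  // miss: load through the supplier
        if (value) {
            entries_.emplace(id, value);  // failed loads (nullptr) are not stored
        }
        return value;
    }

    void InvalidateAll() { entries_.clear(); }

    std::size_t Size() const { return entries_.size(); }

 private:
    std::unordered_map<std::string, std::shared_ptr<SketchValue>> entries_;
};
```

Calling `Get` twice with the same key runs the supplier twice on `SketchNoCache` and once on `SketchMapCache`.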
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #include "paimon/common/io/cache/cache_key.h" | ||
|
|
||
| namespace paimon { | ||
|
|
||
| std::shared_ptr<CacheKey> CacheKey::ForPosition(const std::string& file_path, int64_t position, | ||
| int32_t length, bool is_index) { | ||
| return std::make_shared<PositionCacheKey>(file_path, position, length, is_index); | ||
| } | ||
|
|
||
| bool PositionCacheKey::IsIndex() { | ||
| return is_index_; | ||
| } | ||
|
|
||
| int64_t PositionCacheKey::Position() const { | ||
| return position_; | ||
| } | ||
|
|
||
| int32_t PositionCacheKey::Length() const { | ||
| return length_; | ||
| } | ||
|
|
||
| bool PositionCacheKey::operator==(const PositionCacheKey& other) const { | ||
| return file_path_ == other.file_path_ && position_ == other.position_ && | ||
| length_ == other.length_ && is_index_ == other.is_index_; | ||
| } | ||
|
|
||
| size_t PositionCacheKey::HashCode() const { | ||
| size_t seed = 0; | ||
| seed ^= std::hash<std::string>{}(file_path_) + HASH_CONSTANT + (seed << 6) + (seed >> 2); | ||
| seed ^= std::hash<int64_t>{}(position_) + HASH_CONSTANT + (seed << 6) + (seed >> 2); | ||
| seed ^= std::hash<int32_t>{}(length_) + HASH_CONSTANT + (seed << 6) + (seed >> 2); | ||
| seed ^= std::hash<bool>{}(is_index_) + HASH_CONSTANT + (seed << 6) + (seed >> 2); | ||
| return seed; | ||
| } | ||
|
|
||
| } // namespace paimon |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #pragma once | ||
| #include <cstdint> | ||
| #include <functional> | ||
| #include <memory> | ||
| #include <string> | ||
|
|
||
| #include "paimon/status.h" | ||
|
|
||
| namespace paimon { | ||
|
|
||
| class CacheKey { | ||
| public: | ||
| static std::shared_ptr<CacheKey> ForPosition(const std::string& file_path, int64_t position, | ||
| int32_t length, bool is_index); | ||
|
|
||
| public: | ||
| virtual ~CacheKey() = default; | ||
|
|
||
| virtual bool IsIndex() = 0; | ||
| }; | ||
|
|
||
| class PositionCacheKey : public CacheKey { | ||
| public: | ||
| PositionCacheKey(const std::string& file_path, int64_t position, int32_t length, bool is_index) | ||
| : file_path_(file_path), position_(position), length_(length), is_index_(is_index) {} | ||
|
|
||
| bool IsIndex() override; | ||
|
|
||
| int64_t Position() const; | ||
| int32_t Length() const; | ||
|
|
||
| bool operator==(const PositionCacheKey& other) const; | ||
| size_t HashCode() const; | ||
|
|
||
| private: | ||
| static constexpr uint64_t HASH_CONSTANT = 0x9e3779b97f4a7c15ULL; | ||
|
|
||
| const std::string file_path_; | ||
| const int64_t position_; | ||
| const int32_t length_; | ||
| const bool is_index_; | ||
| }; | ||
| } // namespace paimon | ||
|
|
||
| namespace std { | ||
| template <> | ||
| struct hash<paimon::PositionCacheKey> { | ||
| size_t operator()(const paimon::PositionCacheKey& key) const { | ||
| return key.HashCode(); | ||
| } | ||
| }; | ||
| } // namespace std |
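
One point worth noting about the `std::hash<paimon::PositionCacheKey>` specialization: a container keyed by `std::shared_ptr<CacheKey>` does not use it by default, because the standard `std::hash<std::shared_ptr<T>>` and `std::equal_to` work on the stored pointer, not on the pointee. The sketch below shows one possible way to hash and compare through the pointer. `SketchPositionKey`, `DerefHash` and `DerefEqual` are hypothetical names introduced for this example and are not part of the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for PositionCacheKey, reduced to two fields.
struct SketchPositionKey {
    std::string file_path;
    int64_t position;

    bool operator==(const SketchPositionKey& other) const {
        return file_path == other.file_path && position == other.position;
    }

    // Same hash-combine pattern as PositionCacheKey::HashCode in the diff.
    std::size_t HashCode() const {
        constexpr uint64_t kHashConstant = 0x9e3779b97f4a7c15ULL;
        std::size_t seed = 0;
        seed ^= std::hash<std::string>{}(file_path) + kHashConstant + (seed << 6) + (seed >> 2);
        seed ^= std::hash<int64_t>{}(position) + kHashConstant + (seed << 6) + (seed >> 2);
        return seed;
    }
};

// Hash the pointee rather than the pointer value.
struct DerefHash {
    std::size_t operator()(const std::shared_ptr<SketchPositionKey>& k) const {
        return k ? k->HashCode() : 0;
    }
};

// Compare the pointees; two null pointers compare equal.
struct DerefEqual {
    bool operator()(const std::shared_ptr<SketchPositionKey>& a,
                    const std::shared_ptr<SketchPositionKey>& b) const {
        if (!a || !b) {
            return a == b;
        }
        return *a == *b;
    }
};

// Default functors: lookup succeeds only with the very same pointer.
using ByAddress = std::unordered_map<std::shared_ptr<SketchPositionKey>, int>;

// Custom functors: lookup succeeds for any key with equal contents.
using ByContent =
    std::unordered_map<std::shared_ptr<SketchPositionKey>, int, DerefHash, DerefEqual>;
```

With `ByAddress`, two separately allocated keys for the same file and position are distinct entries. With `ByContent`, they map to the same entry.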
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #include "paimon/common/io/cache/cache_manager.h" | ||
|
|
||
| namespace paimon { | ||
|
|
||
| std::shared_ptr<MemorySegment> CacheManager::GetPage( | ||
| std::shared_ptr<CacheKey>& key, | ||
| std::function<Result<MemorySegment>(const std::shared_ptr<CacheKey>&)> reader) { | ||
| auto& cache = key->IsIndex() ? index_cache_ : data_cache_; | ||
| auto supplier = [=](const std::shared_ptr<CacheKey>& k) -> std::shared_ptr<CacheValue> { | ||
| auto ret = reader(k); | ||
| if (!ret.ok()) { | ||
| return nullptr; | ||
| } | ||
| auto segment = ret.value(); | ||
| auto ptr = std::make_shared<MemorySegment>(segment); | ||
| return std::make_shared<CacheValue>(ptr); | ||
| }; | ||
| auto value = cache->Get(key, supplier); | ||
| return value ? value->GetSegment() : nullptr; | ||
| } | ||
|
|
||
| void CacheManager::InvalidPage(std::shared_ptr<CacheKey>& key) { | ||
| if (key->IsIndex()) { | ||
| index_cache_->Invalidate(key); | ||
| } else { | ||
| data_cache_->Invalidate(key); | ||
| } | ||
| } | ||
|
|
||
| } // namespace paimon |
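
Because the supplier in `GetPage` returns `nullptr` when the reader fails, the value coming back from the cache can be null and needs a check before `GetSegment()` is called. The sketch below isolates that path. `SketchSegment`, `SketchValue`, `SketchReader` and `GetPageOrNull` are hypothetical, `std::optional` stands in for `Result<T>`, and the cache lookup itself is left out.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for MemorySegment.
struct SketchSegment {
    std::string bytes;
};

// Hypothetical stand-in for CacheValue.
struct SketchValue {
    std::shared_ptr<SketchSegment> segment;
    std::shared_ptr<SketchSegment> GetSegment() const { return segment; }
};

// std::optional stands in for Result<T>: an empty optional means the read failed.
using SketchReader = std::function<std::optional<SketchSegment>(const std::string&)>;

// Returns nullptr on a failed read instead of dereferencing a null cache value.
std::shared_ptr<SketchSegment> GetPageOrNull(const std::string& key,
                                             const SketchReader& reader) {
    auto supplier = [&reader](const std::string& k) -> std::shared_ptr<SketchValue> {
        auto ret = reader(k);
        if (!ret.has_value()) {
            return nullptr;  // propagate the failure as a null value
        }
        auto segment = std::make_shared<SketchSegment>(std::move(*ret));
        return std::make_shared<SketchValue>(SketchValue{segment});
    };
    // A real cache would sit between the caller and the supplier here.
    auto value = supplier(key);
    if (!value) {
        return nullptr;  // guard before calling GetSegment()
    }
    return value->GetSegment();
}
```

Returning `nullptr` discards the reason for the failure. Whether to surface the reader's status to the caller instead is a design choice this sketch does not address.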