-
Notifications
You must be signed in to change notification settings - Fork 1.1k
chore(core): Refactor SmallString #6138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Vladislav Oleshko <[email protected]>
| tl.small_str_bytes += u_.small_str.Assign(encoded); | ||
| return; | ||
| } | ||
|
|
||
| if (taglen_ == SMALL_TAG && encoded.size() <= u_.small_str.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was the part with encoded.size() <= u_.small_str.size()
BorysTheDev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand a lot, ask somebody else to review too
src/core/small_string.cc
Outdated
| if (size_ == 0) { | ||
| // packed structs can not be tied here. | ||
| // If the allocation is large enough and space efficient, we can avoid allocating | ||
| if (s.size() >= size_ || s.size() * 2 < MallocUsed()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something is wrong with this condition, imho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it's just the inverse of the comment, I'll update it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I think the condition is wrong.
size_ = 20, malloc_used = 28, s.size() = 24 - you reallocate because of the first part. unnecessary. basically you only need to compare with MallocUsed imho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a todo, I can do it now. The code previously didn't allow any strings longer at all to be assigned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the SmallString class to enforce a critical invariant that was previously inconsistently applied: SmallString should only be used for strings with length strictly greater than kPrefLen (10 bytes). The refactoring simplifies code by removing redundant checks, fixes an inefficiency where growing a SmallString would unnecessarily switch to ROBJ_TAG allocation, and consolidates encoding/decoding logic in CompactObj::GetString().
Key changes:
- Enforced
size > kPrefLeninvariant throughoutSmallStringwith a DCHECK inAssign()and removed all defensivesize <= kPrefLenchecks - Fixed
EncodeString()to properly reuseSmallStringallocation when growing, instead of falling back toROBJ_TAG - Refactored
GetString()to unify encoding logic for bothROBJ_TAGandSMALL_TAGcases
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/core/small_string.h | Updated API: renamed GetV() to Get() with new overloads, changed size() return type to size_t, removed Reset() method, updated documentation to clarify usage for strings > 10 bytes |
| src/core/small_string.cc | Simplified Assign() reallocation logic, streamlined Free()/MallocUsed()/Equal()/Get() methods by removing size <= kPrefLen checks, refactored Get() with new overloads, added CanAllocate check in DefragIfNeeded() |
| src/core/compact_object.cc | Fixed EncodeString() to reuse SMALL_TAG when growing instead of switching to ROBJ_TAG, consolidated encoding/decoding logic in GetString() to handle both ROBJ_TAG and SMALL_TAG uniformly, removed kUseSmallStrings constant |
src/core/small_string.cc
Outdated
|
|
||
| if (size_ == 0) { | ||
| // packed structs can not be tied here. | ||
| // If the allocation is large enough and space efficient, we can avoid allocating |
Copilot
AI
Dec 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The comment placement is confusing. It describes when we avoid reallocating (the else branch), but is placed before the if statement. Consider moving it above line 69 (before the else block) or rephrasing to describe the if condition: "Reallocate if growing or if space-inefficient (new size < half of allocated space)"
| // If the allocation is large enough and space efficient, we can avoid allocating | |
| // Reallocate if growing or if space-inefficient (new size < half of allocated space) |
|
I use mi_good_size now to check if we need to grow. This is a logical change in this PR |
Relevant to #6137
CompactObj::GetString()part with encodings