std::slice::partition_point performance regression when using rustc 1.82+ #138796
Closed
Labels
- C-bug (Category: This is a bug.)
- I-slow (Issue: Problems and improvements with respect to performance of generated code.)
- S-has-bisection (Status: A bisection has been found for this issue)
- T-libs (Relevant to the library team, which will review and decide on the PR/issue.)
In https://github.com/fjall-rs/lsm-tree I am heavily using `partition_point` for various binary searches. I found the std implementation to be somewhat slow.
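For readers unfamiliar with the API, here is a minimal sketch of the kind of binary search `partition_point` is typically used for. The helper name `first_at_or_after` and the `u64` key type are assumptions made for illustration; they are not taken from lsm-tree.

```rust
/// Hypothetical helper (not from lsm-tree): returns the index of the first
/// element that is >= `target` in a slice sorted in ascending order.
///
/// `partition_point` returns the index of the first element for which the
/// predicate is false, assuming all `true` elements come before all `false` ones.
pub fn first_at_or_after(sorted: &[u64], target: u64) -> usize {
    sorted.partition_point(|&x| x < target)
}
```

For the sorted keys `[1, 3, 5, 7, 9]`, a target of `5` yields index 2, and a target larger than every key yields the slice length.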
As you can see below, the performance regression happens between rustc 1.81 and 1.82.
Comparing std between rustc versions:
The std implementation performs much worse in rustc 1.82 compared to 1.81.
As a sanity check, I wrote a simple alternative and found it performed much better (comparable to how `partition_point` used to be): https://github.com/fjall-rs/lsm-tree/blob/0dca39eb54690beee2481cc3e31c9fc78d9e3fca/src/binary_search.rs#L9-L35
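A simple alternative of this kind can be sketched as a textbook half-open binary search. This is an illustrative sketch only, not a copy of the linked file; the name `partition_point_simple` is an assumption.

```rust
/// Illustrative sketch of a hand-written `partition_point` replacement.
/// Returns the index of the first element for which `pred` returns false,
/// assuming the slice is partitioned (all `true` elements precede all `false`).
pub fn partition_point_simple<T, F>(slice: &[T], mut pred: F) -> usize
where
    F: FnMut(&T) -> bool,
{
    // Search the half-open range [left, right).
    let mut left = 0;
    let mut right = slice.len();

    while left < right {
        // Midpoint computed without risking overflow of `left + right`.
        let mid = left + (right - left) / 2;

        // `mid < right <= slice.len()`, so the index is always in bounds.
        if pred(&slice[mid]) {
            // Everything up to and including `mid` satisfies the predicate.
            left = mid + 1;
        } else {
            // `mid` is a candidate for the partition point.
            right = mid;
        }
    }

    left
}
```

On a partitioned slice it should return the same index as `slice.partition_point(pred)`, which makes it easy to swap in for a benchmark comparison.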
Reproduce (using monkey patch):
The hand-written binary search performs much better in 1.82+.
Looking at flame graphs, you can see a whole chunk of `PartialOrd` that only occurs when using the std implementation in 1.85: