sort: Optimize sort collation for long lines #12144
mattsu2020 wants to merge 2 commits into uutils:main
Merging this PR will degrade performance by 23.24%
Out of interest, why choose 1 MiB as the limit, rather than something lower like |
Measurements with a 64 KiB limit showed performance at least equivalent on the issue workload, so I'll change the threshold to u16::MAX.
@mattsu2020 Could you also add a benchmark (in a separate PR)?
Sure, I’ll keep this PR focused on the fix and open a separate PR adding a benchmark for long-line locale collation. |
What changed
Why
Fixes #12138. In UTF-8 locales, sort precomputed ICU collation keys for every input line. For inputs with a small number of very large lines, such as 26 lines of 200 MiB each, the cost of generating and storing multi-GiB collation keys dominated runtime.
Impact
Small and normal-sized lines keep the existing precomputed-key fast path. Very long lines skip the expensive key materialization and use locale_cmp when compared.
Validation
cargo check -p uu_sort
cargo test -p uu_sort
cargo test -p coreutils --test tests test_sort::test_default_unsorted_ints -- --exact
cmp for 52 MiB and 130 MiB reproducer inputs.
LC_ALL=en_US.UTF-8 --parallel 1 --buffer-size 8G:
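
For readers who want the shape of the change without reading the diff, the split described under Impact can be sketched roughly as below. This is an illustration only, not the code in this PR: `Line`, `make_key`, `compare`, `sort_lines`, and `KEY_PRECOMPUTE_LIMIT` are hypothetical names, and both `make_key` and the `locale_cmp` body here are ASCII case-folding stand-ins for real ICU collation. The only details taken from the thread are the u16::MAX threshold and the fallback to a direct comparison for long lines.

```rust
use std::cmp::Ordering;

/// Hypothetical limit: lines longer than this skip key precomputation.
/// The u16::MAX value follows the review discussion.
const KEY_PRECOMPUTE_LIMIT: usize = u16::MAX as usize;

/// Stand-in for an ICU sort key (assumption: ASCII lowercase, NOT real collation).
fn make_key(line: &[u8]) -> Vec<u8> {
    line.to_ascii_lowercase()
}

/// Stand-in for `locale_cmp`: yields the same order as comparing `make_key`
/// outputs, but walks the bytes lazily and allocates nothing.
fn locale_cmp(a: &[u8], b: &[u8]) -> Ordering {
    a.iter()
        .map(u8::to_ascii_lowercase)
        .cmp(b.iter().map(u8::to_ascii_lowercase))
}

/// A line plus its optional precomputed key.
struct Line<'a> {
    text: &'a [u8],
    key: Option<Vec<u8>>,
}

impl<'a> Line<'a> {
    fn new(text: &'a [u8]) -> Self {
        // Fast path: materialize the key only for short lines.
        let key = (text.len() <= KEY_PRECOMPUTE_LIMIT).then(|| make_key(text));
        Line { text, key }
    }
}

/// Use the precomputed keys when both sides have one; otherwise compare directly.
fn compare(a: &Line, b: &Line) -> Ordering {
    match (&a.key, &b.key) {
        (Some(ka), Some(kb)) => ka.cmp(kb),
        _ => locale_cmp(a.text, b.text),
    }
}

fn sort_lines<'a>(input: &[&'a [u8]]) -> Vec<&'a [u8]> {
    let mut lines: Vec<Line<'a>> = input.iter().map(|&t| Line::new(t)).collect();
    lines.sort_by(|a, b| compare(a, b));
    lines.into_iter().map(|l| l.text).collect()
}
```

The property that matters in this sketch is that the key comparison and the direct comparison agree on ordering, so a mix of keyed and unkeyed lines still sorts consistently. Memory for keys is then bounded by the threshold times the number of short lines, rather than growing with the size of the largest lines.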