-
Notifications
You must be signed in to change notification settings - Fork 379
Open
Labels
Description
Per the Ruby 2.4.0 docs:
The current implementation of sort_by generates an array of tuples containing the original collection element and the mapped value. This makes sort_by fairly expensive when the keysets are simple.
Here's a concrete example showing sort
to be 2.70x faster in Ruby 1.9.3, 2.3.3 and 2.4.0:
require 'benchmark/ips'
Benchmark.ips do |x|
x.time = 5
x.warmup = 2
ARRAY = %w{apple pear fig}
x.report("sort_by") do
ARRAY.sort_by(&:length)
end
x.report("sort") do
ARRAY.sort { |a, b| a.length <=> b.length}
end
x.compare!
end
Warming up --------------------------------------
sort_by 56.348k i/100ms
sort 111.946k i/100ms
Calculating -------------------------------------
sort_by 635.646k (±16.1%) i/s - 3.099M in 5.074036s
sort 1.713M (±16.3%) i/s - 8.284M in 5.023904s
Comparison:
sort: 1713232.1 i/s
sort_by: 635645.9 i/s - 2.70x slower
Metadata
Metadata
Assignees
Labels
Type
Projects
Milestone
Relationships
Development
Select code repository
Activity
mblumtritt commentedon Jan 24, 2017
In your sample you sort the
ARRAY
twice - it's already sorted when the second sample runs. The results may be therefore wrong…monfresh commentedon Jan 24, 2017
I don't follow. ARRAY is not mutated. If you call
ARRAY.sort_by(&:length)
, and then callARRAY
again, you get the original ARRAY. You can verify this by adding.freeze
to the end of the ARRAY assignment (ARRAY = %w(apple pear fig).freeze
)I get the same results when I change the order of the samples:
monfresh commentedon Jan 24, 2017
The code example in this repo also does the same thing: it defines the array to be used in the samples as a constant: https://github.com/JuanitoFatas/fast-ruby/blob/master/code/enumerable/sort-vs-sort_by.rb
avellable commentedon Jan 25, 2017
@monfresh Thanks for pointing this out. There are two ways to resolve into the current example,
@JuanitoFatas Any comments about above?
mblumtritt commentedon Jan 25, 2017
@monfresh Sorry - I used in my local sample
sort!
andsort_by!
and did ignore your code details. Sorry …abuisman commentedon Aug 15, 2019
I just ran into a situation where
sort_by
is not faster, per se, in fact 2.95x slower. But it has something to do with the number of items in the array.Yields
This is with 2 items in the result. When I bump this to 100:
Apparently there is a correlation between the number of items and the performance difference, which makes sense of course. Adding items to the array brings us closer to the 'true' performance difference between
sort
andsort_by
.So lets bump to 1000:
So at 1000 items
sort_by
becomes faster.I don't know how we'd go about this, but adding a note about the expected number of items in the array would make sense.
luxangel777 commentedon Aug 22, 2020
🙏