str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count depending on the string size #220

Open
ilyazub opened this issue Sep 11, 2023 · 6 comments


ilyazub commented Sep 11, 2023

str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count (ref: https://serpapi.com/blog/lines-count-failed-deployments/). The speed difference grows with the number of lines.

$ ruby tmp/string_count_benchmark.rb
Warming up --------------------------------------
  String#count('\n')    86.000  i/100ms
   String#lines.size     1.000  i/100ms
  String#lines.count     1.000  i/100ms
String#each_line.count
                         1.000  i/100ms
Calculating -------------------------------------
  String#count('\n')    771.031  (± 6.6%) i/s -      3.870k in   5.041849s
   String#lines.size      4.785  (± 0.0%) i/s -     24.000  in   5.037242s
  String#lines.count      4.513  (± 0.0%) i/s -     23.000  in   5.112095s
String#each_line.count
                          4.763  (± 0.0%) i/s -     24.000  in   5.075882s

Comparison:
  String#count('\n'):      771.0 i/s
   String#lines.size:        4.8 i/s - 161.12x  (± 0.00) slower
String#each_line.count:        4.8 i/s - 161.87x  (± 0.00) slower
  String#lines.count:        4.5 i/s - 170.86x  (± 0.00) slower

Benchmark code:

require "benchmark/ips"

HTML = "\nruby\n" * 1024 * 1024

def fastest
  HTML.count("\n")
end

# Each method body below matches the label it is reported under.
def faster
  HTML.lines.size
end

def fast
  HTML.lines.count
end

def slow
  HTML.each_line.count
end

Benchmark.ips do |x|
  x.report("String#count('\\n')")     { fastest }
  x.report("String#lines.size")       { faster  }
  x.report("String#lines.count")      { fast    }
  x.report("String#each_line.count")  { slow    }
  x.compare!
end
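
One caveat worth checking before swapping the methods: for the benchmark string above all of them return the same number, but in general str.count("\n") counts newline characters while str.lines.size counts lines, so they differ by one when the string does not end with a newline. A minimal sketch of that difference (the helper names newline_count and line_count are hypothetical, used only for illustration):

```ruby
# Hypothetical helpers, named only for this illustration.

# Counts newline characters in the string.
def newline_count(str)
  str.count("\n")
end

# Counts lines; a final line without a trailing newline still counts.
def line_count(str)
  str.lines.size
end

with_trailing    = "a\nb\n"
without_trailing = "a\nb"

puts newline_count(with_trailing)     # newline characters in "a\nb\n"
puts line_count(with_trailing)        # lines in "a\nb\n"
puts newline_count(without_trailing)  # newline characters in "a\nb"
puts line_count(without_trailing)     # lines in "a\nb"
```

If the input may lack a trailing newline, the result of str.count("\n") probably needs adjusting by one to stay a drop-in replacement.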

I'd like to add this benchmark to fast-ruby. Wdyt?


Based on our updates to @guilhermesimoes' very helpful gist: https://gist.github.com/guilhermesimoes/d69e547884e556c3dc95?permalink_comment_id=4687645#gistcomment-4687645

Author

ilyazub commented Oct 26, 2023

@JuanVqz @etagwerker what do you think about a benchmark for String#count vs String#lines.count vs String#each_line.count?

Member

JuanVqz commented Oct 26, 2023

This seems like a Rails-related benchmark. I wonder whether we are adding framework-related benchmarks.

Author

ilyazub commented Oct 27, 2023

It doesn't depend on Rails (ruby/ruby#4001 (comment)). I updated the benchmark code to work with plain Ruby.

Author

ilyazub commented Oct 31, 2023

@JuanVqz I updated the benchmark code above to work with plain Ruby.

What do you think?

Collaborator

ixti commented Nov 21, 2023

IMHO using $/ is bad: it's less obvious than "\n". More than that, $/ can be redefined, so the code may break.

@ilyazub ilyazub changed the title str.count($/) is 1.3x faster than str.lines.count or str.each_line.count str.count("\n") is 1.3x faster than str.lines.count or str.each_line.count Nov 21, 2023
Author

ilyazub commented Nov 21, 2023

@ixti Sounds good. The performance difference is between String#count and the other methods. $/ vs "\n" didn't impact performance here.
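
To see how the gap changes with string size, something like the rough sketch below could be used. It is not the benchmark-ips run from the issue: the elapsed helper, the SIZES values, and the RESULTS constant are assumptions made for illustration, and single-shot wall-clock timings are noisy, so the numbers only hint at the trend.

```ruby
# Hypothetical helper: wall-clock seconds taken by the block,
# measured with the monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Assumed repetition counts for the "\nruby\n" fragment; adjust as needed.
SIZES = [1, 1024, 64 * 1024]

# One timing per method per size (a single run, so expect noise).
RESULTS = SIZES.map do |n|
  str = "\nruby\n" * n
  {
    repeats: n,
    count_newline: elapsed { str.count("\n") },
    lines_size: elapsed { str.lines.size },
    each_line_count: elapsed { str.each_line.count }
  }
end

RESULTS.each do |r|
  puts format(
    "%8d repeats: count=%.6fs lines.size=%.6fs each_line.count=%.6fs",
    r[:repeats], r[:count_newline], r[:lines_size], r[:each_line_count]
  )
end
```

For numbers worth publishing, the same sizes could be plugged into the benchmark-ips script above instead.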

@ilyazub ilyazub changed the title str.count("\n") is 1.3x faster than str.lines.count or str.each_line.count str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count depending on the string size Nov 21, 2023