Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

speedup POE::Filter::Line by reducing usage of backtracking regex #19

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

apocalypse
Copy link
Collaborator

Hello,

I was playing with the SSLify testsuite in an attempt to improve it and stumbled on something totally
unexpected! The t/simple_parallel_superbig.t test sends a ~3MB packet via Server/Client::TCP and Filter::Line. I rushed out SSLify with that test but I did notice that it took a while to run (~180s on my beefy box!) and it made me feel that something was fishy.

So I brought out the big guns and ran NYTProf on the test.

apoc@box:~/eclipse_ws/perl-poe-sslify$ PERL_MM_USE_DEFAULT=1 AUTOMATED_TESTING=1 NYTPROF=blocks=1:calls=2 perl -d:NYTProf t/simple_superbig.t

Here's the NYTProf output:

http://htmlpreview.github.io/?https://github.com/apocalypse/perl-poe-sslify/blob/speedup_sslify/nytprof_LINE/index.html

Thanks to NYTProf, the slowdown was somewhere in POE::Filter::Line. I immediately jumped to the
conclusion that POE::Filter::Line was not really suited for my purposes and decided to re-run the test with
POE::Filter::Block utilizing BlockSize => length($str). The results was surprising!

http://htmlpreview.github.io/?https://github.com/apocalypse/perl-poe-sslify/blob/speedup_sslify/nytprof_BLOCK/index.html

For starters, the runtime went from 183s to 3s in my crude testing! The difference was just too great
for me to let Filter::Line lose to Filter::Block! I dug deeper into the source for Filter::Line and based
on the NYTProf diagnostics it was THIS line taking up 180s out of the total runtime of 183s which was crazy. Line 155 is:

unless $self->[FRAMING_BUFFER] =~ s/^(.*?)$self->[INPUT_REGEXP]//s;

You'll note that it's a super-backtracking regex that when confronted with a 3MB string will take
exponential time trying to match the INPUT_REGEXP! I remember hitting weird regex bugs in the past which looked like this and then the answer dawned on me. The solution is to simply avoid doing the costly backtracking unless necessary. Here's the oneliner patch :)

Insert before lines 154/155

last LINE unless $self->[FRAMING_BUFFER] =~ $self->[INPUT_REGEXP];

The runtime went from 183s to 11.5s which is a speed-up of almost 16x! It's not reaching the 3s that
Filter::Block was able to achieve, but I doubt I can speed it up more without calling in the regex wizards :)

http://htmlpreview.github.io/?https://github.com/apocalypse/perl-poe-sslify/blob/speedup_sslify/nytprof_LINE_FIXED/index.html

With grep.cpan.me I can see ~130 dists using POE::Filter::Line so I hope this will have a meaningful
impact for some of them. Thanks again for looking into this :)

http://grep.cpan.me/?q=POE%3A%3AFilter%3A%3ALine

@rcaputo
Copy link
Owner

rcaputo commented Nov 14, 2014

I'll have to look at this in more depth, probably next week. Something clever might be possible with substr() and pos() after the match you've added, to eliminate the substitution entirely.

@apocalypse
Copy link
Collaborator Author

I thought about this more and tweaked it a bit :)

The issue here is that I can't entirely avoid the regex because it might be needed when using InputLiteral or InputRegexp another than the default. I think this collection of patches is the best I can do, please let me know if you have more feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants