Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad CucumberExpression creation performance #200

Closed
jkronegg opened this issue Dec 27, 2022 · 6 comments · Fixed by #202
Closed

Bad CucumberExpression creation performance #200

jkronegg opened this issue Dec 27, 2022 · 6 comments · Fixed by #202
Assignees

Comments

@jkronegg
Copy link
Contributor

jkronegg commented Dec 27, 2022

👓 What did you see?

On my project with about 150 stepdefs and about 400 test scenarios, the IntelliJ profiler says the CucumberExpression.<init> method takes 25.9% of the total CPU time. This is because the method is called for all step defs and for all test scenarios. I think the performance could be better.

✅ What did you expect to see?

I expect CucumberExpression.<init> to avoid unnecessary processing (contributes to #2035).

I understand that cucumber-java8 can introduce dynamic behavior which requires parsing the expressions for each test scenario. However, I think we can safely cache everything that is constant and does not depend on cucumber-java8. I identitifed the following performance improvement points in CucumberExpression:

  • TreeRegex creation: in CucumberExpression constructor, this object serves to get some "metadata" about a regular expression itself (i.e. not depending on context). Thus, two identical regular expressions will lead to the same TreeRegp, so the creation is cacheable.

    The original code:

    this.treeRegexp = new TreeRegexp(pattern);
    

    could be replaced by (treeRegexps is a static Map<String, TreeRegexp>):

    this.treeRegexp = treeRegexps.computeIfAbsent(pattern, TreeRegexp::new);
    
  • calls to escapeRegex in the rewriteToRegex method are done on the Node.text() content: two identical Node.text() will lead to the same escaped result, independently of the context. Thus, the result of escapeRegex is cacheable.

    The original code:

    return escapeRegex(node.text());
    

    can be replaced by (escapedTexts is a static Map<String, String>):

    return escapedTexts.computeIfAbsent(node.text(), CucumberExpression::escapeRegex);
    

These two optimization points lead to four combinations to be benchmarked (original version is createExpression0). The benchmark consists in creating 400 times five different expressions:

Benchmark cached calls to escapeRegex cached TreeRegex creation ops/s
CucumberExpressionBenchmark.createExpression0 no no 153,024 ± 13,800
CucumberExpressionBenchmark.createExpression1 yes no 181,960 ± 12,133
CucumberExpressionBenchmark.createExpression2 no yes 186,236 ± 11,232
CucumberExpressionBenchmark.createExpression3 yes yes 219,890 ± 12,365

Caching the TreeRegex creation lead to 22% performance improvement and using both methods lead to 44% performance improvement.

On a real project with about 150 stepdefs and 400 test scenarios, the IntelliJ Profiler runs is about 7700 ms and says that CucumberExpression.<init> is:

  • 25.9% of the total CPU time with the original version (1994 ms)
  • 15.7% of the total CPU time with both optimizations enabled (1209 ms, i.e. that's a 785 ms improvement on total time, or 10%)

I suggest to use the variant createExpression3 and I would be happy to propose a PR.

📦 Which tool/library version are you using?

Cucumber 7.10.1

🔬 How could we reproduce it?

The benchmark with the four variants is in
cucumberexpressions.zip

Steps to reproduce the behavior:

  1. Create a Maven project with the following dependencies:

     <dependency>
         <groupId>io.cucumber</groupId>
         <artifactId>cucumber-java</artifactId>
         <version>${cucumber.version}</version>
         <scope>test</scope>
     </dependency>
     <dependency>
         <groupId>io.cucumber</groupId>
         <artifactId>cucumber-junit-platform-engine</artifactId>
         <version>${cucumber.version}</version>
         <scope>test</scope>
     </dependency>
     <dependency>
         <groupId>io.cucumber</groupId>
         <artifactId>cucumber-picocontainer</artifactId>
         <version>${cucumber.version}</version>
         <scope>test</scope>
     </dependency>
     <dependency>
         <groupId>org.openjdk.jmh</groupId>
         <artifactId>jmh-generator-annprocess</artifactId>
         <version>1.36</version>
         <scope>test</scope>
     </dependency>
    
  2. Run the benchmark

@mpkorstanje mpkorstanje self-assigned this Dec 27, 2022
@mpkorstanje
Copy link
Contributor

mpkorstanje commented Dec 27, 2022

I would welcome a pull request for this, but the cache(s) should not be static.

Consider extracting the code from the CucumberExpression constructor to a CucumberExpressionFactory. This factory could then be instantiated in the ExpressionFactory with a cache instance. This means that once the ExpressionFactory goes out of scope, and no more expressions are created, the cache will be garbage collected.

Note that the RegularExpression also create a TreeRegex. The same optimization may be useful there.

Please also update the documentation for the TreeRegex and GroupBuilder to note that they should be thread-safe.

@mpkorstanje
Copy link
Contributor

mpkorstanje commented Dec 27, 2022

Ah. Please disregard my previous comment.

The ExpressionFactory is instantiated with a ParameterTypeRegistry which contains scenario specific state. While we could share the cache inside a Runner, I need to think about this. At some point it becomes easier to remove cucumber-java8 and then refactor cucumber-core instead.

@jkronegg
Copy link
Contributor Author

I don't see the issue with ParameterTypeRegistry : the proposed solution only cache things that are "really constant" (i.e. that are not impacted by cucumber-java8).

I don't think the static cache will cause a real memory usage issue because on my project with ~150 stepdefs and ~400 test scenarios, the TreeRegex cache has only 117 elements and the escapedTexts cache has only 205 elements. However, if you want to replace the static caching, the ExpressionFactory could create a Runner which holds the TreeRegex cache. This Runner would be passed to RegularExpression and CucumberExpression constructors (small API change), with the advantage to have only one TreeRegex cache instead of two (one in RegularExpression and CucumberExpression). The Runner could hold the escapedTexts as well.

Caching the TreeRegex in RegularExpression makes the constructor 4 times faster (JMH benchmark: uncached=7'000 ops/s, cached=28'000 ops/s). On my project with 150 stepdefs and 400 test scenarios, this makes no impact because the RegularExpression.<init> was contributing to only 0.34% of the total CPU time. But this may be different with other projects.

Good point on thread-safety. TreeRegex looks thread-safe. GroupBuilder.add() method is not thread-safe because it calls ArrayList.add(), but I don't think this is an issue as GroupBuilder.add() is only called when the GroupBuilder is constructed in TreeRegex.createGroupBuilder(). Maybe the GroupBuilder could be modified to remove the add() method with a Builder pattern in order to ensure thread-safety (TreeRegex.createGroupBuilder() could be moved to this Builder for a better separation of concerns).

@mpkorstanje
Copy link
Contributor

mpkorstanje commented Dec 28, 2022

I don't see the issue with ParameterTypeRegistry : the proposed solution only cache things that are "really constant" (i.e. that are not impacted by cucumber-java8).

I was hoping to keep a non-static cache in the ExpressionFactory and then retain a reference to the factory in Cucumber-JVM. But this isn't feasible right now. While I'm not too worried about leaking memory this way, not having statics anywhere makes it much easier to reason about (concurrent) code.

And while this proposal does make steps towards cucumber/cucumber-jvm#2035 it would become mostly obsolete the moment we drop cucumber-java8 and build every cucumber expression exactly once. So I would rather spend time working towards improvement that don't become obsolete in the future.

Would it be possible to create an optimization for escapeRegex that doesn't depend on caching?

@jkronegg
Copy link
Contributor Author

I'll try to optimize escapeRegex without caching (I'm thinking about a custom String.replace(String,String) method, but this would require more effort and the code would be more complex than a simple regexp).

@jkronegg
Copy link
Contributor Author

jkronegg commented Dec 29, 2022

I've refactored the escapeRegex method with two different methods inspired by String.replace(String,String):

Method Description ops/s
escapeRegex0 The original method CucumberExpression.escapeRegex 1'200'000
escapeRegex1 Escaping character by character 8'300'000
escapeRegex2 Escaping characters by blocs 6'300'000

The escapeRegex1 is 7 times faster than the original version and is implemented as follows:

/**
 * List of characters to be escaped.
 * The last char is '}' with index 125, so we need only 126 characters.
 */
public static final boolean[] CHAR_TO_ESCAPE = new boolean[126];
static {
    CHAR_TO_ESCAPE['^'] = true;
    CHAR_TO_ESCAPE['$'] = true;
    CHAR_TO_ESCAPE['['] = true;
    CHAR_TO_ESCAPE[']'] = true;
    CHAR_TO_ESCAPE['('] = true;
    CHAR_TO_ESCAPE[')'] = true;
    CHAR_TO_ESCAPE['{'] = true;
    CHAR_TO_ESCAPE['}'] = true;
    CHAR_TO_ESCAPE['.'] = true;
    CHAR_TO_ESCAPE['|'] = true;
    CHAR_TO_ESCAPE['?'] = true;
    CHAR_TO_ESCAPE['*'] = true;
    CHAR_TO_ESCAPE['+'] = true;
    CHAR_TO_ESCAPE['\\'] = true;
}

public static String escapeRegex1(String text) {
    int length = text.length();
    StringBuilder sb = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
        char currentChar = text.charAt(i);
        if (currentChar < CHAR_TO_ESCAPE.length && CHAR_TO_ESCAPE[currentChar]) {
            sb.append('\\');
        }
        sb.append(currentChar);
    }
    return sb.toString();
}

The code is pretty simple and elegant.

The escapeRegex1 is implemented as follows:

// CHAR_TO_ESCAPE ommited for brivety (same as above)

public static String escapeRegex2(String text) {
    int length = text.length();
    StringBuilder sb = new StringBuilder(length);
    int blocStart=0;
    for (int i = 0; i < length; i++) {
        char currentChar = text.charAt(i);
        if (currentChar < CHAR_TO_ESCAPE.length && CHAR_TO_ESCAPE[currentChar]) {
            if (i > blocStart) {
                // flush previous block
                sb.append(text, blocStart, i);
            }
            sb.append('\\');
            sb.append(currentChar);
            blocStart=i+1;
        }
    }
    if (length > blocStart) {
        // flush remaining characters
        sb.append(text, blocStart, length);
    }
    return sb.toString();
}

The code is a bit more complex and slower than escapeRegex1.

The full CucumberExpression benchmark is now (the ops/s cannot be compared with the ones from the original post because this is not on the same machine):

Benchmark cached calls to escapeRegex cached TreeRegex creation ops/s
CucumberExpressionBenchmark.createExpression0 no no 97
CucumberExpressionBenchmark.createExpression1 yes no 101
CucumberExpressionBenchmark.createExpression2 no yes 113
CucumberExpressionBenchmark.createExpression3 yes yes 119
CucumberExpressionBenchmark.createExpression4 no (optimized escapeRegex1) no 104

So the escapeRegex1 lead to a total 7% improvement over the curent implementation, in the same range as the variant with cached calls to escapeRegex. This is not significant right now (as we escape the same regexp several times), but may become significant the day we get rid of cucumber-java8.

Should I do a PR with only this optimization with escapeRegex1 (i.e. without caching the TreeRegex and without caching calls to escapeRegex) ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants