Option to represent scenarios as Reactor Mono/Flux, and execute concurrently in a single thread #2483

segevmalool · 2022-02-11T20:06:08Z

Is your feature request related to a problem?

Cucumber is often used to test remote web services, which involves many http requests. This seems like a great opportunity for integration with a reactive programming framework like Project Reactor or RxJava.

Describe the solution you'd like

When writing step definitions, I want to return Mono or Flux, which I expect to be serially concatenated. I want scenarios to run concurrently. This would speed up my tests a lot (assuming there is some reactive http client being used).

Describe alternatives you've considered

The current solution in cucumber-junit-platform-engine involves process-based parallelism (I think), which isn't as good for http requests with relatively high latency.

Is anyone aware of existing projects for this?

mpkorstanje · 2022-02-11T21:09:13Z

While it would certainly be interesting to do this, it would also require a rewrite of Cucumber's core logic. Currently I don't think this is feasible in the short to medium term.

However, once Project Loom is finished, I expect the JUnit Platform will provide support for lightweight threads and Cucumber will be able to incorporate it with relative ease.

While different from Monos and Fluxes, I expect the results to be comparable.

segevmalool · 2022-02-12T00:06:09Z

Thanks for the response @mpkorstanje !

I'd like to get an idea of what would be required to create a reactive test execution, in case I want to work on it as a side project. I love the reactive interface!

Some things I noticed from digging:

1. CucumberExecutionContext must be instantiated with EventBus, ExitStatus, and RunnerSupplier.
   1.1 The EventBus interface could be implemented with the Reactor library.
   1.2 ExitStatus seems straightforward.
   1.3 RunnerSupplier seems pretty simple.
       1.3.1 Runner must be instantiated with EventBus, a Backend list, ObjectFactory, and Options.
             1.3.1.1 Backend could be implemented.
             1.3.1.2 ObjectFactory could be implemented (or reused?).
             1.3.1.3 Options could be implemented.
2. The JUnit Platform engine refers to the CucumberTestEngine by name.

What are some risks with this? Is there a lot more to it under the hood?

mpkorstanje · 2022-02-13T18:47:13Z

If you're doing this as a proof of concept then I don't see any problems you won't be able to work around.

You may have to deal with the fact that Cucumber doesn't have a complete set of step definitions until after a scenario has started (because of cucumber-java8). So it wouldn't be possible to concatenate all steps and hooks statically. But this is something you could ignore.

However we also have backwards compatibility to keep in mind. Currently all scenarios on the same thread are executed in order. When using reactive programming this assumption doesn't hold.

For example, anything using a thread local to track the current scenario would break. Hence I think the best long-term solution is waiting for Project Loom to land in the JVM.
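The thread-local problem can be reproduced without any reactive library. The following is a minimal sketch, assuming a hypothetical `CurrentScenario` tracker and `Scenario` class (neither is a Cucumber class), in which the steps of two scenarios are interleaved by hand on a single thread:

```kotlin
// Hypothetical tracker that remembers "the scenario running on this thread".
object CurrentScenario {
    private val name = ThreadLocal<String>()
    fun set(value: String) = name.set(value)
    fun get(): String? = name.get()
}

// Hypothetical scenario split into two steps so the steps can be interleaved.
class Scenario(private val id: String) {
    fun firstStep() = CurrentScenario.set(id)
    fun secondStep(): String = "$id sees ${CurrentScenario.get()}"
}

// Runs A.first, B.first, A.second, B.second on the calling thread,
// which is the ordering a same-thread concurrent executor may produce.
fun interleaveOnOneThread(): List<String> {
    val a = Scenario("A")
    val b = Scenario("B")
    a.firstStep()
    b.firstStep() // overwrites the value scenario A stored
    return listOf(a.secondStep(), b.secondStep())
}

fun main() {
    interleaveOnOneThread().forEach(::println)
    // A sees B
    // B sees B
}
```

Scenario A reads scenario B's value because both share the one thread-local slot. This is the assumption that breaks once scenarios stop running strictly one after another per thread.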

burnscr · 2022-02-27T01:31:32Z

Recently I implemented a working proof of concept related to this over the course of a week. I'll add in my two cents in case others find it useful.

I primarily use Cucumber within a Kotlin environment. Kotlin has a beautiful library called kotlinx.coroutines which provides support for creating, launching, and managing the lifecycles of lightweight concurrent coroutines. I decided to see how feasible it would be to rewrite Cucumber's core executor to run scenarios concurrently within the same thread as well as to support step functions with suspending logic (e.g., nonblocking HTTP requests).

Modifications

As @mpkorstanje kindly pointed out, much of Cucumber's core logic is written around the assumption that scenarios will always run one at a time within the context of an executing thread. Because of this, about 30 classes had to be modified to support invoking suspending step functions, concurrent access to supplier instances, proper event reporting for TeamCity, and maintaining the state of individual scenarios. Since I was adding support for a Kotlin library, I made most of these changes in Kotlin (as you'll see in the snippets below).

I apologize in advance for any cringe-worthy design decisions you see below.
This was just a quick-and-dirty proof of concept. ¯\(ツ)/¯

Executor Service

Currently, Cucumber submits all scenarios it wants to run into an executor service within io.cucumber.core.runtime.Runtime#runFeatures(). This is where I created a coroutine scope using the supplied executor service and launched the scenarios (or pickles as they are called internally).

withContext(executor.asCoroutineDispatcher()) {
  for (pickle in picklesToBeRun) {
    launch {
      try {
        executePickle(pickle)
      } catch (...) {
        ...
      }
    }
  }
}

From this point on, all calls branching from executePickle are concurrent and evenly distributed across the threads defined within the executor service. In order to propagate the coroutine context of each scenario to any suspending step functions it has, most functions used to actually invoke the steps (Runtime#executePickle, Runner#runPickle, TestCase#run, and so on) also needed to be converted into suspending functions.

Factory Suppliers

When Cucumber builds the Runtime instance, it provides it with several Supplier classes such as ObjectFactorySupplier and RunnerSupplier to provide each running scenario or thread with its own fancy instance of something. When running in a multithreaded context, these suppliers typically differentiate access by referencing the current thread.

This was no longer adequate since coroutines are not locked to the thread they were started in and multiple can be running on the same thread at a time. Because of this, the Suppliers needed to be modified to track the identity of the running coroutine instead of the current thread. For testing, I came up with the following crude extension function to obtain a unique identifier for each launched coroutine. The Suppliers were then modified to use this in place of thread ids.

fun CoroutineContext.getIdentifier(): String =
    this.job.toString().split("@", limit = 2).last()

I'm sure there is a better way to differentiate between coroutines.
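One possible alternative to parsing `Job.toString()` is to carry an explicit identifier in the coroutine context. The sketch below uses only the Kotlin standard library. `ScenarioId` and `PerScenarioSupplier` are hypothetical names for illustration, not Cucumber or kotlinx.coroutines classes.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext

// Hypothetical context element that names the scenario a coroutine belongs to.
class ScenarioId(val value: String) : AbstractCoroutineContextElement(ScenarioId) {
    companion object Key : CoroutineContext.Key<ScenarioId>
}

// Hypothetical supplier that hands out one instance per scenario id
// instead of one instance per thread.
class PerScenarioSupplier<T : Any>(private val factory: () -> T) {
    private val instances = ConcurrentHashMap<String, T>()

    fun get(context: CoroutineContext): T {
        val id = context[ScenarioId]?.value
            ?: error("No ScenarioId in the coroutine context")
        return instances.computeIfAbsent(id) { factory() }
    }
}

fun main() {
    val supplier = PerScenarioSupplier { StringBuilder() }
    val first = EmptyCoroutineContext + ScenarioId("scenario-1")
    val second = EmptyCoroutineContext + ScenarioId("scenario-2")

    println(supplier.get(first) === supplier.get(first))  // true: same scenario
    println(supplier.get(first) === supplier.get(second)) // false: different scenarios
}
```

With kotlinx.coroutines, such an element could presumably be passed to `launch(...)` as part of its context argument and read back inside a suspending function through `coroutineContext[ScenarioId]`. The identifier then survives the coroutine hopping between threads.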

Invoking Suspending Steps

When a Kotlin function is flagged as suspending, the Kotlin compiler appends a Continuation parameter to the function's JVM signature. When invoking these functions reflectively, a Continuation instance also needs to be provided. I chose to handle this within the io.cucumber.core.runner.PickleStepDefinitionMatch#runStep method just before it checks to see if the correct number of arguments was provided.

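// Assuming Type is java.lang.reflect.Type: rawType only exists on ParameterizedType
fun Type.isContinuation(): Boolean =
    this == Continuation::class.java ||
        (this as? ParameterizedType)?.rawType == Continuation::class.java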

override suspend fun runStep(state: TestCaseState) = suspendCoroutine<Unit> { continuation ->
  val parameterInfos = stepDefinition.parameterInfos()
  
  // attach continuation if method is a suspending function
  if (parameterInfos != null
      && ((arguments.size + 1) == parameterInfos.size)
      && parameterInfos.last().type.isContinuation()) {
    arguments.add(Argument { continuation })
  } else {
    // avoid suspending forever if this is a non-suspending step function
    continuation.resume(Unit)
  }
  
  ...
}

It is worth noting that io.cucumber.core.stepexpression.StepExpressionFactory#createExpression also needed to be modified to check for the added Continuation argument, so that it references the correct data type for DataTable transformations.
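The extra Continuation parameter described in this section can be observed with plain Java reflection. This is a minimal sketch using only the standard library; the `Steps` class and its methods are hypothetical glue code, not part of Cucumber.

```kotlin
import kotlin.coroutines.Continuation

// Hypothetical glue class with one suspending and one regular step function.
class Steps {
    suspend fun fetch(url: String): String = "response from $url"
    fun plain(url: String): String = "response from $url"
}

// Returns the erased JVM parameter types of the named method on Steps.
fun parameterTypes(methodName: String): List<Class<*>> =
    Steps::class.java.declaredMethods
        .first { it.name == methodName }
        .parameterTypes
        .toList()

// A method is treated as suspending when its last JVM parameter is a Continuation.
fun isSuspending(methodName: String): Boolean =
    parameterTypes(methodName).lastOrNull() == Continuation::class.java

fun main() {
    println(parameterTypes("fetch").size) // 2: the url plus the appended Continuation
    println(isSuspending("fetch"))        // true
    println(isSuspending("plain"))        // false
}
```

This is why a reflective caller sees one more parameter than the step expression supplies, and why the argument-count check has to account for it.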

Event Handling

This was one of the more difficult challenges I faced. Cucumber uses the TeamCityPlugin class to listen for important events and display pass/fail results. To work properly, it relies on receiving events in canonical order one scenario at a time. In order to achieve this, I created a new AbstractEventPublisher to receive events, group them by relevant coroutine, and dispatch them correctly once the TestRunFinished event was received.

To identify which coroutine TestCaseEvents were associated with, I added a new required identifier field to the abstract TestCaseEvent class and all of its descendants. All functions that emit these event types were also modified to provide an identifier similarly to what was discussed within the Factory Suppliers section.
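The grouping idea can be sketched roughly as follows. `Event` and `GroupingPublisher` are hypothetical stand-ins, not Cucumber's event classes or the AbstractEventPublisher from the proof of concept. The sketch only shows buffering events per identifier and replaying them one scenario at a time when the run finishes.

```kotlin
// Hypothetical event carrying the identifier of the scenario that emitted it.
data class Event(val scenarioId: String, val message: String)

// Buffers events per scenario and forwards them grouped once the run is over.
class GroupingPublisher(private val downstream: (Event) -> Unit) {
    // LinkedHashMap keeps scenarios in the order their first event arrived.
    private val buffered = LinkedHashMap<String, MutableList<Event>>()

    @Synchronized
    fun publish(event: Event) {
        buffered.getOrPut(event.scenarioId) { mutableListOf() }.add(event)
    }

    @Synchronized
    fun testRunFinished() {
        buffered.values.forEach { events -> events.forEach(downstream) }
        buffered.clear()
    }
}

// Feeds the events through a publisher and records what a listener would see.
fun replay(events: List<Event>): List<String> {
    val received = mutableListOf<String>()
    val publisher = GroupingPublisher { received.add("${it.scenarioId}:${it.message}") }
    events.forEach(publisher::publish)
    publisher.testRunFinished()
    return received
}

fun main() {
    val interleaved = listOf(
        Event("A", "started"), Event("B", "started"),
        Event("A", "finished"), Event("B", "finished"),
    )
    println(replay(interleaved))
    // [A:started, A:finished, B:started, B:finished]
}
```

The trade-off in this sketch is that nothing reaches the listener until the run ends, so live progress reporting is lost in exchange for ordered output.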

Results

Once everything was working properly, the results were outstanding. As a stress test, I created a testing environment with multiple feature files, each with Before/After steps, Backgrounds, Scenarios with multiple passing and failing steps, custom DataTable converters, and Scenario Outlines with hundreds of entries. In total there were over 1000 scenarios, each with steps that take anywhere from 1 to 20 seconds to run. Normally something like this takes several hours to run. However, since the tests ran concurrently, it finished in just under one minute.

Closing Remarks

I learned a lot from this. While I've only scratched the surface of Cucumber's implementation, I want to applaud everyone who has contributed to its framework.

For my specific use case, the performance gains I saw by making these changes were outstanding. What would normally take hours can now finish in minutes without dramatically increasing resource consumption. That being said, Kotlin coroutines would, as their name suggests, only benefit those writing their tests in Kotlin. Supporting them would take a substantial amount of refactoring that mainly benefits a subset of Cucumber users, and it would break backwards compatibility with existing plugins.

Despite this, I would love to see the concept of same-thread concurrent execution discussed further as the performance gained is substantial.

Virtual cookie to whoever took the time to read this: 🍪

segevmalool · 2022-02-27T02:37:19Z

Thanks @burnscr! I wonder if a separate project might be able to repurpose the concepts of Gherkin and step definitions from Cucumber with a totally different test execution system.

mpkorstanje · 2022-03-08T20:18:04Z

Cheers. I'm closing this in favor of waiting for Project Loom.

With respect to performance gains I reckon that these come from three places:

1. Glue code not blocking a thread.
2. Glue code and Cucumber expressions only being discovered and compiled once, rather than per thread/scenario (see "Single step expression is compiled into regex more than once (thus it makes test execution slower)", #2035).
3. No contention around plugin IO (see "Pretty print plugin performance issues", #2481).

I think we'll be able to catch the last two of these without a complete rewrite and maybe even before Project Loom is done.
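As an illustration of the "compiled once" point, a shared cache keyed by the expression source avoids recompiling the same pattern for every thread or scenario. This is only a sketch under simplified assumptions: `CompiledExpressionCache` is a hypothetical class, and plain `Regex` stands in for Cucumber's expression compilation, which is more involved.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical cache: compiles each distinct pattern once and shares the result.
class CompiledExpressionCache {
    private val cache = ConcurrentHashMap<String, Regex>()

    // Counts how many times a pattern was actually compiled.
    val compilations = AtomicInteger()

    fun get(pattern: String): Regex =
        cache.computeIfAbsent(pattern) { source ->
            compilations.incrementAndGet()
            Regex(source)
        }
}

fun main() {
    val cache = CompiledExpressionCache()
    // Simulates many scenarios asking for the same step expression.
    repeat(1000) { cache.get("I have (\\d+) cukes") }
    println(cache.compilations.get()) // 1
}
```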

mpkorstanje closed this as completed Mar 8, 2022
