Runtimes slow down dramatically proportionally with feature file size #331

tgsmith61591 · 2024-04-17T18:47:40Z

We maintain a large collection of feature files that feed nightly regression tests, the runtime of which has grown significantly recently. These are generally maintained in logically separated feature files, and leverage Scenario Outline tables, sometimes with 100-200 scenarios per feature file.

In experimenting with optimizations on a single tag, I tried splitting a feature file into 4 and observed a massive performance gain. Here is my baseline:

[Summary]
1 feature
257 scenarios (226 passed, 31 failed)
1663 steps (1632 passed, 31 failed)
Tests completed in 266 sec

Here are the same exact tests divided over 4 feature files:

[Summary]
4 features
257 scenarios (226 passed, 31 failed)
1663 steps (1632 passed, 31 failed)
Tests completed in 99 sec

For reference, here is how we're running (note that local_test_jobs keeps Bazel from trying to do its own parallelism, instead delegating the concurrency to the cucumber engine):

$ bazel run //my-crate:my-test-suite \
    --local_test_jobs=1 -- \
    --concurrency 16 \
    --tags="@my-cool-tag"

Several questions I have after observing this major performance difference:

How is concurrency actually affecting the runtime? I was under the impression concurrency was at the scenario level, but now I'm wondering whether it's actually at the feature level
Is there any guidance you can give on tuning concurrency to the number of feature files?
Is there any further guidance on how tests should be broken up to get the best performance out of the cucumber-rs engine?

The text was updated successfully, but these errors were encountered:

tyranron · 2024-04-18T12:50:28Z

@tgsmith61591

How is concurrency actually affecting the runtime? I was under the impression concurrency was at the scenario level, but now I'm wondering whether it's actually at the feature level

concurrency is the maximum number of scenarios running concurrently (in async manner).

Is there any guidance you can give on tuning concurrency to the number of feature files?

Is there any further guidance on how tests should be broken up to get the best performance out of the cucumber-rs engine?

Actually, you shouldn't. The behavior you observe, that splitting source files leads to a significant performance gain, seems to be buggy, weird and unexpected. There shouldn't be any significant difference. Should be investigated and fixed.

The idea we have in mind for now is that a Parser returns a Stream of features being consumed by a Runner, executing them concurrently on scenario level. Seems like the current Runner implementation breaks up those features into scenarios in some weird manner, affecting the performance.

tyranron · 2024-04-18T12:53:34Z

@ilslv do you have any suggestions on this?

ilslv · 2024-04-21T20:30:51Z

@tgsmith61591 can you share a little bit more about the characteristics of the testing suite? Is it async or sync heavy, or maybe appropriately both? Can you share the World setup: number of concurrency and other options?

tgsmith61591 · 2024-04-23T19:15:44Z

Hey @ilslv and @tyranron, the test suite is very async heavy. While I cannot share the world setup (highly complex, and belongs to the company, not me) I can share a small repo I set up to reproduce this issue.

tl;dr

Execute 500 scenarios that sleep for 1 second
Example A puts 500 scenarios into a single file
Example B spreads 500 scenarios over 10 files
Example A takes 5:30 - 6 minutes at -c 32
Example B takes <20 seconds at -c 32 (!!!)

ilslv · 2024-04-23T19:19:36Z

Thank you for the reproduction repo! I'll definitely take a look hopefully this weekend.

tgsmith61591 · 2024-04-30T18:06:29Z

Hey @ilslv, any chance you got a chance to look at this?

ilslv · 2024-05-01T08:51:42Z

not yet, unfortunately 😢

tgsmith61591 · 2024-05-21T17:39:43Z

Any update on this issue @ilslv ?

flyingsilverfin · 2024-07-19T16:33:10Z

We might be able to take a look into this if you could provide any pointers @ilslv ? We have 1000 tests or more that currently eat up around 40 minutes instead of around 5 minutes!

dmitrii-ubskii · 2024-07-22T16:20:45Z

We've come up with a simple workaround using a wrapper around the basic parser that splits each scenario into its own gherkin feature:

#[derive(Debug, Default)]
struct SingletonParser {
    basic: cucumber::parser::Basic,
}

impl<I: AsRef<Path>> cucumber::Parser<I> for SingletonParser {
    type Cli = <cucumber::parser::Basic as cucumber::Parser<I>>::Cli;
    type Output = stream::FlatMap<
        stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
        Either<
            stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
            stream::Iter<iter::Once<Result<Feature, cucumber::parser::Error>>>,
        >,
        fn(
            Result<Feature, cucumber::parser::Error>,
        ) -> Either<
            stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
            stream::Iter<iter::Once<Result<Feature, cucumber::parser::Error>>>,
        >,
    >;

    fn parse(self, input: I, cli: Self::Cli) -> Self::Output {
        self.basic.parse(input, cli).flat_map(|res| match res {
            Ok(mut feature) => {
                let scenarios = mem::take(&mut feature.scenarios);
                let singleton_features = scenarios
                    .into_iter()
                    .map(|scenario| {
                        Ok(Feature {
                            name: feature.name.clone() + " :: " + &scenario.name,
                            scenarios: vec![scenario],
                            ..feature.clone()
                        })
                    })
                    .collect_vec();
                Either::Left(stream::iter(singleton_features))
            }
            Err(err) => Either::Right(stream::iter(iter::once(Err(err)))),
        })
    }
}

Before:

[Summary]
1 feature
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2013.41s

After:

[Summary]
702 features
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.86s

## Usage and product changes Workaround for cucumber-rs/cucumber#331 (cucumber-rs/cucumber#331). The wrapper creates a new feature for each scenario, which sidesteps the runner issue. On `//tests/behaviour/concept/type:test_owns_annotations`: Before: ``` [Summary] 1 feature 702 scenarios (702 passed) 41957 steps (41957 passed) test test ... ok test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2013.41s ``` After: ``` [Summary] 702 features 702 scenarios (702 passed) 41957 steps (41957 passed) test test ... ok test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.86s ```

tyranron added bug Something isn't working k::performance Related to performance of library question Further information is requested labels Apr 18, 2024

farost mentioned this issue Jul 19, 2024

Schema commit time validations typedb/typedb-behaviour#290

Merged

dmitrii-ubskii mentioned this issue Jul 22, 2024

Gherkin parser override: create feature per scenario typedb/typedb#7107

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtimes slow down dramatically proportionally with feature file size #331

Runtimes slow down dramatically proportionally with feature file size #331

tgsmith61591 commented Apr 17, 2024 •

edited

Loading

tyranron commented Apr 18, 2024

tyranron commented Apr 18, 2024

ilslv commented Apr 21, 2024

tgsmith61591 commented Apr 23, 2024

ilslv commented Apr 23, 2024

tgsmith61591 commented Apr 30, 2024

ilslv commented May 1, 2024

tgsmith61591 commented May 21, 2024

flyingsilverfin commented Jul 19, 2024

dmitrii-ubskii commented Jul 22, 2024

Runtimes slow down dramatically proportionally with feature file size #331

Runtimes slow down dramatically proportionally with feature file size #331

Comments

tgsmith61591 commented Apr 17, 2024 • edited Loading

tyranron commented Apr 18, 2024

tyranron commented Apr 18, 2024

ilslv commented Apr 21, 2024

tgsmith61591 commented Apr 23, 2024

tl;dr

ilslv commented Apr 23, 2024

tgsmith61591 commented Apr 30, 2024

ilslv commented May 1, 2024

tgsmith61591 commented May 21, 2024

flyingsilverfin commented Jul 19, 2024

dmitrii-ubskii commented Jul 22, 2024

tgsmith61591 commented Apr 17, 2024 •

edited

Loading