
Runtimes slow down dramatically proportionally with feature file size #331

Open

tgsmith61591 opened this issue Apr 17, 2024 · 10 comments

Labels: bug (Something isn't working) · k::performance (Related to performance of library) · question (Further information is requested)

Comments

tgsmith61591 commented Apr 17, 2024

We maintain a large collection of feature files that feed nightly regression tests, the runtime of which has grown significantly recently. These are generally maintained in logically separated feature files, and leverage Scenario Outline tables, sometimes with 100-200 scenarios per feature file.

In experimenting with optimizations on a single tag, I tried splitting a feature file into 4 and observed a massive performance gain. Here is my baseline:

[Summary]
1 feature
257 scenarios (226 passed, 31 failed)
1663 steps (1632 passed, 31 failed)
Tests completed in 266 sec

Here are the exact same tests divided across 4 feature files:

[Summary]
4 features
257 scenarios (226 passed, 31 failed)
1663 steps (1632 passed, 31 failed)
Tests completed in 99 sec

For reference, here is how we're running (note that local_test_jobs keeps Bazel from trying to do its own parallelism, instead delegating the concurrency to the cucumber engine):

$ bazel run //my-crate:my-test-suite \
    --local_test_jobs=1 -- \
    --concurrency 16 \
    --tags="@my-cool-tag"

I have several questions after observing this major performance difference:

  • How is concurrency actually affecting the runtime? I was under the impression concurrency was at the scenario level, but now I'm wondering whether it's actually at the feature level
  • Is there any guidance you can give on tuning concurrency to the number of feature files?
  • Is there any further guidance on how tests should be broken up to get the best performance out of the cucumber-rs engine?
tyranron added the bug, k::performance, and question labels Apr 18, 2024
tyranron (Member) commented

@tgsmith61591

  • How is concurrency actually affecting the runtime? I was under the impression concurrency was at the scenario level, but now I'm wondering whether it's actually at the feature level

concurrency is the maximum number of scenarios running concurrently (in an async manner).
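For reference, a minimal sketch of where that knob lives in the builder API (the --concurrency/-c CLI flag controls the same limit; the World type and feature path here are placeholders, and a tokio runtime is assumed):

```rust
use cucumber::World;

#[derive(Debug, Default, World)]
struct MyWorld; // placeholder World, just for illustration

#[tokio::main]
async fn main() {
    // At most 16 scenarios are awaited concurrently on the async runtime;
    // this is the same limit the `--concurrency` CLI flag sets.
    MyWorld::cucumber()
        .max_concurrent_scenarios(16)
        .run("tests/features/")
        .await;
}
```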

  • Is there any guidance you can give on tuning concurrency to the number of feature files?
  • Is there any further guidance on how tests should be broken up to get the best performance out of the cucumber-rs engine?

Actually, you shouldn't have to. The behavior you observe, where splitting source files leads to a significant performance gain, is unexpected and looks like a bug: there shouldn't be any significant difference. It should be investigated and fixed.

The design we have in mind is that a Parser returns a Stream of features, which a Runner consumes and executes concurrently at the scenario level. It seems the current Runner implementation breaks those features up into scenarios in a way that hurts performance.
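For context, that abstraction has roughly this shape (simplified sketch; trait bounds and lifetimes elided, not the verbatim definition):

```rust
// Roughly the shape of `cucumber::Parser`: the parser yields a stream of
// parsed features, and the `Runner` is expected to execute their scenarios
// concurrently as they arrive.
pub trait Parser<I> {
    type Cli: clap::Args;
    type Output: futures::Stream<Item = Result<gherkin::Feature, cucumber::parser::Error>>;

    fn parse(self, input: I, cli: Self::Cli) -> Self::Output;
}
```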

tyranron (Member) commented

@ilslv do you have any suggestions on this?

ilslv (Member) commented Apr 21, 2024

@tgsmith61591 can you share a little more about the characteristics of the testing suite? Is it async- or sync-heavy, or a mix of both? Can you share the World setup: the concurrency setting and other options?

tgsmith61591 (Author) commented

Hey @ilslv and @tyranron, the test suite is very async-heavy. While I can't share the World setup (it's highly complex, and belongs to the company, not me), I can share a small repo I set up to reproduce this issue.

tl;dr

  • Execute 500 scenarios that each sleep for 1 second (a sketch of such a step is below)
  • Example A puts all 500 scenarios into a single file
  • Example B spreads the 500 scenarios over 10 files
  • Example A takes 5:30-6 minutes at -c 32
  • Example B takes <20 seconds at -c 32 (!!!)
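A minimal sketch of the kind of step such a repro runs (the step text and World name are placeholders; a tokio runtime is assumed):

```rust
use std::time::Duration;

use cucumber::{given, World};

#[derive(Debug, Default, World)]
struct ReproWorld;

// Pure async sleep: with `-c 32`, 500 such scenarios should ideally
// finish in roughly 500 / 32 ≈ 16 seconds of wall time.
#[given("the scenario sleeps for 1 second")]
async fn sleep_one_second(_world: &mut ReproWorld) {
    tokio::time::sleep(Duration::from_secs(1)).await;
}
```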

ilslv (Member) commented Apr 23, 2024

Thank you for the reproduction repo! I'll definitely take a look, hopefully this weekend.

tgsmith61591 (Author) commented

Hey @ilslv, have you had a chance to look at this?

ilslv (Member) commented May 1, 2024

not yet, unfortunately 😢

tgsmith61591 (Author) commented

Any update on this issue, @ilslv?

flyingsilverfin commented

We might be able to look into this if you could provide some pointers, @ilslv? We have 1000+ tests that currently eat up around 40 minutes instead of around 5!

dmitrii-ubskii commented

We've come up with a simple workaround: a wrapper around the basic parser that splits each scenario out into its own Gherkin feature:

```rust
use std::{iter, mem, path::Path};

use futures::{future::Either, stream, StreamExt as _};
use gherkin::Feature;

/// Wraps the default `Basic` parser and re-emits each scenario as its own
/// single-scenario feature, so the runner treats every scenario as a
/// separate feature.
#[derive(Debug, Default)]
struct SingletonParser {
    basic: cucumber::parser::Basic,
}

impl<I: AsRef<Path>> cucumber::Parser<I> for SingletonParser {
    type Cli = <cucumber::parser::Basic as cucumber::Parser<I>>::Cli;
    type Output = stream::FlatMap<
        stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
        Either<
            stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
            stream::Iter<iter::Once<Result<Feature, cucumber::parser::Error>>>,
        >,
        fn(
            Result<Feature, cucumber::parser::Error>,
        ) -> Either<
            stream::Iter<std::vec::IntoIter<Result<Feature, cucumber::parser::Error>>>,
            stream::Iter<iter::Once<Result<Feature, cucumber::parser::Error>>>,
        >,
    >;

    fn parse(self, input: I, cli: Self::Cli) -> Self::Output {
        self.basic.parse(input, cli).flat_map(|res| match res {
            Ok(mut feature) => {
                // Split a parsed feature into one single-scenario feature per
                // scenario, suffixing each name with the scenario's own name.
                let scenarios = mem::take(&mut feature.scenarios);
                let singleton_features = scenarios
                    .into_iter()
                    .map(|scenario| {
                        Ok(Feature {
                            name: feature.name.clone() + " :: " + &scenario.name,
                            scenarios: vec![scenario],
                            ..feature.clone()
                        })
                    })
                    .collect::<Vec<_>>();
                Either::Left(stream::iter(singleton_features))
            }
            // Parse errors pass through unchanged as a one-element stream.
            Err(err) => Either::Right(stream::iter(iter::once(Err(err)))),
        })
    }
}
```
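Wiring it in is just a matter of swapping the parser on the builder (a sketch, assuming a World type named MyWorld and the usual run setup):

```rust
// Hypothetical wiring, inside an async main with `MyWorld: cucumber::World`:
MyWorld::cucumber()
    .with_parser(SingletonParser::default())
    .run_and_exit("tests/features/")
    .await;
```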

Before:

[Summary]
1 feature
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2013.41s

After:

[Summary]
702 features
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.86s

dmitrii-ubskii added a commit to typedb/typedb that referenced this issue Jul 22, 2024
## Usage and product changes

Workaround for cucumber-rs/cucumber#331. The wrapper creates a new
feature for each scenario, which sidesteps the runner issue.

On `//tests/behaviour/concept/type:test_owns_annotations`:

Before:
```
[Summary]
1 feature
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2013.41s
```
After:
```
[Summary]
702 features
702 scenarios (702 passed)
41957 steps (41957 passed)
test test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.86s
```