
Improve performance when running a large number of JUnit tests #172

Open
swarren12 opened this issue Feb 13, 2024 · 1 comment
Labels
P2 We'll consider to work on this in future. (Assignee optional)

Comments


swarren12 commented Feb 13, 2024

Title is a bit vague here, apologies for that.

Out of the box, it is only possible to provide a single entry point to java_test, e.g.

java_test(
    name = "com.example.MyLovelyUnitTest",
    test_class = "com.example.MyLovelyUnitTest",
    srcs = [ ... ],
    # etc etc
)

If one wishes to make a target from multiple classes, there are currently two well-publicised workarounds:

  1. Use a macro wrapper;
  2. Use a @Suite or similar.
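
For reference, here is a minimal sketch of the macro-wrapper approach (workaround 1). The macro name, the source-root prefix, and the class-name derivation are all illustrative assumptions, not something prescribed by the issue:

```starlark
# java_tests.bzl -- illustrative macro; assumes sources live under src/test/java/
# and that each file defines one test class matching its path.
def java_tests(name, srcs, deps = [], **kwargs):
    """Expands a list of *Test.java sources into one java_test per class,
    plus an aggregate test_suite so `bazel test :name` runs them all."""
    tests = []
    for src in srcs:
        # "src/test/java/com/example/MyLovelyUnitTest.java"
        #   -> "com.example.MyLovelyUnitTest"
        cls = src[len("src/test/java/"):-len(".java")].replace("/", ".")
        native.java_test(
            name = cls,
            test_class = cls,
            srcs = [src],
            deps = deps,
            **kwargs
        )
        tests.append(":" + cls)

    native.test_suite(name = name, tests = tests)
```

A caller would typically invoke it as `java_tests(name = "all_tests", srcs = glob(["src/test/java/**/*Test.java"]), ...)`, which is exactly what produces the one-target-per-class explosion discussed below.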

Unfortunately, each of these comes with a negative side effect with respect to performance:

  1. Using a macro wrapper to turn the glob of source files into distinct java_test targets is documented to have a significant performance impact, because a worker has to be created and torn down for each target. To give a rough idea of the impact, here are some numbers taken from a Bazel project with ~5000 unit tests, first using a @Suite:
$ bazel clean
$ time bazel test //... --build_tests_only --test_lang_filters=java --test_size_filters=small
...
Executed 498 out of 498 tests: 498 tests pass.
...
bazel test //... --build_tests_only --test_lang_filters=java   2.58s user 1.55s system 0% cpu 16:41.63 total

... vs using a macro wrapper + aggregate test_suite:

$ bazel clean
$ time bazel test //... --build_tests_only --test_lang_filters=java  --test_size_filters=small
...
Executed 4691 out of 4691 tests: 4691 tests pass.
...
bazel test //... --build_tests_only --test_lang_filters=java   2.72s user 1.69s system 0% cpu 26:47.95 total

Both of these runs were operating over the same set of tests, but using a separate java_test for each individual class causes the build time to increase by ~60%.

  2. Generating a @Suite (either at compile time or dynamically via something like AllTests) clashes with --flaky_test_attempts: if any test case fails, the entire suite is treated as having failed, so all tests are run again. This can be somewhat mitigated by sharding, but there is a cap of 50 on the number of shards.
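
The suite-based arrangement (workaround 2) looks roughly like the following. The AllTests class name and the glob are illustrative assumptions; shard_count and flaky are real java_test attributes:

```starlark
# BUILD -- illustrative; assumes com.example.AllTests is a JUnit suite class
# (hand-written @Suite or generated via an AllTests-style mechanism).
java_test(
    name = "all_tests",
    test_class = "com.example.AllTests",
    srcs = glob(["src/test/java/**/*.java"]),
    # Sharding splits the suite across up to 50 processes (the cap noted
    # above), but --flaky_test_attempts still re-runs a whole failing shard.
    shard_count = 50,
)
```

This is the shape that keeps worker start-up costs low but makes retry granularity coarse, which is the trade-off described above.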

Option 1 is "okay" when there are a few long-running tests; option 2 is "okay" for many fast-running tests. It would be nice to have a "one-size-fits-all" solution.

There has been an open issue in the main Bazel repository for a few years now that has some overlap, but that one seems more focused on convenience than on performance. As the Java rules are being broken out of Bazel, I thought it might make sense to move the discussion over here.

When I came across the original issue, I did a very quick-and-hacky PoC of how the built-in Bazel test runner could be updated to support multiple classes. On revisiting it, however, I'm not sure it would actually solve the performance problem on its own: flaky test attempts appear to be handled outside the test runner process, so I suspect this solution would end up behaving the same way as a @Suite.

hvadehra (Member) commented

Thanks for the great writeup!

I also feel the single-test-class restriction is kind of clunky, and we should definitely fix bazelbuild/bazel#2539. But perhaps that would be best done once the rules are out of Bazel and in this repo.

re: flaky tests, I think the general guidance is always to fix flakiness :). Internally, what I see is a mix of re-running + ignoring (based on flakiness stats).

> Generating a @Suite (either at compile time or dynamically via something like AllTests) clashes with --flaky_test_attempts as, if any test case fails, the entire suite is detected as having failed and so all tests are run again. This can be somewhat mitigated by sharding but there's a cap of 50 on the number of shards.

A partial mitigation could be to extract all your known flaky tests into separate, individual targets, and use the AllTests mechanism for all the other, well-behaved tests. It's manual toil, but I'm not sure there's much better one could do: flaky tests are just inherently problematic.
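
Concretely, the split suggested above might look something like this. The class names are made up for illustration; flaky and glob's exclude parameter are real Bazel features:

```starlark
# BUILD -- illustrative split. Known-flaky tests get their own targets, so
# --flaky_test_attempts retries stay scoped to a single class:
java_test(
    name = "com.example.FlakyNetworkTest",
    test_class = "com.example.FlakyNetworkTest",
    srcs = ["src/test/java/com/example/FlakyNetworkTest.java"],
    flaky = True,  # Bazel retries just this target on failure
)

# Everything well-behaved stays in one AllTests-style suite target:
java_test(
    name = "stable_tests",
    test_class = "com.example.AllTests",
    srcs = glob(
        ["src/test/java/**/*.java"],
        exclude = ["src/test/java/com/example/FlakyNetworkTest.java"],
    ),
)
```

The cost is keeping the exclude list in sync with the flaky-test targets by hand, which is the "manual toil" acknowledged above.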

@hvadehra hvadehra added the P2 We'll consider to work on this in future. (Assignee optional) label May 17, 2024