Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[7.4.0] Make it possible to get a disk/remote cache hit for a test following a failure of the same test. #23673

Merged
merged 1 commit into from
Sep 19, 2024

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Sep 19, 2024

A source edit might have occurred in the interim, putting the test back into a passing state that has been previously stored in the cache. The current logic disables disk/remote caching when rerunning a test following a failure, which causes spurious reexecutions (see issue #11057).

Although this code has been around for a very long time, it seems that it was originally added to ensure that a flaky test failure doesn't become sticky. However, for this purpose it suffices to bust the local cache, as a disk/remote cache should never store failures (if it did, this would not be a complete fix, and we'd have much bigger problems elsewhere). Note that Bazel itself never uploads local failures to a disk/remote cache, but can't control whether a remote executor does.

The fix is to split the TestRunnerAction.canBeCached logic into two versions: one that takes into account a previous test result (for the local cache check) and another that doesn't (to determine whether we should attempt to hit the disk/remote cache).

Fixes #11057.

PiperOrigin-RevId: 661319835
Change-Id: I9248cbfa31dd135b2bda971e48bb17d7f828889c
(cherry picked from commit e9709b7)

Fixes #23259

@fmeum fmeum requested a review from a team as a code owner September 19, 2024 08:45
@fmeum fmeum changed the title Make it possible to get a disk/remote cache hit for a test following … [7.4.0] Make it possible to get a disk/remote cache hit for a test following … Sep 19, 2024
@github-actions github-actions bot added awaiting-review PR is awaiting review from an assigned reviewer team-Configurability platforms, toolchains, cquery, select(), config transitions labels Sep 19, 2024
…a failure of the same test.

A source edit might have occurred in the interim, putting the test back into a passing state that has been previously stored in the cache. The current logic disables disk/remote caching when rerunning a test following a failure, which causes spurious reexecutions (see issue bazelbuild#11057).

Although this code has been around for a very long time, it seems that it was originally added to ensure that a flaky test failure doesn't become sticky. However, for this purpose it suffices to bust the local cache, as a disk/remote cache should never store failures (if it did, this would not be a complete fix, and we'd have much bigger problems elsewhere). Note that Bazel itself never uploads local failures to a disk/remote cache, but can't control whether a remote executor does.

The fix is to split the `TestRunnerAction.canBeCached` logic into two versions: one that takes into account a previous test result (for the local cache check) and another that doesn't (to determine whether we should attempt to hit the disk/remote cache).

Fixes bazelbuild#11057.

PiperOrigin-RevId: 661319835
Change-Id: I9248cbfa31dd135b2bda971e48bb17d7f828889c
(cherry picked from commit e9709b7)
@iancha1992 iancha1992 added this pull request to the merge queue Sep 19, 2024
@tjgq tjgq changed the title [7.4.0] Make it possible to get a disk/remote cache hit for a test following … [7.4.0] Make it possible to get a disk/remote cache hit for a test following a failure of the same test. Sep 19, 2024
Merged via the queue into bazelbuild:release-7.4.0 with commit 56fedfd Sep 19, 2024
50 checks passed
@github-actions github-actions bot removed the awaiting-review PR is awaiting review from an assigned reviewer label Sep 19, 2024
@fmeum fmeum deleted the 23259-cherry branch September 23, 2024 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Configurability platforms, toolchains, cquery, select(), config transitions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants