Skip to content

Commit

Permalink
Add the experimental_use_remote_cache_for_cache_unaware_spawns flag
Browse files Browse the repository at this point in the history
    This new flag is similar in spirit to --remote_accept_cached but allows
    being more selective about what's accepted and what's not.

    The specific problem I face is the following: we have a setup where we
    want to use dynamic execution for performance reasons.  However, we know
    some actions in our build (those run by rules_foreign_cc) are not
    deterministic.  To mitigate this, we force the actions that we know are
    not deterministic to run remotely, without dynamic execution, as this
    will prevent exposing the non-determinism for as long as they are cached
    and until we can fix their problems.

    However, we still observe non-deterministic actions in the build and we
    need to diagnose what those are.  To do this, I need to run two builds
    and compare their execlogs.  And I need these builds to continue to
    reuse the non-deterministic artifacts we _already_ know about from the
    cache, but to rerun other local actions from scratch.

    Unfortunately, the fact that "remote-cache" is not a strategy (see
    bazelbuild#18245) makes this very
    difficult to do because, even if I configure certain actions to run
    locally unconditionally, the spawn strategy insists on checking the
    remote cache for them.

    With this new flag, I can run a build where the remote actions remain
    remote but where I disable the dynamic scheduler and force the remaining
    actions to re-run locally.  I'm marking the flag as experimental because
    this feels like a huge kludge to paper over the fact that the remote
    cache should really be a strategy, but isn't.  In other words: this
    flag should go away with a better rearchitecting of the remote caching
    interface.

Upstream PR: bazelbuild#18944
Author: Julio Merino <[email protected]>
Date:   Fri Jul 14 10:32:41 2023 -0700

Description

Testing
  • Loading branch information
sfc-gh-mgalindo committed Jun 21, 2024
1 parent fe3f60c commit 11c0a70
Show file tree
Hide file tree
Showing 15 changed files with 92 additions and 4 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -530,6 +530,21 @@ public boolean usingLocalTestJobs() {
+ " code 39.")
public int remoteRetryOnCacheEviction;

@Option(
name = "experimental_use_remote_cache_for_cache_unaware_spawns",
defaultValue = "true",
documentationCategory = OptionDocumentationCategory.REMOTE,
effectTags = {OptionEffectTag.UNKNOWN},
help =
"If set to true (the default), allow spawns other than \"remote\" to use remote caching. "
+ "Setting this to false is useful when troubleshooting non-determinism cases in a "
+ "\"mixed\" configuration where certain spawns are forced to run remotely (for "
+ "example, to hide their non-determinism) and others are allowed to run via dynamic "
+ "execution. In such cases, we must allow the remote spawns to leverage the remote "
+ "cache, but not those that are configured to run locally. This flag is similar in "
+ "purpose to --remote_accept_cached.")
public boolean useRemoteCacheForCacheUnawareSpawns;

/** An enum for specifying different formats of test output. */
public enum TestOutputFormat {
SUMMARY, // Provide summary output only.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ java_library(
"//src/main/java/com/google/devtools/build/lib/actions:resource_manager",
"//src/main/java/com/google/devtools/build/lib/concurrent",
"//src/main/java/com/google/devtools/build/lib/exec:bin_tools",
"//src/main/java/com/google/devtools/build/lib/exec:execution_options",
"//src/main/java/com/google/devtools/build/lib/exec:runfiles_tree_updater",
"//src/main/java/com/google/devtools/build/lib/exec:spawn_runner",
"//src/main/java/com/google/devtools/build/lib/profiler",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@
import com.google.devtools.build.lib.actions.cache.VirtualActionInput;
import com.google.devtools.build.lib.concurrent.ThreadSafety.ThreadSafe;
import com.google.devtools.build.lib.exec.BinTools;
import com.google.devtools.build.lib.exec.ExecutionOptions;
import com.google.devtools.build.lib.exec.RunfilesTreeUpdater;
import com.google.devtools.build.lib.exec.SpawnExecutingEvent;
import com.google.devtools.build.lib.exec.SpawnRunner;
Expand Down Expand Up @@ -100,9 +101,11 @@ public class LocalSpawnRunner implements SpawnRunner {
private final BinTools binTools;

private final RunfilesTreeUpdater runfilesTreeUpdater;
private final boolean handlesCaching;

public LocalSpawnRunner(
Path execRoot,
ExecutionOptions executionOptions,
LocalExecutionOptions localExecutionOptions,
ResourceManager resourceManager,
LocalEnvProvider localEnvProvider,
Expand All @@ -117,6 +120,7 @@ public LocalSpawnRunner(
this.localEnvProvider = localEnvProvider;
this.binTools = binTools;
this.runfilesTreeUpdater = runfilesTreeUpdater;
this.handlesCaching = !executionOptions.useRemoteCacheForCacheUnawareSpawns;
}

@Override
Expand Down Expand Up @@ -182,7 +186,7 @@ public boolean canExec(Spawn spawn) {

@Override
public boolean handlesCaching() {
return false;
return handlesCaching;
}

protected Path createActionTemp(Path execRoot) throws IOException {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,7 @@ abstract class AbstractSandboxSpawnRunner implements SpawnRunner {
private final Path execRoot;
private final ResourceManager resourceManager;
private final Reporter reporter;
private final boolean handlesCaching;

public AbstractSandboxSpawnRunner(CommandEnvironment cmdEnv) {
this.sandboxOptions = cmdEnv.getOptions().getOptions(SandboxOptions.class);
Expand All @@ -88,6 +89,7 @@ public AbstractSandboxSpawnRunner(CommandEnvironment cmdEnv) {
this.execRoot = cmdEnv.getExecRoot();
this.resourceManager = cmdEnv.getLocalResourceManager();
this.reporter = cmdEnv.getReporter();
this.handlesCaching = !cmdEnv.getOptions().getOptions(ExecutionOptions.class).useRemoteCacheForCacheUnawareSpawns;
}

@Override
Expand Down Expand Up @@ -132,7 +134,7 @@ public boolean canExec(Spawn spawn) {

@Override
public boolean handlesCaching() {
return false;
return this.handlesCaching;
}

protected abstract SandboxedSpawn prepareSpawn(Spawn spawn, SpawnExecutionContext context)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -420,6 +420,7 @@ private static SandboxFallbackSpawnRunner withFallback(
return new SandboxFallbackSpawnRunner(
sandboxSpawnRunner,
createFallbackRunner(env),
env.getOptions().getOptions(ExecutionOptions.class),
env.getReporter(),
sandboxOptions != null && sandboxOptions.legacyLocalFallback);
}
Expand All @@ -429,6 +430,7 @@ private static SpawnRunner createFallbackRunner(CommandEnvironment env) {
env.getOptions().getOptions(LocalExecutionOptions.class);
return new LocalSpawnRunner(
env.getExecRoot(),
env.getOptions().getOptions(ExecutionOptions.class),
localExecutionOptions,
env.getLocalResourceManager(),
LocalEnvProvider.forCurrentOs(env.getClientEnv()),
Expand All @@ -450,16 +452,19 @@ private static final class SandboxFallbackSpawnRunner implements SpawnRunner {
private final ExtendedEventHandler reporter;
private static final AtomicBoolean warningEmitted = new AtomicBoolean();
private final boolean fallbackAllowed;
private final boolean handlesCaching;

SandboxFallbackSpawnRunner(
SpawnRunner sandboxSpawnRunner,
SpawnRunner fallbackSpawnRunner,
ExecutionOptions executionOptions,
ExtendedEventHandler reporter,
boolean fallbackAllowed) {
this.sandboxSpawnRunner = sandboxSpawnRunner;
this.fallbackSpawnRunner = fallbackSpawnRunner;
this.reporter = reporter;
this.fallbackAllowed = fallbackAllowed;
this.handlesCaching = !executionOptions.useRemoteCacheForCacheUnawareSpawns;
}

@Override
Expand Down Expand Up @@ -505,7 +510,7 @@ public boolean canExecWithLegacyFallback(Spawn spawn) {

@Override
public boolean handlesCaching() {
return false;
return handlesCaching;
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@ public void registerSpawnStrategies(
SpawnRunner localSpawnRunner =
new LocalSpawnRunner(
env.getExecRoot(),
env.getOptions().getOptions(ExecutionOptions.class),
env.getOptions().getOptions(LocalExecutionOptions.class),
env.getLocalResourceManager(),
LocalEnvProvider.forCurrentOs(env.getClientEnv()),
Expand Down
1 change: 1 addition & 0 deletions src/main/java/com/google/devtools/build/lib/worker/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ java_library(
"//src/main/java/com/google/devtools/build/lib/clock",
"//src/main/java/com/google/devtools/build/lib/events",
"//src/main/java/com/google/devtools/build/lib/exec:bin_tools",
"//src/main/java/com/google/devtools/build/lib/exec:execution_options",
"//src/main/java/com/google/devtools/build/lib/exec:runfiles_tree_updater",
"//src/main/java/com/google/devtools/build/lib/exec:spawn_runner",
"//src/main/java/com/google/devtools/build/lib/exec/local",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,7 @@ public void registerSpawnStrategies(
env.getBlazeWorkspace().getBinTools(),
env.getLocalResourceManager(),
RunfilesTreeUpdater.forCommandEnvironment(env),
env.getOptions().getOptions(ExecutionOptions.class),
env.getOptions().getOptions(WorkerOptions.class),
WorkerMetricsCollector.instance(),
env.getClock());
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
import com.google.devtools.build.lib.clock.Clock;
import com.google.devtools.build.lib.events.ExtendedEventHandler;
import com.google.devtools.build.lib.exec.BinTools;
import com.google.devtools.build.lib.exec.ExecutionOptions;
import com.google.devtools.build.lib.exec.RunfilesTreeUpdater;
import com.google.devtools.build.lib.exec.SpawnExecutingEvent;
import com.google.devtools.build.lib.exec.SpawnRunner;
Expand Down Expand Up @@ -94,6 +95,7 @@ final class WorkerSpawnRunner implements SpawnRunner {
private final WorkerParser workerParser;
private final AtomicInteger requestIdCounter = new AtomicInteger(1);
private final WorkerMetricsCollector metricsCollector;
private final boolean handlesCaching;

public WorkerSpawnRunner(
SandboxHelpers helpers,
Expand All @@ -104,6 +106,7 @@ public WorkerSpawnRunner(
BinTools binTools,
ResourceManager resourceManager,
RunfilesTreeUpdater runfilesTreeUpdater,
ExecutionOptions executionOptions,
WorkerOptions workerOptions,
WorkerMetricsCollector workerMetricsCollector,
Clock clock) {
Expand All @@ -117,6 +120,7 @@ public WorkerSpawnRunner(
this.resourceManager.setWorkerPool(workers);
this.metricsCollector = workerMetricsCollector;
this.metricsCollector.setClock(clock);
this.handlesCaching = !executionOptions.useRemoteCacheForCacheUnawareSpawns;
}

@Override
Expand All @@ -143,7 +147,7 @@ public boolean canExec(Spawn spawn) {

@Override
public boolean handlesCaching() {
return false;
return handlesCaching;
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ java_test(
"//src/main/java/com/google/devtools/build/lib/actions:localhost_capacity",
"//src/main/java/com/google/devtools/build/lib/actions:resource_manager",
"//src/main/java/com/google/devtools/build/lib/exec:bin_tools",
"//src/main/java/com/google/devtools/build/lib/exec:execution_options",
"//src/main/java/com/google/devtools/build/lib/exec:runfiles_tree_updater",
"//src/main/java/com/google/devtools/build/lib/exec:spawn_runner",
"//src/main/java/com/google/devtools/build/lib/exec/local",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
import com.google.devtools.build.lib.actions.cache.VirtualActionInput;
import com.google.devtools.build.lib.actions.util.ActionsTestUtil;
import com.google.devtools.build.lib.exec.BinTools;
import com.google.devtools.build.lib.exec.ExecutionOptions;
import com.google.devtools.build.lib.exec.RunfilesTreeUpdater;
import com.google.devtools.build.lib.exec.SpawnExecutingEvent;
import com.google.devtools.build.lib.exec.SpawnSchedulingEvent;
Expand Down Expand Up @@ -107,6 +108,7 @@ public TestedLocalSpawnRunner(
LocalEnvProvider localEnvProvider) {
super(
execRoot,
Options.getDefaults(ExecutionOptions.class),
localExecutionOptions,
resourceManager,
localEnvProvider,
Expand Down Expand Up @@ -621,6 +623,7 @@ public void interruptWaitsForProcessExit() throws Exception {
LocalSpawnRunner runner =
new LocalSpawnRunner(
tempDir,
Options.getDefaults(ExecutionOptions.class),
Options.getDefaults(LocalExecutionOptions.class),
resourceManager,
LocalEnvProvider.forCurrentOs(ImmutableMap.of()),
Expand Down Expand Up @@ -858,6 +861,7 @@ public void hasExecutionStatistics() throws Exception {

FileSystem fs = new UnixFileSystem(DigestHashFunction.SHA256, /*hashAttributeName=*/ "");

ExecutionOptions executionOptions = Options.getDefaults(ExecutionOptions.class);
LocalExecutionOptions options = Options.getDefaults(LocalExecutionOptions.class);

int minimumWallTimeToSpendInMs = 10 * 1000;
Expand All @@ -883,6 +887,7 @@ public void hasExecutionStatistics() throws Exception {
LocalSpawnRunner runner =
new LocalSpawnRunner(
execRoot,
executionOptions,
options,
resourceManager,
LocalSpawnRunnerTest::keepLocalEnvUnchanged,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,7 @@ public final void setUp() throws Exception {
OptionsParser optionsParser =
OptionsParser.builder().optionsClasses(ExecutionOptions.class).build();
optionsParser.parse("--verbose_failures");
ExecutionOptions executionOptions = Options.getDefaults(ExecutionOptions.class);
LocalExecutionOptions localExecutionOptions = Options.getDefaults(LocalExecutionOptions.class);

ResourceManager resourceManager = ResourceManager.instanceForTestingOnly();
Expand All @@ -150,6 +151,7 @@ public final void setUp() throws Exception {
execRoot,
new LocalSpawnRunner(
execRoot,
executionOptions,
localExecutionOptions,
resourceManager,
(env, binTools1, fallbackTmpDir) -> ImmutableMap.copyOf(env),
Expand Down
1 change: 1 addition & 0 deletions src/test/java/com/google/devtools/build/lib/worker/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,7 @@ java_library(
"//src/main/java/com/google/devtools/build/lib/collect/nestedset",
"//src/main/java/com/google/devtools/build/lib/events",
"//src/main/java/com/google/devtools/build/lib/exec:bin_tools",
"//src/main/java/com/google/devtools/build/lib/exec:execution_options",
"//src/main/java/com/google/devtools/build/lib/exec:spawn_runner",
"//src/main/java/com/google/devtools/build/lib/exec/local",
"//src/main/java/com/google/devtools/build/lib/metrics:ps_info_collector",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@
import com.google.devtools.build.lib.collect.nestedset.NestedSetBuilder;
import com.google.devtools.build.lib.collect.nestedset.Order;
import com.google.devtools.build.lib.events.ExtendedEventHandler;
import com.google.devtools.build.lib.exec.ExecutionOptions;
import com.google.devtools.build.lib.exec.SpawnExecutingEvent;
import com.google.devtools.build.lib.exec.SpawnRunner.SpawnExecutionContext;
import com.google.devtools.build.lib.exec.local.LocalEnvProvider;
Expand Down Expand Up @@ -174,6 +175,7 @@ public void testExecInWorker_virtualInputs_doesntQueryInputFileCache()
/* binTools= */ null,
resourceManager,
/* runfilesTreeUpdater= */ null,
new ExecutionOptions(),
new WorkerOptions(),
metricsCollector,
new JavaClock());
Expand Down Expand Up @@ -568,6 +570,7 @@ private WorkerSpawnRunner createWorkerSpawnRunner(WorkerOptions workerOptions) {
/* binTools= */ null,
resourceManager,
/* runfilesTreeUpdater= */ null,
new ExecutionOptions(),
workerOptions,
metricsCollector,
new JavaClock());
Expand Down
42 changes: 42 additions & 0 deletions src/test/shell/bazel/remote/remote_execution_test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -1084,6 +1084,48 @@ EOF
expect_not_log "remote cache hit"
}

function test_no_use_remote_cache_for_cache_unaware_spawns() {
mkdir -p a
cat > a/BUILD <<'EOF'
genrule(
name = "foo",
srcs = [],
outs = ["foo.txt"],
cmd = "echo \"$$RANDOM\" > \"$@\"",
)
EOF

# Build the non-deterministic target to inject its output into the cache.
bazel build \
--remote_executor=grpc://localhost:${worker_port} \
//a:foo >& $TEST_log || fail "Failed to build //a:foo"
cp bazel-bin/a/foo.txt before.txt

# Do a second build with the default value of the
# --experimental_use_remote_cache_for_cache_unaware_spawns flag, which we
# expect it to be true (so the build should reuse the cached result and not
# be subject to non-determinism).
bazel clean
bazel build \
--remote_executor=grpc://localhost:${worker_port} \
--spawn_strategy=sandboxed,local \
//a:foo >& $TEST_log || fail "Failed to build //a:foo"
diff -u before.txt bazel-bin/a/foo.txt \
|| fail "Non-deterministic action should have been reused from the cache"

# Do a third build now, disallowing the use of the remote cache and thus
# observing the non-determinism.
bazel clean
bazel build \
--remote_executor=grpc://localhost:${worker_port} \
--experimental_use_remote_cache_for_cache_unaware_spawns=false \
--spawn_strategy=sandboxed,local \
//a:foo >& $TEST_log || fail "Failed to build //a:foo"
if diff -u before.txt bazel-bin/a/foo.txt; then
fail "Non-deterministic action should have been reexecuted, not cached"
fi
}

function test_tag_no_cache() {
mkdir -p a
cat > a/BUILD <<'EOF'
Expand Down

0 comments on commit 11c0a70

Please sign in to comment.