CacheNotFoundException when resuming build after remote cache evicted objects #19348

Closed
sluongng opened this issue Aug 28, 2023 · 15 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

@sluongng
Contributor

Description of the bug:

Our remote cache evicts old objects in an LRU fashion.

This means that if you have not built for a while, especially if your config is unique, it's likely that your cache object will get evicted.
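As a rough illustration of the eviction behaviour described above, here is a minimal sketch of an LRU blob store. The class name, capacity, and digests are made up for the example and do not reflect how any real remote cache is implemented.

```python
from collections import OrderedDict


class LruBlobStore:
    """Toy content-addressable store that evicts the least recently used blob."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._blobs = OrderedDict()  # digest -> bytes, least recently used first

    def put(self, digest, data):
        self._blobs[digest] = data
        self._blobs.move_to_end(digest)
        while len(self._blobs) > self.capacity:
            self._blobs.popitem(last=False)  # drop the least recently used entry

    def get(self, digest):
        if digest not in self._blobs:
            raise KeyError(f"Missing digest: {digest}")
        self._blobs.move_to_end(digest)  # a read counts as a use
        return self._blobs[digest]


store = LruBlobStore(capacity=2)
store.put("aaa/1", b"a")  # output of a build from before the break (hypothetical)
store.put("bbb/1", b"b")
store.put("ccc/1", b"c")  # newer uploads push "aaa/1" out of the store
```

A client that still remembers "aaa/1" as present would now get a miss when it asks for that blob.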

Today I resumed my M1 Macbook laptop after a weekend break and got a stack trace like this after my first build

com.google.devtools.build.lib.remote.common.BulkTransferException: 3 errors during bulk transfer:
com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: d0387e622e30ab61e39b1b91e54ea50f9915789dde7b950fafb0863db4a32ef8/17096
com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 9718647251c8d479142d459416079ff5cd9f45031a47aa346d8a6e719e374ffa/28630
com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 785e0ead607a37bd9a12179051e6efe53d7fb3eb05cc291e49ad6965ee2b613d/11504
        at com.google.devtools.build.lib.remote.util.RxUtils$BulkTransferExceptionCollector.onResult(RxUtils.java:91)
        ...
        at com.google.devtools.build.lib.remote.RemoteExecutionCache$1.onError(RemoteExecutionCache.java:232)
        ...
        at com.google.devtools.build.lib.remote.util.AsyncTaskCache$1.onError(AsyncTaskCache.java:340)
        at com.google.devtools.build.lib.remote.util.AsyncTaskCache$Execution.onError(AsyncTaskCache.java:206)
        ...
        at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102)
        ...
        at com.google.devtools.build.lib.remote.util.RxFutures$2.onError(RxFutures.java:257)
        ...
        at com.google.devtools.build.lib.remote.util.RxFutures$OnceSingleOnSubscribe$1.onFailure(RxFutures.java:172)
        ...
        at com.google.devtools.build.lib.remote.ByteStreamUploader$Writer.seekChunker(ByteStreamUploader.java:489)
        at com.google.devtools.build.lib.remote.ByteStreamUploader$Writer.run(ByteStreamUploader.java:442)
        ...
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
        Suppressed: com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: d0387e622e30ab61e39b1b91e54ea50f9915789dde7b950fafb0863db4a32ef8/17096
                at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onError(GrpcCacheClient.java:420)
                ...
                at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
                ...
                ... 5 more
        Suppressed: com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 9718647251c8d479142d459416079ff5cd9f45031a47aa346d8a6e719e374ffa/28630
                at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onError(GrpcCacheClient.java:420)
                ...
                at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
                ...
                ... 5 more
        Suppressed: com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 785e0ead607a37bd9a12179051e6efe53d7fb3eb05cc291e49ad6965ee2b613d/11504
                at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onError(GrpcCacheClient.java:420)
                ...
                at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
                ...
                ... 5 more

This was not fixed after several retries, but it seemed to be fixed after I went for lunch and came back to the laptop (no changes made). I assume this was caused by the idle shutdown of the Bazel JVM.

I think the correct expectation here is for Bazel to tell the remote cache / remote executor to re-run the action, but it seems like there could be edge cases that are not being handled properly.

Which category does this issue belong to?

Remote Execution

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Not sure just yet.

We have "build without bytes" turned on with GRPC cache (no disk cache) and remote execution enabled.

Which operating system are you running Bazel on?

MacOS 13.5.1 darwin64

What is the output of bazel info release?

release 6.3.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

Irrelevant, as the issue goes away if the JVM restarts.

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

The problem is not exclusive to MacOS environment, however.

We have seen similar reports from our customers, who set up their Linux CI against our remote cache as well.
Typically, it would be something like: do first build, wait several weeks, remote cache evicted, do second build -> similar failure.

Because of this, we have not been able to reproduce the situation reliably.

@iancha1992 iancha1992 added the team-Remote-Exec Issues and PRs for the Execution (Remote) team label Aug 28, 2023
@oquenchil oquenchil added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Aug 29, 2023
@coeuvre
Member

coeuvre commented Sep 6, 2023

I agree the correct expectation is for Bazel to rerun the action. You mentioned several retries; did they all fail with the same missing digests?

@sluongng
Contributor Author

sluongng commented Sep 6, 2023

Yup they failed with the same missing digests.
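For anyone who wants to check this on their own logs, a small sketch of comparing the missing digests across two attempts. The first log excerpt reuses lines from the trace above; the retry excerpt is invented for illustration, and the regular expression assumes the `Missing digest: <hash>/<size>` format shown in this issue.

```python
import re

# Matches the "Missing digest: <hash>/<size>" fragments seen in the traces above.
DIGEST_RE = re.compile(r"Missing digest: ([0-9a-f]+)/(\d+)")


def missing_digests(log_text):
    """Return the set of (hash, size_bytes) pairs reported missing in a log."""
    return {(h, int(size)) for h, size in DIGEST_RE.findall(log_text)}


first_attempt = """
CacheNotFoundException: Missing digest: d0387e622e30ab61e39b1b91e54ea50f9915789dde7b950fafb0863db4a32ef8/17096
CacheNotFoundException: Missing digest: 9718647251c8d479142d459416079ff5cd9f45031a47aa346d8a6e719e374ffa/28630
"""

# Hypothetical retry log, made up for this example.
retry = """
CacheNotFoundException: Missing digest: d0387e622e30ab61e39b1b91e54ea50f9915789dde7b950fafb0863db4a32ef8/17096
"""

# Digests that were missing in both attempts.
still_missing = missing_digests(first_attempt) & missing_digests(retry)
```

If `still_missing` is non-empty, the retries are hitting the same blobs rather than a fresh eviction each time.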

@sluongng
Contributor Author

ERROR: /Users/sluongng/work/buildbuddy/buildbuddy/proto/BUILD:86:14: Generating Descriptor Set proto_library //proto:config_proto failed: (Exit 34): Missing digest: 80b9e5491f9626ee26828116d5e016689dafd368783ecadcb939456ba3d25cc5/5798416 for bazel-out/platform_linux-opt-exec-34F00540-ST-094ddd67efaf/bin/external/com_google_protobuf/protoc
com.google.devtools.build.lib.remote.common.BulkTransferException: Missing digest: 80b9e5491f9626ee26828116d5e016689dafd368783ecadcb939456ba3d25cc5/5798416 for bazel-out/platform_linux-opt-exec-34F00540-ST-094ddd67efaf/bin/external/com_google_protobuf/protoc
        at com.google.devtools.build.lib.remote.util.RxUtils$BulkTransferExceptionCollector.onResult(RxUtils.java:91)
        ...
        at com.google.devtools.build.lib.remote.RemoteExecutionCache$1.onError(RemoteExecutionCache.java:232)
        ...
        at com.google.devtools.build.lib.remote.util.AsyncTaskCache$1.onError(AsyncTaskCache.java:340)
        at com.google.devtools.build.lib.remote.util.AsyncTaskCache$Execution.onError(AsyncTaskCache.java:206)
        ...
        at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102)
        ...
        at com.google.devtools.build.lib.remote.util.RxFutures$2.onError(RxFutures.java:257)
        ...
        at com.google.devtools.build.lib.remote.util.RxFutures$OnceSingleOnSubscribe$1.onFailure(RxFutures.java:172)
        ...
        at com.google.devtools.build.lib.remote.ByteStreamUploader$Writer.seekChunker(ByteStreamUploader.java:509)
        at com.google.devtools.build.lib.remote.ByteStreamUploader$Writer.run(ByteStreamUploader.java:462)
        ...
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
        Suppressed: com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 80b9e5491f9626ee26828116d5e016689dafd368783ecadcb939456ba3d25cc5/5798416 for bazel-out/platform_linux-opt-exec-34F00540-ST-094ddd67efaf/bin/external/com_google_protobuf/protoc
                at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onError(GrpcCacheClient.java:436)
                ...
                at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
                ...
                ... 5 more

Facing this issue again today on Bazel 6.4.0rc1.
The exception is BulkTransferException this time and the stack trace is slightly different.

Issue goes away on immediate retry 🤔

@coeuvre
Member

coeuvre commented Oct 11, 2023

Were there any automatic retries?

@sluongng
Contributor Author

No auto retry for me.

@coeuvre
Member

coeuvre commented Oct 11, 2023

It looks like in this corner case Bazel wasn't able to detect the cache eviction error and retry; the problem is not that Bazel failed to clear the stale state.

ByteStreamUploader looks suspicious in the stack trace. It seems the scenario was that Bazel was trying to upload an input to the CAS for remote execution (because it had been evicted).
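A simplified model of that scenario, as a sketch only: the function, exception class, and data shapes below are hypothetical and are not Bazel's actual code. It assumes that with "build without bytes" an intermediate output may exist only in the remote cache, so once it is evicted there is no local copy to re-upload.

```python
class CacheNotFound(Exception):
    """Stand-in for Bazel's CacheNotFoundException (hypothetical Python model)."""


def upload_missing_inputs(inputs, remote_cas, local_files):
    """Upload every input the remote CAS lacks.

    inputs:      dict of path -> digest required by the action
    remote_cas:  set of digests currently stored remotely
    local_files: dict of path -> bytes for files that exist on local disk
    """
    missing = {p: d for p, d in inputs.items() if d not in remote_cas}
    for path, digest in missing.items():
        if path not in local_files:
            # The output was never downloaded, so there is nothing local to
            # upload once the remote copy has been evicted.
            raise CacheNotFound(f"Missing digest: {digest} for {path}")
        remote_cas.add(digest)  # pretend the upload succeeded


# Usage with made-up paths and digests: the file exists locally, so it is uploaded.
remote = {"def/2"}
upload_missing_inputs({"src/a.go": "abc/1"}, remote, {"src/a.go": b"package a"})
```

Calling the same function with an empty `local_files` raises `CacheNotFound`, which mirrors the shape of the error message in the traces above.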

@dieortin

I'm running into the same problem with Bazel 6.4.0.

@iancha1992
Member

cc: @coeuvre

@tjgq
Contributor

tjgq commented Aug 29, 2024

This should have been fixed by eda0fe4. If not, please reopen.

@tjgq tjgq closed this as completed Aug 29, 2024
@andyliuliming

It looks like the issue still exists in version 7.3.1.

@coeuvre
Member

coeuvre commented Oct 10, 2024

There is another underlying issue, which should be fixed by 9187a7e; that fix is included in the upcoming 7.4.0 release.

@andyliuliming

I tried version 8.0.0rc1, and it looks like the issue is still there.

The exception looks like this:

GoCompilePkg external/gazelle++go_deps+org_golang_x_sys/unix/unix.a; 0s remote, remote-cache ... (4 actions running)
ERROR: /home/andliu/.aksbuilder/bazeloutput/baf01025d79087845bcf1f8c4dead19c/external/gazelle++go_deps+org_golang_x_net/idna/BUILD.bazel:3:11: GoCompilePkg external/gazelle++go_deps+org_golang_x_net/idna/idna.a failed: (Exit 34): Missing digest: ec1283321f22f4db3cc8a68771c03b77600a360d92c5b5c8280f8d0ddbc3e89e/5692 for /home/andliu/.aksbuilder/bazeloutput/baf01025d79087845bcf1f8c4dead19c/execroot/_main/bazel-out/k8-fastbuild-ST-6d4313973ce7/bin/external/gazelle++go_deps+org_golang_x_text/secure/bidirule/bidirule.x
com.google.devtools.build.lib.remote.common.BulkTransferException: Missing digest: ec1283321f22f4db3cc8a68771c03b77600a360d92c5b5c8280f8d0ddbc3e89e/5692 for /home/andliu/.aksbuilder/bazeloutput/baf01025d79087845bcf1f8c4dead19c/execroot/_main/bazel-out/k8-fastbuild-ST-6d4313973ce7/bin/external/gazelle++go_deps+org_golang_x_text/secure/bidirule/bidirule.x
at com.google.devtools.build.lib.remote.util.RxUtils$BulkTransferExceptionCollector.onResult(RxUtils.java:112)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableCollectSingle$CollectSubscriber.onNext(FlowableCollectSingle.java:94)
at io.reactivex.rxjava3.internal.operators.single.SingleFlatMapPublisher$SingleFlatMapPublisherObserver.onNext(SingleFlatMapPublisher.java:107)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableUsing$UsingSubscriber.onNext(FlowableUsing.java:104)
at io.reactivex.rxjava3.internal.operators.single.SingleFlatMapPublisher$SingleFlatMapPublisherObserver.onNext(SingleFlatMapPublisher.java:107)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableUsing$UsingSubscriber.onNext(FlowableUsing.java:104)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.innerSuccess(FlowableFlatMapSingle.java:173)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber$InnerObserver.onSuccess(FlowableFlatMapSingle.java:342)
at io.reactivex.rxjava3.internal.observers.ResumeSingleObserver.onSuccess(ResumeSingleObserver.java:46)
at io.reactivex.rxjava3.internal.operators.single.SingleJust.subscribeActual(SingleJust.java:30)
at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
at io.reactivex.rxjava3.internal.operators.single.SingleResumeNext$ResumeMainSingleObserver.onError(SingleResumeNext.java:80)
at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle$ToSingle.onError(CompletableToSingle.java:73)
at io.reactivex.rxjava3.internal.operators.completable.CompletableFromObservable$CompletableFromObservableObserver.onError(CompletableFromObservable.java:51)
at io.reactivex.rxjava3.subjects.AsyncSubject.subscribeActual(AsyncSubject.java:229)
at io.reactivex.rxjava3.core.Observable.subscribe(Observable.java:13176)
at io.reactivex.rxjava3.internal.operators.completable.CompletableFromObservable.subscribeActual(CompletableFromObservable.java:29)
at io.reactivex.rxjava3.core.Completable.subscribe(Completable.java:2859)
at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle.subscribeActual(CompletableToSingle.java:37)
at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
at io.reactivex.rxjava3.internal.operators.single.SingleResumeNext.subscribeActual(SingleResumeNext.java:39)
at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.onNext(FlowableFlatMapSingle.java:131)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable$IteratorSubscription.fastPath(FlowableFromIterable.java:185)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable$BaseRangeSubscription.request(FlowableFromIterable.java:129)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.onSubscribe(FlowableFlatMapSingle.java:106)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable.subscribe(FlowableFromIterable.java:69)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable.subscribeActual(FlowableFromIterable.java:47)
at io.reactivex.rxjava3.core.Flowable.subscribe(Flowable.java:15917)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle.subscribeActual(FlowableFlatMapSingle.java:53)
at io.reactivex.rxjava3.core.Flowable.subscribe(Flowable.java:15917)
at io.reactivex.rxjava3.core.Flowable.subscribe(Flowable.java:15863)
at io.reactivex.rxjava3.internal.operators.flowable.FlowableUsing.subscribeActual(FlowableUsing.java:73)
at io.reactivex.rxjava3.core.Flowable.subscribe(Flowable.java:15917)
at io.reactivex.rxjava3.core.Flowable.subscribe(Flowable.java:15863)
at io.reactivex.rxjava3.internal.operators.single.SingleFlatMapPublisher$SingleFlatMapPublisherObserver.onSuccess(SingleFlatMapPublisher.java:96)
at io.reactivex.rxjava3.internal.operators.single.SingleUsing$UsingSingleObserver.onSuccess(SingleUsing.java:154)
at io.reactivex.rxjava3.internal.operators.observable.ObservableSingleSingle$SingleElementObserver.onComplete(ObservableSingleSingle.java:110)
at io.reactivex.rxjava3.internal.observers.DeferredScalarDisposable.complete(DeferredScalarDisposable.java:85)
at io.reactivex.rxjava3.subjects.AsyncSubject.onComplete(AsyncSubject.java:189)
at io.reactivex.rxjava3.internal.observers.DeferredScalarDisposable.complete(DeferredScalarDisposable.java:85)
at io.reactivex.rxjava3.internal.operators.single.SingleToObservable$SingleToObservableObserver.onSuccess(SingleToObservable.java:73)
at io.reactivex.rxjava3.internal.operators.single.SingleMap$MapSingleObserver.onSuccess(SingleMap.java:65)
at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onSuccess(SingleCreate.java:68)
at com.google.devtools.build.lib.remote.util.RxFutures$OnceSingleOnSubscribe$1.onSuccess(RxFutures.java:155)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1137)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:121)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
at com.google.common.util.concurrent.CombinedFuture$CallableInterruptibleTask.setValue(CombinedFuture.java:202)
at com.google.common.util.concurrent.CombinedFuture$CombinedFutureInterruptibleTask.afterRanInterruptiblySuccess(CombinedFuture.java:130)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:89)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.CombinedFuture$CombinedFutureInterruptibleTask.execute(CombinedFuture.java:109)
at com.google.common.util.concurrent.CombinedFuture.handleAllCompleted(CombinedFuture.java:66)
at com.google.common.util.concurrent.AggregateFuture.processCompleted(AggregateFuture.java:302)
at com.google.common.util.concurrent.AggregateFuture.decrementCountAndMaybeComplete(AggregateFuture.java:284)
at com.google.common.util.concurrent.AggregateFuture.lambda$init$0(AggregateFuture.java:158)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:121)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:121)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:852)
at com.google.common.util.concurrent.AbstractTransformFuture$AsyncTransformFuture.setResult(AbstractTransformFuture.java:236)
at com.google.common.util.concurrent.AbstractTransformFuture$AsyncTransformFuture.setResult(AbstractTransformFuture.java:212)
at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:171)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:49)
at com.google.devtools.build.lib.remote.util.RxFutures$2.onSuccess(RxFutures.java:251)
at io.reactivex.rxjava3.internal.operators.single.SingleFlatMap$SingleFlatMapCallback$FlatMapSingleObserver.onSuccess(SingleFlatMap.java:112)
at io.reactivex.rxjava3.internal.operators.single.SingleUsing$UsingSingleObserver.onSuccess(SingleUsing.java:154)
at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onSuccess(SingleCreate.java:68)
at com.google.devtools.build.lib.remote.util.RxFutures$OnceSingleOnSubscribe$1.onSuccess(RxFutures.java:155)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1137)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
at io.grpc.stub.ClientCalls$GrpcFuture.set(ClientCalls.java:563)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:536)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.google.devtools.build.lib.remote.logging.LoggingInterceptor$LoggingForwardingCall$1.onClose(LoggingInterceptor.java:157)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:564)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:729)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:710)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Suppressed: com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: ec1283321f22f4db3cc8a68771c03b77600a360d92c5b5c8280f8d0ddbc3e89e/5692 for /home/andliu/.aksbuilder/bazeloutput/baf01025d79087845bcf1f8c4dead19c/execroot/_main/bazel-out/k8-fastbuild-ST-6d4313973ce7/bin/external/gazelle++go_deps+org_golang_x_text/secure/bidirule/bidirule.x
at com.google.devtools.build.lib.remote.RemoteExecutionCache.uploadBlob(RemoteExecutionCache.java:199)
at com.google.devtools.build.lib.remote.RemoteExecutionCache.lambda$maybeCreateUploadTask$8(RemoteExecutionCache.java:275)
at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe.subscribe(RxFutures.java:79)
at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate.subscribeActual(CompletableCreate.java:40)
at io.reactivex.rxjava3.core.Completable.subscribe(Completable.java:2859)
at io.reactivex.rxjava3.internal.operators.single.SingleFlatMapCompletable$FlatMapCompletableObserver.onSuccess(SingleFlatMapCompletable.java:91)
at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onSuccess(SingleCreate.java:68)
at com.google.devtools.build.lib.remote.RemoteExecutionCache.lambda$findMissingBlobs$16(RemoteExecutionCache.java:331)
at io.reactivex.rxjava3.internal.operators.single.SingleMap$MapSingleObserver.onSuccess(SingleMap.java:58)
... 71 more

I'm not sure what the exception means. Does it mean that Bazel is trying to upload the artifact to the cache from the local machine, but the artifact does not exist locally?

@coeuvre
Member

coeuvre commented Oct 10, 2024

The exception is thrown by https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/RemoteExecutionCache.java;l=199;drc=51bb7b820b80b437c2fc229f49904d3c57df2b4b which means the blob was probably evicted by the remote cache during the build.

How is your remote execution server setup? Did --experimental_remote_cache_eviction_retries help in this case?

@andyliuliming

The remote execution setup is bazel-buildfarm, with an nginx instance in front of it as a proxy to handle auth.

--experimental_remote_cache_eviction_retries does seem to help in this case, but Bazel still prints a lot of exception logs, which is a bit annoying.
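Conceptually, a retry-on-eviction option amounts to a loop like the one below. This is only a sketch of the general idea, not how Bazel implements `--experimental_remote_cache_eviction_retries`; the exception class, function names, and fake build are all hypothetical.

```python
class CacheEvicted(Exception):
    """Hypothetical stand-in for a build failing on an evicted blob."""


def build_with_eviction_retries(run_build, max_retries):
    """Run `run_build`, retrying up to `max_retries` times on CacheEvicted.

    Returns (result, attempts). Re-raises the last error once retries run out.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return run_build(), attempts
        except CacheEvicted:
            if attempts > max_retries:
                raise
            # A real client would also need to discard its stale knowledge of
            # which blobs the remote cache holds before trying again.


# Usage: a fake build that fails once on an evicted blob, then succeeds.
outcomes = iter([CacheEvicted("Missing digest: abc/1"), "ok"])


def fake_build():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result, attempts = build_with_eviction_retries(fake_build, max_retries=1)
```

Each failed attempt still surfaces its own error before the retry, which would be consistent with seeing many exception logs even when the build eventually succeeds.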

@coeuvre
Member

coeuvre commented Oct 14, 2024

In this case, consider giving your remote cache more storage space so that it doesn't evict blobs as often.


9 participants