Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download BazelRegistryJson only once per registry #19292

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,7 @@ java_library(
"//src/main/java/com/google/devtools/build/lib/events",
"//src/main/java/com/google/devtools/build/lib/util:os",
"//src/main/java/com/google/devtools/build/lib/vfs:pathfragment",
"//third_party:caffeine",
"//third_party:gson",
"//third_party:guava",
],
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ public class IndexRegistry implements Registry {
private final DownloadManager downloadManager;
private final Map<String, String> clientEnv;
private final Gson gson;
private volatile Optional<BazelRegistryJson> bazelRegistryJson;

public IndexRegistry(URI uri, DownloadManager downloadManager, Map<String, String> clientEnv) {
this.uri = uri;
Expand Down Expand Up @@ -141,9 +142,6 @@ private <T> Optional<T> grabJson(String url, Class<T> klass, ExtendedEventHandle
public RepoSpec getRepoSpec(
ModuleKey key, RepositoryName repoName, ExtendedEventHandler eventHandler)
throws IOException, InterruptedException {
Optional<BazelRegistryJson> bazelRegistryJson =
grabJson(
constructUrl(getUrl(), "bazel_registry.json"), BazelRegistryJson.class, eventHandler);
Optional<SourceJson> sourceJson =
grabJson(
constructUrl(
Expand All @@ -158,14 +156,32 @@ public RepoSpec getRepoSpec(
String type = sourceJson.get().type;
switch (type) {
case "archive":
return createArchiveRepoSpec(sourceJson, bazelRegistryJson, key, repoName);
return createArchiveRepoSpec(sourceJson, getBazelRegistryJson(eventHandler), key, repoName);
case "local_path":
return createLocalPathRepoSpec(sourceJson, bazelRegistryJson, key, repoName);
return createLocalPathRepoSpec(
sourceJson, getBazelRegistryJson(eventHandler), key, repoName);
default:
throw new IOException(String.format("Invalid source type for module %s", key));
}
}

@SuppressWarnings("OptionalAssignedToNull")
private Optional<BazelRegistryJson> getBazelRegistryJson(ExtendedEventHandler eventHandler)
throws IOException, InterruptedException {
if (bazelRegistryJson == null) {
synchronized (this) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the synchronization is necessary here? at worst we'll just fetch it again.

I'm usually wary of synchronized (this) because if someone else does synchronized (yourObject), you're suddenly potentially deadlocked.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, although every time this is fetched again results in a slower overall fetch. This currently doesn't matter since all get he's are sequential (see the profile screenshot in the other PR), but might become relevant when we change that. Not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on second thought, after looking at the profile graph you posted in #19291 (comment), it looks like we might be trying to fetch the registry json file with many threads at around the same time. So synchronizing might be faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I read that graph wrong. The MODULE.bazel files are fetched mostly concurrently, right? It's just all the "repo spec" fetches that are sequential. But why are there two "download file:" blocks in each MODULE.bazel fetch?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's for reading yanked info.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah the second download is for the metadata.json to fetch the yanked version. There's a bit of room for improvement there (though not much) since we'd be fetching the same metadata.json file for all versions of the same module.

The much bigger thing is the repo spec fetching, which used to be lazy until we introduced the lockfile. We should definitely parallelize those. Exactly how, I'm not sure yet (we could use the skyframe threads, or just create a separate thread pool maybe)

if (bazelRegistryJson == null) {
bazelRegistryJson =
grabJson(
constructUrl(getUrl(), "bazel_registry.json"),
BazelRegistryJson.class,
eventHandler);
}
}
}
return bazelRegistryJson;
}

private RepoSpec createLocalPathRepoSpec(
Optional<SourceJson> sourceJson,
Optional<BazelRegistryJson> bazelRegistryJson,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@

package com.google.devtools.build.lib.bazel.bzlmod;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.google.devtools.build.lib.bazel.repository.downloader.DownloadManager;
import java.net.URI;
import java.net.URISyntaxException;
Expand All @@ -25,6 +27,7 @@
public class RegistryFactoryImpl implements RegistryFactory {
private final DownloadManager downloadManager;
private final Supplier<Map<String, String>> clientEnvironmentSupplier;
private final Cache<String, Registry> registries = Caffeine.newBuilder().build();
fmeum marked this conversation as resolved.
Show resolved Hide resolved

public RegistryFactoryImpl(
DownloadManager downloadManager, Supplier<Map<String, String>> clientEnvironmentSupplier) {
Expand All @@ -43,7 +46,9 @@ public Registry getRegistryWithUrl(String url) throws URISyntaxException {
case "http":
case "https":
case "file":
return new IndexRegistry(uri, downloadManager, clientEnvironmentSupplier.get());
return registries.get(
url,
unused -> new IndexRegistry(uri, downloadManager, clientEnvironmentSupplier.get()));
default:
throw new URISyntaxException(uri.toString(), "Unrecognized registry URL protocol");
}
Expand Down
Loading