Huge amount of time spent "Writing repo mapping manifest" with remote execution #23927

Closed
keith opened this issue Oct 9, 2024 · 5 comments

Comments

@keith
Member

keith commented Oct 9, 2024

Description of the bug:

In one of our builds that tests a py_test target, the Chrome trace produced by Bazel showed "Writing repo mapping manifest" for that target in the Action.execute stage for 7 minutes.

For this target, the _repo_mapping file generated locally is 6304 lines long, mostly because of entries for PyPI dependencies such as:

rules_python~~pip~pip_deps_311_requests,pip_deps_311_regex,rules_python~~pip~pip_deps_311_regex

Only 49 entries are not of this type. There are "only" 78 transitive pip deps in the tree, so this specific test target is not enormous on our side.
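To see where the lines in a manifest come from, a minimal sketch like the one below can tally entries by the source repo of each line. It assumes every line has the three comma-separated fields seen in the example above (source canonical repo name, apparent name, target canonical repo name). The function name `tally_sources` and the `~`-based grouping are illustrative choices, not part of Bazel.

```python
from collections import Counter


def tally_sources(lines, depth=2):
    """Count repo mapping entries by a prefix of their source repo name.

    Each line is assumed to look like
    `source_canonical,apparent_name,target_canonical`.
    The key keeps the first `depth` non-empty `~`-separated segments of the
    source, so `rules_python~~pip~pip_deps_311_requests` is grouped under
    `rules_python~pip`. An empty source (the main repo) is grouped under
    the placeholder `<main>`.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Raises ValueError on a line with fewer than three fields.
        source, _apparent, _target = line.split(",", 2)
        segments = [s for s in source.split("~") if s]
        key = "~".join(segments[:depth]) or "<main>"
        counts[key] += 1
    return counts
```

Calling `tally_sources(open(path)).most_common()` on a local `_repo_mapping` file would list the largest contributors first.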

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

33a2025 (7.4 release branch pre-rc)

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@keith
Member Author

keith commented Oct 9, 2024

I was wrong about the time spent here

@keith keith closed this as completed Oct 9, 2024
@keith
Member Author

keith commented Oct 9, 2024

my original numbers were wrong here, but sounds like others have specifics

@keith keith reopened this Oct 9, 2024
@Wyverald
Member

Wyverald commented Oct 9, 2024

closing as this is not actionable right now; when the others with specifics post them here, we can reopen :P

@Wyverald Wyverald closed this as not planned Oct 9, 2024
@DavidZbarsky-at

Sorry, I confused myself twice. Here are the real stats:

A typical _repo_mapping for a JS test contains 4435 lines. Of those:

  • 629 are created by rules_rust
  • 96 by rules_jvm_external
  • 3565 by rules_js

I also noticed something that doesn't quite make sense to me, but perhaps it's related to what @fmeum was thinking about:

(env-18.16.0) david.zbarsky@JF6FQ9PXP9 hyperbase % cat /private/var/tmp/_bazel_david.zbarsky/5cb4fae43cf358e621837c0f9b3546d5/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/integration_tests/test_files/_support_panel/revoke_expired_access_to_customer_content_cron.test.tsx.repo_mapping | grep aws-in-a-box                    
,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~bcrypt,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~com_github_redis_redis,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~com_google_absl,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~envoy-1_23_1-macos13-arm64,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~fsevents,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~msgpackr-extract,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~mysql_macos14-arm64,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~nice-napi,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~node-lzo,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~node-re2,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~node-segfault-handler,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~node-unix-dgram,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~nodejs-addon-api,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~nodejs_nan,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~pdftron-pdfnet-node,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~pdftron-pdfnet-node-darwin-arm64,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~re2,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~sqlite4java,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~sysroot_darwin,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~v8-profiler-next,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~xenova-all-minilm-l6-v2,aws-in-a-box,aws-in-a-box~
_main~_repo_rules~xenova-transformers,aws-in-a-box,aws-in-a-box~
aws-in-a-box~,aws-in-a-box,aws-in-a-box~

What seems odd to me is that aws-in-a-box is a top-level dep for us, and I don't think any other module depends on it. So it appears that any external repo created through use_repo_rule gets visibility into everything else in the root module? (I'm not sure whether that's expected.) That could add a lot of extra entries for folks who use many http_archive/http_file repos, though we only have a handful, so it's not too bad for us.
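One way to quantify this kind of repetition is to group manifest lines by their (apparent name, target) pair and look at how many source repos carry the same mapping. The sketch below assumes the three-field line format visible in the grep output above; `sources_per_mapping` is a hypothetical helper name, not a Bazel API.

```python
from collections import defaultdict


def sources_per_mapping(lines):
    """Map each (apparent_name, target_canonical) pair to its source repos.

    Lines are assumed to look like
    `source_canonical,apparent_name,target_canonical`.
    """
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        source, apparent, target = line.split(",", 2)
        grouped[(apparent, target)].append(source)
    return dict(grouped)
```

Sorting the result by `len(sources)` would surface mappings such as `aws-in-a-box` that are repeated once per `_main~_repo_rules~...` repo.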

@fmeum
Collaborator

fmeum commented Oct 10, 2024

Yes, use_repo_rule behaves just like a module extension owned by the module that calls it, so each repo visible to the module is also visible to the use_repo_rule repo. It's exactly the same fundamental problem with _repo_mapping's current format.

We could fix this by adding a single line _main~_repo_rules~*,aws-in-a-box,aws-in-a-box~ instead. The * is not a valid character in canonical repo names, so a runfiles library could unambiguously understand this as a prefix rather than an exact match. We would then need an incompatible flag to stop generating the old entries.
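As a rough illustration of how a runfiles library might read such a line, here is a minimal sketch of a lookup that treats a trailing `*` in the source field as a prefix match. This is only a sketch of the idea proposed above, not existing Bazel or runfiles library behavior. The names `build_lookup` and `resolve`, and the choice to try exact entries before prefix entries, are assumptions made for the example.

```python
def build_lookup(lines):
    """Split manifest lines into exact entries and prefix (wildcard) entries.

    A source field ending in `*` is treated as a prefix, relying on `*`
    not being a valid character in canonical repo names.
    """
    exact = {}
    prefixes = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        source, apparent, target = line.split(",", 2)
        if source.endswith("*"):
            prefixes.append((source[:-1], apparent, target))
        else:
            exact[(source, apparent)] = target
    return exact, prefixes


def resolve(exact, prefixes, source, apparent):
    """Return the target repo for `apparent` as seen from `source`, or None."""
    target = exact.get((source, apparent))
    if target is not None:
        return target
    # Assumed precedence: fall back to prefix entries only after exact ones.
    for prefix, name, candidate in prefixes:
        if name == apparent and source.startswith(prefix):
            return candidate
    return None
```

With this shape, the single line `_main~_repo_rules~*,aws-in-a-box,aws-in-a-box~` would answer lookups from every `_main~_repo_rules~...` repo, at the cost of a linear scan over prefix entries.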

But since the repo mapping manifest in its current form should compress extremely well and is still not that large (probably single-digit MiBs?), this doesn't seem to be a pressing issue.
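The compressibility claim can be checked locally with a small sketch like the following. It uses `zlib` purely as a stand-in, since the thread does not say which compression applies, and the synthetic manifest uses made-up package names in an all-to-all pattern, so the numbers it produces are not measurements of any real build.

```python
import zlib


def synthetic_manifest(n_repos):
    """Build a made-up manifest in which every repo sees every repo."""
    # Hypothetical repo names, loosely shaped like the pip entries above.
    names = [f"pip_deps_311_pkg{i}" for i in range(n_repos)]
    lines = [
        f"rules_python~~pip~{src},{dst},rules_python~~pip~{dst}"
        for src in names
        for dst in names
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


def compression_ratio(data):
    """Return raw size divided by zlib-compressed size (level 9)."""
    return len(data) / len(zlib.compress(data, 9))
```

For a real data point, pass the bytes of an actual `_repo_mapping` file to `compression_ratio` instead of the synthetic input.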
