🐞 bug report
Affected Rule
The issue is caused by the rule: py_binary
Is this a regression?
Yes, the previous version in which this bug was not present was: 1.8.5
Description
We use pkg_tar (1.2.0) to package py_binary targets for distribution. Some of these tarballs go into Docker images that rules_oci hashes. After upgrading to 2.0.1, I observed that the same py_binary yields a different image hash depending on whether it is built in the target or the exec configuration. I narrowed this down to commit 352f405, which now includes the config mode in the py_binary's build data via
"CONFIG_MODE": "EXEC" if _is_tool_config(ctx) else "TARGET",
so <target>.build_data.txt differs between the two configurations. This is the only difference between the built py_binaries.
I can't think of a use case for knowing the build config mode at runtime in a py_binary, and it now makes the py_binary (and anything downstream of it) unreproducible across configurations and breaks path mapping as well. Can this be removed? Or, if it's needed for some reason, can it be put behind stamp?
🔬 Minimal Reproduction
Build a py_binary and look in bazel-out for the build_data.txt.
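To make the comparison concrete, here is a minimal sketch of a helper that hashes two copies of the build data file and lists the lines unique to each. It treats the file as opaque text lines and assumes nothing about its format; the function names and the paths in the usage comment are hypothetical, not part of rules_python.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def differing_lines(target_path, exec_path):
    """Return (lines only in the target copy, lines only in the exec copy).

    The files are compared line by line as plain text; no particular
    build data format is assumed.
    """
    target_lines = Path(target_path).read_text().splitlines()
    exec_lines = Path(exec_path).read_text().splitlines()
    only_target = [line for line in target_lines if line not in exec_lines]
    only_exec = [line for line in exec_lines if line not in target_lines]
    return only_target, only_exec


# Example usage (hypothetical paths; substitute the two copies found under bazel-out):
#   print(sha256_of("target_cfg/app.build_data.txt"))
#   print(sha256_of("exec_cfg/app.build_data.txt"))
#   print(differing_lines("target_cfg/app.build_data.txt",
#                         "exec_cfg/app.build_data.txt"))
```

If the report above is accurate, the only lines returned should be the ones carrying the config mode.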
🌍 Your Environment
Operating System:
CentOS 9
5.14.0-687.el9.x86_64
Output of bazel version:
Rules_python version:
Anything else relevant?
I made this patch to fix it for us:
Subject: [PATCH] Remove CONFIG_MODE from build data generation
Commit 352f405 added CONFIG_MODE to the build data generation environment,
which causes the build data file to differ between exec and target
configurations. This breaks build reproducibility and causes cache misses
when the same py_binary is built in both configurations.
Since the build data file is part of the output, having CONFIG_MODE in the
action environment means the same py_binary target produces different
outputs depending on whether it's in exec or target config, defeating Bazel's
caching and reproducibility guarantees.
This removes CONFIG_MODE from the environment to restore reproducibility.
---
python/private/py_executable.bzl | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/python/private/py_executable.bzl b/python/private/py_executable.bzl
index c2a4a8f8..c03fb2bf 100644
--- a/python/private/py_executable.bzl
+++ b/python/private/py_executable.bzl
@@ -1565,7 +1565,8 @@ def _write_build_data(ctx):
env = {
# Include config mode so that binaries can detect if they're
# being used as a build tool or not, allowing for runtime optimizations.
- "CONFIG_MODE": "EXEC" if _is_tool_config(ctx) else "TARGET",
+ # NOTE: Disabled to avoid cache misses between exec and target configs
+ # "CONFIG_MODE": "EXEC" if _is_tool_config(ctx) else "TARGET",
"INFO_FILE": info_file.path if info_file else "",
"OUTPUT": build_data.path,
# Include this so it's explicit, otherwise, one has to detect
--
2.52.0