Refactor `scripts/create_repository.py`, fix some issues (#1639)

* First create_repository.py refactoring

Preserves the behavior of the previous version in preparation for further improvements and eventual behavior changes. Part of #1482.

Tested by running the original script, calling `git stash third_party/...`, then running the updated script. Confirmed that the updated script produced no new changes in the working tree compared to the indexed changes.

* Second create_repository.py refactoring

Preserves the behavior of the previous version in preparation for further improvements and eventual behavior changes. Part of #1482.

This change eliminates a lot of duplication, and replaces the previous functions to add trailing commas to the output with a `re.sub()` call.

Tested by running the original script, calling `git stash third_party/...`, then running the updated script. Confirmed that the updated script produced no new changes in the working tree compared to the indexed changes.

Also:

- Added `#!/usr/bin/env python3` and set file mode 0755 to enable running the script directly instead of as an argument to `python3`.
- Used `pathlib.Path` on `__file__` to enable running the script from any directory, not just from inside `scripts/`.
- Updated `scripts/README.md` to reflect some of the script changes, and to improve the Markdown formatting in general.
- Added the `DOWNLOADED_ARTIFACTS_FILE` variable, added its value to `.gitignore`, and added a `Path.unlink()` at the end of the script to clean it up.

* Third create_repository.py refactoring

Preserves the behavior of the previous version in preparation for further improvements and eventual behavior changes. Part of #1482.

This vastly improves performance by _not_ downloading and hashing artifacts that are already present in the current configuration.

Added `PROTOBUF_JAVA_VERSION = "4.28.2"` as a core artifact to avoid downgrades, and to signal the importance of this particular artifact.

Tested by running the original script, calling `git stash third_party/...`, then running the updated script. Confirmed that the updated script produced no new changes in the working tree compared to the indexed changes, modulo the previous `protobuf-java` downgrade.

* Small create_repository.py format updates, fixes

Added missing trailing comma in `COORDINATE_GROUPS[0]`. Updated `get_label` to check for groups starting with `org.scala-lang.` Sorted `deps` in `to_rules_scala_compatible_dict` and always set `deps` in the return value, even when empty.

* Ensure Scala 2 libs get "_2" repo names in Scala 3

Without this change, it was possible for Scala 3 core library versions to be overwritten with Scala 2 versions specified as dependencies of other jars. Specifically, all the `@io_bazel_rules_scala_scala_*_2` deps explicitly added to the `scala_3_{1,2,3,4,5}.bzl` files in #1631 for Scalafmt were stripped of the `_2` suffix before this change.

Also computed `is_scala_3` in `create_file` and passed it through where needed. At some point it might be worth refactoring the script into a proper object instead.

* Bump Scalafmt to 3.8.3 in create_repository.py

This matches the version set in #1631.

* Prevent create_repository.py from downgrading jars

The script will now only update an existing artifact if the newly resolved version is greater than that already in the third_party/repositories file. Part of #1482.

Tested by running on `master` both before and after #1631. The script produced no new changes, and is even faster now, since it also won't try to downgrade existing artifact versions.

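As a rough illustration of the "only update if newer" rule, here is a minimal sketch. The name `is_newer_version` appears later in this message, but its body here, along with the `version_key` and `update_if_newer` helpers and the plain `dict` of current versions, is an assumption made for illustration and not the script's actual code.

```python
import re


def version_key(version):
    """Turn "1.10.1" into (1, 10, 1) so versions compare numerically.

    Assumption: non-numeric qualifiers (such as "-RC1") are ignored,
    which is a simplification of real Maven version ordering.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def is_newer_version(resolved, current):
    """Return True only if `resolved` is strictly greater than `current`."""
    return version_key(resolved) > version_key(current)


def update_if_newer(current_versions, artifact, resolved_version):
    """Record `resolved_version` unless it would downgrade `artifact`.

    `current_versions` is a hypothetical dict mapping artifact names to
    the versions already present in the repository file.
    """
    current = current_versions.get(artifact)
    if current is None or is_newer_version(resolved_version, current):
        current_versions[artifact] = resolved_version
        return True
    return False
```

Comparing integer tuples rather than raw strings matters for cases like `1.10.1` versus `1.9.6`, where a plain string comparison would rank the older version higher.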
More specifically, this change prevents the following downgrades:

- scalap: from latest Scala version to an earlier version, including 2.13.14 to 2.13.11
- com.lihaoyi.pprint: from pprint_3:0.9.0 to pprint_2.13:0.6.4
- compiler-interface: 1.10.1 to 1.3.5 or 1.9.6
- com.lihaoyi.geny: from geny_3:1.1.1 to geny_2.13:0.6.5

* Ensure all artifact deps written to repo files

Updates the main loop in `create_file` to iterate over the newly generated artifact dictionary, not the original dictionary parsed from the existing file. This ensures that all dependencies of the updated artifacts are written to the file.

This also improves performance when rerunning, since the script will no longer keep trying to fetch the previously missing artifacts.

Tested by running once, `git add`ing the results, and running again to ensure the second run was a no-op.

* Update get_label, fix scala3 label bug

Fixes a bug in the previous implementation whereby the `scala3_` components of org.scala-lang artifact labels listed in an artifact's `deps` weren't transformed to `scala_`.

Improved the handling of org.scala-lang artifacts more generally via the new `adjust_scala_lang_label` function.

Refactored the `COORDINATE_GROUPS` array into a series of separate `set` variables named after their `get_label` transformations. Makes the `get_label` implementation read a little more easily.

* Replace `split_artifact_and_version`

I finally noticed that `get_maven_coordinates` already did the same parsing, essentially. I migrated the comment and implementation from `split_artifact_and_version` into it.

* Check against current ResolvedArtifact instances

Updates `create_artifact_version_map` to `create_current_resolved_artifacts_map` to set up future improvements. Specifically, an existing artifact's dependency coordinates aren't updated unless the resolved artifact version is newer. A small change after this will ensure all existing dependency coordinates get updated.
This could be a one-off change, or we could decide to keep it; either way, it would be a very small one to make.

Also:

- Added `MavenCoordinates.artifact_name` to replace `artifact_name_from_maven_coordinates`.
- Renamed `is_newer_than_current_version` to `is_newer_version`.
- Added logic in `create_artifact_metadata_map` to ensure values with `testonly` set to `True` aren't part of the automatic resolution process.

* Handle scalap, org.thesamet.scalapb special cases

Preserves the existing labels for these artifacts.

Tested by running in a new branch that updates the direct dependencies of core artifacts. Prior to this change, the `scalap` repo and some of the `org.thesamet.scalapb` repos were duplicated with a new label. After this change, that duplication disappears.

* Add --version, emit command, stderr only on error

These are changes suggested by @WojciechMazur in #1639.

* Raise and catch SubprocessError, return exit code

Further embellishment of the suggestions from @WojciechMazur in #1639. This way we can see exactly which version updates encountered an error, and how many, while continuing to attempt to update other versions.

* Hoist output_dir from create_or_update_repo_file

Makes the logic slightly easier to follow.

* Hoist OUTPUT_DIR as top level constant

I still like passing it as an argument to `create_or_update_repository_file`.

* Add --output_dir flag, add defaults to --help

Making the output directory configurable from the command line was the next logical extension. This makes it possible to see what the script will generate from scratch without having to erase the existing repo files.

Figured it would be nice to see the default values in the `--help` output.

* Create a new empty file if no previous file exists

Decided it might be better to literally start from scratch in `--output_dir` than trying to copy a file from `OUTPUT_DIR`. One could always copy files from `OUTPUT_DIR` before running the script if so desired.

Also changed the `__main__` logic to exit immediately on `CreateRepositoryError` rather than attempt to keep going.

* Remove redundant `file.exists()` check

Eliminated the check in `create_or_update_repository_file` in favor of that in `copy_previous_version_or_create_new_file_if_missing`.

* Improve `create_repository.py --help` docstrings

* Refactor `get_label` in `create_repository.py`

`get_label` now uses the `SPECIAL_CASE_GROUP_LABELS` dict instead of the `LAST_GROUP_COMPONENT_GROUP` and `NEXT_TO_LAST_GROUP_COMPONENT_GROUP` sets.

Also added `com.google.guava` to `ARTIFACT_LABEL_ONLY_GROUPS` to generate the correct `io_bazel_rules_scala_guava` label. Added `com.google.api.grpc` and `dev.dirs.directories` to `SCALA_PROTO_RULES_GROUPS`.

Updated indentation of `ARTIFACT_LABEL_ONLY_GROUPS` and `GROUP_AND_ARTIFACT_LABEL_GROUPS` elements.

* Emit correct label for scala-collection-compat

Specifically, `org_scala_lang_modules_scala_collection_compat`.
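
To make the two label outcomes named above concrete, the following sketch shows one way such labels might be derived from a Maven group and artifact. The set names come from this message, but their contents beyond `com.google.guava`, the placement of `org.scala-lang.modules`, the `LABEL_PREFIX` constant, the helper functions, and the `get_label` signature are all assumptions for illustration; the real `get_label` may work differently.

```python
import re

# Assumed contents; only com.google.guava is confirmed by the message above.
ARTIFACT_LABEL_ONLY_GROUPS = {"com.google.guava"}
# Assumption: org.scala-lang.modules is treated as a group-and-artifact group.
GROUP_AND_ARTIFACT_LABEL_GROUPS = {"org.scala-lang.modules"}
# Hypothetical constant for the common repository name prefix.
LABEL_PREFIX = "io_bazel_rules_scala_"


def strip_scala_suffix(artifact):
    """Drop a trailing Scala version suffix such as "_2.13" or "_3"."""
    return re.sub(r"_(2\.\d+|3)$", "", artifact)


def to_identifier(text):
    """Replace dots and dashes with underscores."""
    return re.sub(r"[.\-]", "_", text)


def get_label(group, artifact):
    """Build a repository label for the given Maven group and artifact."""
    name = to_identifier(strip_scala_suffix(artifact))
    if group in ARTIFACT_LABEL_ONLY_GROUPS:
        return LABEL_PREFIX + name
    if group in GROUP_AND_ARTIFACT_LABEL_GROUPS:
        return to_identifier(group) + "_" + name
    raise ValueError(f"no label rule for group: {group}")
```

Under these assumptions, `get_label("com.google.guava", "guava")` yields `io_bazel_rules_scala_guava`, and `get_label("org.scala-lang.modules", "scala-collection-compat_2.13")` yields `org_scala_lang_modules_scala_collection_compat`.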