Skip to content

Commit

Permalink
Fix part of #1719, part of #3709: Add build stats CI workflow (#4092)
Browse files Browse the repository at this point in the history
## Explanation
Fixes part of #1719 and #3709

This PR introduces a new script & CI workflow for computing build stats
to compare both AABs and universal APKs between develop and the changes
in a given PR, as part of fixing #1719 (though this PR doesn't cover
everything outlined in that PR). This information is then detailed and
uploaded as a CI build artifact, and summarized & posted as a comment in
the PR. Some details included in the summary report:
- APK file/download size differences
- Method count differences
- Feature/permission differences
- New/removed resources & assets

The script supports computing differences for multiple "profiles" at the
same time, and the CI workflow has been set up to compute four:
1. dev
2. alpha
3. beta
4. GA

This workflow will be optional since it's very expensive to run (it has
to assemble 8 builds, 6 of which are Proguarded). It also doesn't really
need to be run in order to approve a PR, though reviewers may insist on
waiting for large or suspicious changes
(such as PRs introducing new dependencies) to ensure the actual affected
changes are as expected.

In order to mitigate this expense, the CI workflow runs on a scheduled
cron job off of develop across all open PRs and checks them in a group.
It runs at most once per day (based on
https://github.com/orgs/community/discussions/55768#discussioncomment-5941720)
so multiple changes to a PR will be picked up with a single check in the
next cron run. Currently, it will run even for a PR that hasn't changed
since the last run (but this is something that can be improved in the
future if it needs to be). It's being scheduled for 2:30am (02:30) UTC
which seems to have a few specific benefits:
- Per GitHub documentation, initiating the workflow outside the start of
the hour should reduce likelihood of cancellation (since the start of
the hour tends to use the most resources):
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule.
- This corresponds to 7:30pm PT, 2:30am GMT, 8:00am IST, 5:30am EAT, and
12:30pm AEST (just as a basis for a different part of the world). It's
actually a very nice time that shouldn't overlap with almost any
development in main locations around the world, so it hopefully won't
impact Oppia organization GitHub CI resources.

The example output from these workflows can be observed in a few places:
- Later in this PR (for back when the PR was configured to run the new
workflow per PR change).
- In #4261 which demonstrates the large math PRs and how they changed
the builds back before those were merged.
-
https://github.com/BenHenning/oppia-android/actions/workflows/stats.yml
and https://github.com/BenHenning/oppia-android/pulls (specifically:
BenHenning#14,
BenHenning#13,
BenHenning#12) which demonstrates
the workflow running correctly from a scheduled cron
(https://github.com/BenHenning/oppia-android/actions/runs/9232187176)
and posting the updates to open PRs.

Beyond that, implementing this utility involved several significant
changes to various systems, including the build graph:
- Three new utilities were added for the script: Aapt2Client,
ApkAnalyzerClient, and BundleToolClient. Each of these correspond to
Android CLI utilities, but each required special considerations:
- Aapt2Client requires direct access to the Android SDK, but fortunately
android_sdk_repository exposes this as a target so it's trivial to pass
it in & call it. Some build information is needed, too (see next outer
point).
- ApkAnalyzerClient couldn't use the apkanalyzer CLI contained within
the SDK since it's not exported by android_sdk_repository. Instead, we
needed to depend on the CLI's internal implementation library (which I
suspect is what Android Studio probably uses for its own APK Analyzer
tool). This required some new implementation.
- BundleToolClient fortunately can call right into the bundle tool
library that we use when building AABs, but unfortunately that tool
appears to not be designed to be called multiple times in the same
process. Because Java doesn't support forking, we actually needed to
fake a fork function by starting a new Java process using the current
process's classpath in order to re-run bundle tool for each needed
routine. Additionally, bundle tool required
https://github.com/oppia/archive-patcher (which needed new BUILD files
since it only supported Gradle building previously) and a non-Android
version of Guava (see below for the changes this has caused).
- A new build_vars.bzl was introduced to define the build SDK & build
tools versions (this is done in a way where they can actually be passed
to the new script's utilities since it needs to access aapt2).
- rules_kotlin had a bug where resources wouldn't be pulled in properly
for kt_jvm_library (see
bazelbuild/rules_kotlin#281), but this was
mitigated in a previous PR by upgrading rules_kotlin past alpha 2.
- The new functionality required the JRE-compatible version of Guava
(over the Android-constrained library used in the codebase today), but
this introduces a one-version issue. The solution ended up being
isolating the JRE-compatible Guava library to its own library with a
slightly hacky direct reference to it in BundleToolClient. Some of the
other attempts at solving this resulted in some Maven reference cleanups
in existing script documentation. This functionality will be improved in
downstream PRs, but other attempts that were originally made to isolate
this cleanly were:
- Introduce multiple maven_install files and isolate dependencies into:
production, tests, scripts. This has a number of nice benefits (more
correct licenses and faster Maven dependency fetches for production),
but it results in very tricky one-version violations for test targets
that cross dependencies between production and tests.
- Isolated maven_install just for scripts. This is closer to the
solution we'll want long-term, but it was too much complexity to fully
introduce in this PR so it's been reworked into a downstream PR that can
focus on cleaning up third-party dependency management across the whole
codebase.

This PR is introducing a few new dependencies that, in turn, pull in a
*bunch* of transitive dependencies. These are all due to the new
``apkanalyzer`` dependency. While it will affect licenses for this
specific PR, once third-party dependencies for scripts are cleaned up in
a downstream PR they will be moved out (since they are script-only
dependencies).

Separately, also note that the AAPT2 utility requires stdout to be
processed continuously in order for the process to finish. This was one
of the primary reasons CommandExecutorImpl was reworked in #4929.

For testing: most of the changes in this PR have been extensively
manually tested. However, the new utilities are lacking significant
automated tests. Since this utility is a nice-to-have for the rest of
the Bazel PR chain, it's being prioritized to be merged in spite of
lacking code coverage. #4971 has been filed to track adding these
missing tests in the long-term.

## Essential Checklist
- [x] The PR title and explanation each start with "Fix #bugnum: " (If
this PR fixes part of an issue, prefix the title with "Fix part of
#bugnum: ...".)
- [x] Any changes to
[scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets)
files have their rationale included in the PR explanation.
- [x] The PR follows the [style
guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide).
- [x] The PR does not contain any unnecessary code changes from Android
Studio
([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)).
- [x] The PR is made from a branch that's **not** called "develop" and
is up-to-date with "develop".
- [x] The PR is **assigned** to the appropriate reviewers
([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)).

## For UI-specific PRs only
N/A -- This only affects CI workflows & the build system. Technically,
some dependency changes in the build system could have UI effects, but
there should be no such changes in this PR.

---------

Co-authored-by: Adhiambo Peres <[email protected]>
Co-authored-by: Sean Lip <[email protected]>
  • Loading branch information
3 people authored Jun 12, 2024
1 parent 4c7f811 commit 23def1b
Show file tree
Hide file tree
Showing 29 changed files with 4,039 additions and 74 deletions.
216 changes: 216 additions & 0 deletions .github/workflows/stats.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,216 @@
# Contains jobs corresponding to stats, including build stats due to changes in a PR.

name: Stats Checks & Reports

on:
workflow_dispatch:
schedule:
- cron: "30 02 * * *"

permissions:
pull-requests: write

jobs:
find_open_pull_requests:
name: Find open PRs
runs-on: ubuntu-20.04
outputs:
matrix: ${{ steps.compute-pull-request-matrix.outputs.matrix }}
env:
GH_TOKEN: ${{ github.token }}
steps:
- uses: actions/checkout@v4

- name: Compute PR matrix
id: compute-pull-request-matrix
# Remove spaces to ensure the matrix output is on one line. Reference:
# https://stackoverflow.com/a/3232433.
run: |
CURRENT_OPEN_PR_INFO="$(gh pr list --json number,baseRefName,headRefName,headRepository,headRepositoryOwner | tr -d '[:space:]')"
echo "matrix={\"prInfo\": $CURRENT_OPEN_PR_INFO}" >> "$GITHUB_OUTPUT"
build_stats:
name: Build Stats
needs: find_open_pull_requests
runs-on: ubuntu-20.04
# Reduce parallelization due to high build times, and allow individual PRs to fail.
strategy:
fail-fast: false
max-parallel: 5
matrix: ${{ fromJson(needs.find_open_pull_requests.outputs.matrix) }}
env:
ENABLE_CACHING: false
CACHE_DIRECTORY: ~/.bazel_cache
steps:
- name: Compute PR head owner/repo reference
env:
PR_HEAD_REPO: ${{ matrix.prInfo.headRepository.name }}
PR_HEAD_REPO_OWNER: ${{ matrix.prInfo.headRepositoryOwner.login }}
run: |
echo "PR_HEAD=$PR_HEAD_REPO_OWNER/$PR_HEAD_REPO" >> "$GITHUB_ENV"
- name: Print PR information for this run
env:
PR_BASE_REF_NAME: ${{ matrix.prInfo.baseRefName }}
PR_HEAD_REF_NAME: ${{ matrix.prInfo.headRefName }}
PR_NUMBER: ${{ matrix.prInfo.number }}
run: |
echo "PR $PR_NUMBER is merging into $PR_BASE_REF_NAME from https://github.com/$PR_HEAD branch $PR_HEAD_REF_NAME."
- name: Set up JDK 9
uses: actions/setup-java@v1
with:
java-version: 9

- name: Set up Bazel
uses: abhinavsingh/setup-bazel@v3
with:
version: 4.0.0

# For reference on this & the later cache actions, see:
# https://github.com/actions/cache/issues/239#issuecomment-606950711 &
# https://github.com/actions/cache/issues/109#issuecomment-558771281. Note that these work
# with Bazel since Bazel can share the most recent cache from an unrelated build and still
# benefit from incremental build performance (assuming that actions/cache aggressively removes
# older caches due to the 5GB cache limit size & Bazel's large cache size).
- uses: actions/cache@v2
id: cache
with:
path: ${{ env.CACHE_DIRECTORY }}
key: ${{ runner.os }}-${{ env.CACHE_DIRECTORY }}-bazel-binary-${{ github.sha }}
restore-keys: |
${{ runner.os }}-${{ env.CACHE_DIRECTORY }}-bazel-binary-
${{ runner.os }}-${{ env.CACHE_DIRECTORY }}-bazel-
# This check is needed to ensure that Bazel's unbounded cache growth doesn't result in a
# situation where the cache never updates (e.g. due to exceeding GitHub's cache size limit)
# thereby only ever using the last successful cache version. This solution will result in a
# few slower CI actions around the time cache is detected to be too large, but it should
# incrementally improve thereafter.
- name: Ensure cache size
env:
BAZEL_CACHE_DIR: ${{ env.CACHE_DIRECTORY }}
run: |
# See https://stackoverflow.com/a/27485157 for reference.
EXPANDED_BAZEL_CACHE_PATH="${BAZEL_CACHE_DIR/#\~/$HOME}"
CACHE_SIZE_MB=$(du -smc $EXPANDED_BAZEL_CACHE_PATH | grep total | cut -f1)
echo "Total size of Bazel cache (rounded up to MBs): $CACHE_SIZE_MB"
# Use a 4.5GB threshold since actions/cache compresses the results, and Bazel caches seem
# to only increase by a few hundred megabytes across changes for unrelated branches. This
# is also a reasonable upper-bound (local tests as of 2021-03-31 suggest that a full build
# of the codebase (e.g. //...) from scratch only requires a ~2.1GB uncompressed/~900MB
# compressed cache).
if [[ "$CACHE_SIZE_MB" -gt 4500 ]]; then
echo "Cache exceeds cut-off; resetting it (will result in a slow build)"
rm -rf $EXPANDED_BAZEL_CACHE_PATH
fi
- name: Configure Bazel to use a local cache
env:
BAZEL_CACHE_DIR: ${{ env.CACHE_DIRECTORY }}
run: |
EXPANDED_BAZEL_CACHE_PATH="${BAZEL_CACHE_DIR/#\~/$HOME}"
echo "Using $EXPANDED_BAZEL_CACHE_PATH as Bazel's cache path"
echo "build --disk_cache=$EXPANDED_BAZEL_CACHE_PATH" >> $HOME/.bazelrc
shell: bash

# This checks out the actual true develop branch separately to ensure that the stats check is
# run from the latest develop rather than the base branch (which might be different for
# chained PRs).
- name: Check out develop repository
uses: actions/checkout@v4
with:
path: develop

- name: Set up build environment
uses: ./develop/.github/actions/set-up-android-bazel-build-environment

- name: Check Bazel environment
run: |
cd develop
bazel info
- name: Check out base repository and branch
env:
PR_BASE_REF_NAME: ${{ matrix.prInfo.baseRefName }}
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ env.PR_BASE_REF_NAME }}
path: base

- name: Check out head repository and branch
env:
PR_HEAD_REF_NAME: ${{ matrix.prInfo.headRefName }}
uses: actions/checkout@v4
with:
fetch-depth: 0
repository: ${{ env.PR_HEAD }}
ref: ${{ env.PR_HEAD_REF_NAME }}
path: head

# Note that Bazel is shutdown between builds since multiple Bazel servers will otherwise end
# up being active (due to multiple repositories being used) and this can quickly overwhelm CI
# worker resources.
- name: Build Oppia dev, alpha, beta, and GA (feature branch)
run: |
cd head
git log -n 1
bazel build -- //:oppia_dev //:oppia_alpha //:oppia_beta //:oppia_ga
cp bazel-bin/oppia_dev.aab ../develop/oppia_dev_with_changes.aab
cp bazel-bin/oppia_alpha.aab ../develop/oppia_alpha_with_changes.aab
cp bazel-bin/oppia_beta.aab ../develop/oppia_beta_with_changes.aab
cp bazel-bin/oppia_ga.aab ../develop/oppia_ga_with_changes.aab
bazel shutdown
- name: Build Oppia dev, alpha, beta, and GA (base branch)
run: |
cd base
git log -n 1
bazel build -- //:oppia_dev //:oppia_alpha //:oppia_beta //:oppia_ga
cp bazel-bin/oppia_dev.aab ../develop/oppia_dev_without_changes.aab
cp bazel-bin/oppia_alpha.aab ../develop/oppia_alpha_without_changes.aab
cp bazel-bin/oppia_beta.aab ../develop/oppia_beta_without_changes.aab
cp bazel-bin/oppia_ga.aab ../develop/oppia_ga_without_changes.aab
bazel shutdown
- name: Run stats analysis tool (develop branch)
run: |
cd develop
git log -n 1
bazel run //scripts:compute_aab_differences -- \
$(pwd)/brief_build_summary.log $(pwd)/full_build_summary.log \
dev $(pwd)/oppia_dev_without_changes.aab $(pwd)/oppia_dev_with_changes.aab \
alpha $(pwd)/oppia_alpha_without_changes.aab $(pwd)/oppia_alpha_with_changes.aab \
beta $(pwd)/oppia_beta_without_changes.aab $(pwd)/oppia_beta_with_changes.aab \
ga $(pwd)/oppia_ga_without_changes.aab $(pwd)/oppia_ga_with_changes.aab
# Reference: https://github.com/peter-evans/create-or-update-comment#setting-the-comment-body-from-a-file.
# Also, for multi-line env values, see: https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#multiline-strings.
- name: Extract reports for uploading & commenting
env:
PR_NUMBER: ${{ matrix.prInfo.number }}
id: compute-comment-body
run: |
{
echo 'comment_body<<EOF'
cat $GITHUB_WORKSPACE/develop/brief_build_summary.log
echo EOF
} >> "$GITHUB_OUTPUT"
FULL_BUILD_SUMMARY_FILE_NAME="full_build_summary_pr_$PR_NUMBER.log"
FULL_BUILD_SUMMARY_FILE_PATH="$GITHUB_WORKSPACE/develop/$FULL_BUILD_SUMMARY_FILE_NAME"
echo "FULL_BUILD_SUMMARY_FILE_NAME=$FULL_BUILD_SUMMARY_FILE_NAME" >> "$GITHUB_ENV"
echo "FULL_BUILD_SUMMARY_FILE_PATH=$FULL_BUILD_SUMMARY_FILE_PATH" >> "$GITHUB_ENV"
cp "$GITHUB_WORKSPACE/develop/full_build_summary.log" "$FULL_BUILD_SUMMARY_FILE_PATH"
- name: Add build stats summary comment
env:
PR_NUMBER: ${{ matrix.prInfo.number }}
uses: peter-evans/create-or-update-comment@v1
with:
issue-number: ${{ env.PR_NUMBER }}
body: ${{ steps.compute-comment-body.outputs.comment_body }}

- uses: actions/upload-artifact@v2
with:
name: ${{ env.FULL_BUILD_SUMMARY_FILE_NAME }}
path: ${{ env.FULL_BUILD_SUMMARY_FILE_PATH }}
41 changes: 28 additions & 13 deletions WORKSPACE
Original file line number Diff line number Diff line change
Expand Up @@ -4,15 +4,16 @@ This file lists and imports all external dependencies needed to build Oppia Andr

load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive", "http_jar")
load("//:build_vars.bzl", "BUILD_SDK_VERSION", "BUILD_TOOLS_VERSION")
load("//third_party:versions.bzl", "HTTP_DEPENDENCY_VERSIONS", "MAVEN_REPOSITORIES", "get_maven_dependencies")

# Android SDK configuration. For more details, see:
# https://docs.bazel.build/versions/master/be/android.html#android_sdk_repository
# TODO(#1542): Sync Android SDK version with the manifest.
android_sdk_repository(
name = "androidsdk",
api_level = 33,
build_tools_version = "29.0.2",
api_level = BUILD_SDK_VERSION,
build_tools_version = BUILD_TOOLS_VERSION,
)

# Oppia's backend proto API definitions.
Expand Down Expand Up @@ -160,6 +161,13 @@ git_repository(
shallow_since = "1679426649 -0700",
)

git_repository(
name = "archive_patcher",
commit = "d1c18b0035d5f669ddaefadade49cae0748f9df2",
remote = "https://github.com/oppia/archive-patcher",
shallow_since = "1642022460 -0800",
)

bind(
name = "databinding_annotation_processor",
actual = "//tools/android:compiler_annotation_processor",
Expand Down Expand Up @@ -214,17 +222,24 @@ load("@maven//:defs.bzl", "pinned_maven_install")

pinned_maven_install()

http_jar(
name = "guava_android",
sha256 = HTTP_DEPENDENCY_VERSIONS["guava_android"]["sha"],
urls = [
"{0}/com/google/guava/guava/{1}-android/guava-{1}-android.jar".format(
url_base,
HTTP_DEPENDENCY_VERSIONS["guava_android"]["version"],
)
for url_base in DAGGER_REPOSITORIES + MAVEN_REPOSITORIES
],
)
[
http_jar(
name = "guava_%s" % guava_type,
sha256 = HTTP_DEPENDENCY_VERSIONS["guava_%s" % guava_type]["sha"],
urls = [
"{0}/com/google/guava/guava/{1}-{2}/guava-{1}-{2}.jar".format(
url_base,
HTTP_DEPENDENCY_VERSIONS["guava_%s" % guava_type]["version"],
guava_type,
)
for url_base in DAGGER_REPOSITORIES + MAVEN_REPOSITORIES
],
)
for guava_type in [
"android",
"jre",
]
]

http_jar(
name = "kotlinx-coroutines-core-jvm",
Expand Down
2 changes: 2 additions & 0 deletions build_vars.bzl
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
BUILD_SDK_VERSION = 33
BUILD_TOOLS_VERSION = "29.0.2"
4 changes: 2 additions & 2 deletions oppia_android_application.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -280,7 +280,7 @@ _bundle_module_zip_into_deployable_aab = rule(
"_bundletool_tool": attr.label(
executable = True,
cfg = "host",
default = "//third_party:android_bundletool",
default = "//third_party:android_bundletool_binary",
),
},
implementation = _bundle_module_zip_into_deployable_aab_impl,
Expand Down Expand Up @@ -316,7 +316,7 @@ _generate_apks_and_install = rule(
"_bundletool_tool": attr.label(
executable = True,
cfg = "host",
default = "//third_party:android_bundletool",
default = "//third_party:android_bundletool_binary",
),
},
executable = True,
Expand Down
10 changes: 10 additions & 0 deletions scripts/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,16 @@ package_group(
packages = ["//scripts/src/java/..."],
)

kt_jvm_binary(
name = "compute_aab_differences",
testonly = True,
data = ["@androidsdk//:aapt2_binary"],
main_class = "org.oppia.android.scripts.apkstats.ComputeAabDifferencesKt",
runtime_deps = [
"//scripts/src/java/org/oppia/android/scripts/apkstats:compute_aab_differences_lib",
],
)

kt_jvm_binary(
name = "compute_affected_tests",
testonly = True,
Expand Down
69 changes: 69 additions & 0 deletions scripts/src/java/org/oppia/android/scripts/apkstats/Aapt2Client.kt
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
package org.oppia.android.scripts.apkstats

import org.oppia.android.scripts.common.CommandExecutor
import org.oppia.android.scripts.common.CommandExecutorImpl
import org.oppia.android.scripts.common.ScriptBackgroundCoroutineDispatcher
import java.io.File

/**
* General utility for interfacing with AAPT2 in the local system at the specified working directory
* path and contained within the specified Android SDK (per the given path).
*
* Note that in order for binary dependencies to utilize this client, they must add a 'data'
* dependency on the AAPT2 binary included as part of the Android SDK, e.g.:
*
* ```bazel
* data = ["@androidsdk//:aapt2_binary"]
* ```
*
* @property workingDirectoryPath the path to the working directory in which instances of AAPT2
* should be executed
* @property buildToolsVersion the version of Android build tools installed & that should be used.
* This value should be coordinated with the build system used by the APKs accessed by this
* utility.
* @param scriptBgDispatcher the [ScriptBackgroundCoroutineDispatcher] to be used for running the
* AAPT2 command
* @property commandExecutor the [CommandExecutor] to use when accessing AAPT2
*/
class Aapt2Client(
private val workingDirectoryPath: String,
private val buildToolsVersion: String,
scriptBgDispatcher: ScriptBackgroundCoroutineDispatcher,
private val commandExecutor: CommandExecutor = CommandExecutorImpl(scriptBgDispatcher)
) {
private val workingDirectory by lazy { File(workingDirectoryPath) }
// Note that this pathing will not work by default on Windows (since executables end with '.exe').
private val aapt2Path by lazy {
File("external/androidsdk", "build-tools/$buildToolsVersion/aapt2").absolutePath
}

// CLI reference: https://developer.android.com/studio/command-line/apkanalyzer.

/** Returns the permissions dump as reported by AAPT2 for the specified APK. */
fun dumpPermissions(inputApkPath: String): List<String> {
return executeApkAnalyzerCommand("dump", "permissions", inputApkPath)
}

/** Returns the resources dump as reported by AAPT2 for the specified APK. */
fun dumpResources(inputApkPath: String): List<String> {
return executeApkAnalyzerCommand("dump", "resources", inputApkPath)
}

/**
* Returns badging information, that is, high-level details like supported locales and densities,
* from the specified APK's manifest.
*/
fun dumpBadging(inputApkPath: String): List<String> {
return executeApkAnalyzerCommand("dump", "badging", inputApkPath)
}

private fun executeApkAnalyzerCommand(vararg arguments: String): List<String> {
val result = commandExecutor.executeCommand(workingDirectory, aapt2Path, *arguments)
check(result.exitCode == 0) {
"Expected zero exit code (not ${result.exitCode}) for command: ${result.command}." +
"\nStandard output:\n${result.output.joinToString("\n")}" +
"\nError output:\n${result.errorOutput.joinToString("\n")}"
}
return result.output
}
}
Loading

0 comments on commit 23def1b

Please sign in to comment.