Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce code coverage analysis in CI #10180

Merged
merged 26 commits into from
Nov 20, 2023
Merged
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
ee7b9e5
introduce code coverage analysis in CI
Ekleog-NEAR Nov 15, 2023
f35de0e
try workspace cleaning in case it’s the thing breaking it all
Ekleog-NEAR Nov 15, 2023
82ffa9f
try removing llvm_profile_file from nextest job
Ekleog-NEAR Nov 15, 2023
5c21b41
remove export-prefix
Ekleog-NEAR Nov 15, 2023
4653fd5
try with proper github_env syntax
Ekleog-NEAR Nov 15, 2023
2a739fd
use simpler way of setting the env
Ekleog-NEAR Nov 15, 2023
bd4812a
add debug target listing
Ekleog-NEAR Nov 15, 2023
d1ad35f
try with llvm profile file override again
Ekleog-NEAR Nov 15, 2023
3ff8d46
remove llvm_profile_file trick
Ekleog-NEAR Nov 16, 2023
b089cdf
let cargo llvm-cov find the right cargo profile
Ekleog-NEAR Nov 16, 2023
b7571f9
try using what seems to work everywhere
Ekleog-NEAR Nov 16, 2023
41d72b8
debug what’s going wrong with pytests
Ekleog-NEAR Nov 16, 2023
dc548a1
actually delete all profraw files
Ekleog-NEAR Nov 16, 2023
e7c20cd
disable integration-tests for macos at the right place
Ekleog-NEAR Nov 16, 2023
557baed
try creating the target folder early?
Ekleog-NEAR Nov 16, 2023
d4c5a79
try not using pytest as workdir
Ekleog-NEAR Nov 16, 2023
41d555c
and with creating target folder?
Ekleog-NEAR Nov 16, 2023
4ad3102
try uploading the whole target folder
Ekleog-NEAR Nov 16, 2023
a974887
compress target before uploading
Ekleog-NEAR Nov 16, 2023
6705812
do not upload whole target folder
Ekleog-NEAR Nov 16, 2023
ae63426
discard preparatory build neard job
Ekleog-NEAR Nov 16, 2023
edccf7c
cleanup test lines
Ekleog-NEAR Nov 16, 2023
f1fcc7c
review comments
Ekleog-NEAR Nov 17, 2023
a1c1976
upload with codecov token to avoid intermittent failure to autodetect…
Ekleog-NEAR Nov 20, 2023
0bf0eac
Merge remote-tracking branch 'upstream/master' into coverage
Ekleog-NEAR Nov 20, 2023
c3d6ab5
Merge remote-tracking branch 'upstream/master' into coverage
Ekleog-NEAR Nov 20, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
196 changes: 132 additions & 64 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,24 +9,6 @@ on:
merge_group:

jobs:
build_binary:
name: "Build neard"
runs-on: ubuntu-22.04-16core
steps:
- uses: actions/checkout@v4
- uses: Swatinem/rust-cache@a95ba195448af2da9b00fb742d14ffaaf3c21f43
with:
prefix-key: "0" # change this to invalidate CI cache
shared-key: "cargo_nextest-linux"
save-if: "false" # use the cache from nextest, but don’t double-save
- run: cargo build --locked --profile dev-release -p neard --bin neard
- uses: actions/upload-artifact@v3
with:
name: neard
path: target/dev-release/neard
if-no-files-found: error
retention-days: 1

cargo_nextest:
name: "Cargo Nextest (${{matrix.name}})"
runs-on: ${{ matrix.os }}
Expand All @@ -35,39 +17,79 @@ jobs:
matrix:
include:
- name: Linux
id: linux
cache_id: linux
os: ubuntu-22.04-16core
flags: ""
run_integ_tests: true
- name: Linux Nightly
id: linux-nightly
cache_id: linux
os: ubuntu-22.04-16core
flags: "--features nightly,test_features"
run_integ_tests: true
- name: MacOS
id: macos
cache_id: macos
os: macos-latest-xlarge
# FIXME: some of these tests don't work very well on MacOS at the moment. Should fix
# them at earliest convenience :)
flags: "--exclude integration-tests --exclude node-runtime --exclude runtime-params-estimator --exclude near-network --exclude estimator-warehouse"
flags: "--exclude node-runtime --exclude runtime-params-estimator --exclude near-network --exclude estimator-warehouse"
run_integ_tests: false
timeout-minutes: 90
steps:
- uses: actions/checkout@v4

# Install all the required tools
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
crate: cargo-nextest
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
crate: cargo-deny
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
crate: cargo-llvm-cov

# Setup the dependency rust cache and llvm-cov
- uses: Swatinem/rust-cache@a95ba195448af2da9b00fb742d14ffaaf3c21f43
with:
prefix-key: "0" # change this to invalidate CI cache
shared-key: "cargo_nextest-${{ matrix.cache_id }}"
- run: cargo nextest run --locked --workspace -p '*' --cargo-profile dev-release --profile ci ${{ matrix.flags }}
- run: cargo llvm-cov show-env | tr -d "'" >> "$GITHUB_ENV"

# Run unit tests
- run: cargo nextest run --locked --workspace --exclude integration-tests --cargo-profile dev-release --profile ci ${{ matrix.flags }}
env:
RUST_BACKTRACE: short
- run: cargo llvm-cov report --profile dev-release --codecov --output-path unittests.json
- uses: codecov/codecov-action@eaaf4bedf32dbdc6b720b63067d99c4d77d6047d
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: unittests.json
fail_ci_if_error: true
flags: unittests,${{ matrix.id }}
# See https://github.com/taiki-e/cargo-llvm-cov/issues/292
- run: find target -name '*.profraw' -delete

# Run integration tests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmph, splitting these nextest run lines goes in opposition to #9415. If its only serving to provide some sort of categorization within the tool, I would personally much rather keep the single command option.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I think #9415 should be solved by using something like just test, not by having everything stuffed into nextest, among other reasons because if we did it we could move clippy out and have a cache for it, not needing to rebuild it from scratch each time :)

Also, this split doesn’t change the fact that cargo nextest run still runs all the tests and works fine. I do think the categorization within the tool is useful, especially considering yesterday’s discussion that debugging integration tests is very bad dev UX, and thus we’d mostly be interested in the code coverage for unit-tests only.

Copy link
Collaborator

@nagisa nagisa Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, so back in buildkite days we also had "cargo nextest run", but there were like 5 different ways to run integration tests and you had to pay really close attention to which specific invocation was failing – one of them failing would usually mean the others were passing.

This was exactly what prompted me to start working on CI in the first place. My honest evaluation is that we got into that state back then precisely because it was normal and natural to "just add another invocation of cargo run/test/nextest/etc with my snowflake combination of flags".

Having a single cargo nextest invocation I believe does a great job pushing against such configurations (and I hope it will eventually push enough to obviate the need for nightly vs. non-nightly configurations as well.) Splitting invocations into parts goes back towards normalizing having more of the snowflake configurations.

Now, to make this actionable at all – I wonder if nextest could be made to export test coverage on a per-test basis. All of the tests are distinct binary invocations, which should allow us to produce coverage files for each individual test (and therefore could then eventually be filtered by e.g. crate, module, etc. after the fact), not just two broad categories of unit vs integration tests.

Unfortunately cargo llvm-cov nextest appears to just run cargo nextest after setting up the environment, but perhaps target runner scripts could be used as a mechanism to isolate and produce coverage information for each individual test (there’s even a section on debug output in the documentation?)

Copy link
Collaborator

@nagisa nagisa Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could move clippy out and have a cache for it

I don’t see why we couldn’t have a cache right now. We just need to also cache the target/style directory (does rust-cache not do it…?)

FWIW my belief is that the current setup with target/style is actually better for local development purposes as developers no longer risk accidentally invalidating their cached proper build when they run cargo clippy -p foo or something of the sort (by the virtue of not needing to run cargo clippy manually at all.) Though now that I write this, I realized there’s a bug in how style tests set the TARGET_DIR – it should be different for each test that builds code (in particular themis should have its own TARGET_DIR so it does not get blocked waiting on the target directory lock…)

Copy link
Collaborator Author

@Ekleog-NEAR Ekleog-NEAR Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(does rust-cache not do it…?)

Considering the clippy test is always among the last ones to complete and is written as slow (so 120+ seconds), I’d guess it doesn’t do it.

Counter-proposal: what if we had a Justfile with:

  • A rule for unit tests
  • A rule for clippy that runs it with target_dir set to target/style
  • A rule for integration tests
  • A just test rule that runs all of them

Then developers could run just test, and CI could run the rules independently, and maybe even override the target folder depending on whether we can fix target folder caching some other way. We could also easily add the nightly configuration into the Justfile, so that there’s indeed a single command line to run. WDYT?

(I think changes to llvm-cov or cargo nextest might be a great idea as there are already many profraw files and we’d "just" need to figure out which are the right ones; but will not be reasonably doable before november 26, especially as I’m OOO the second half of next week and have a lot of other stuff to do by then)

(Also, fwiw, I personally don’t really care about having a single command line to run right now: it’s much faster for me to upload to CI and wait for the results, than to test locally, currently, but that’s part of the devenv issues we’re discussing on zulip right now, and you’re right that having a single command would in theory be good)

Copy link
Collaborator Author

@Ekleog-NEAR Ekleog-NEAR Nov 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, relatedly, in order to be able to use the github actions for eg. lychee (that I’m introducing in #10217), we’ll need to run the tests outside of cargo nextest anyway, unless we want to check them twice in CI. In particular, we could start using the clippy action, which AFAIU has much nicer error reporting than our current "manually read the logs" system.

So I think this confirms the want for a tool like just, so that we could have just test run lychee in a way similar to what is being run by CI :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so the weekend gave me ample of time to consider the last message. I agree that having nice reporting within GitHub is really nice.

But I also see a major (I think) flaw in what I think you’re proposing we should do. To double check, are you suggesting that we maintain .github/workflows/ci.yaml and a Justfile and that they would each implement the same checks independently? e.g. where Justfile would run cargo clippy, the GHA would run the clippy action? I think doing so is precisely the mistake I would like to avoid. If we’re adding a Justfile, the CI could run exactly the just commands that ought to be run as part of a local development flow (and ideally that command is singular or is split up into sub-jobs programatically) but that does not come with the nice error reporting benefits.

But unless we eschew those benefits, I don’t see how can we ensure, especially in the long term, that the checks run via the justfile remain synchronized with those run within GHA. And that means that ultimately we would still sometimes see instances where a successful just pass does not guarantee the CI passing (or vice versa, for that matter.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current goal is indeed to have the CI run just nextest-unit && coverage && just nextest-integration && coverage, where just nextest for local use would also run the two.

However, for clippy, in order to get the nice reporting, I would indeed have a just clippy for local development that would not be DRY with the clippy-action for github, because by definition clippy-action cannot know to use our Justfile (or literally any other way to run all the tests we could have), any more than just could automatically figure out the command clippy-action runs.

So I’d say that by definition, I’d say that we can only have two of the following three:

  • nice error reporting
  • not implementing error reporting ourselves
  • locally-run command is DRY with remotely-run command

I also personally think we should choose the first two, and whenever we come across an instance where just and the CI workflow don’t give the same result we can consider that as a bug and treat it by probably just copy-pasting again from the updated upstream clippy-action.

That being said, this PR is only about nextest splitting, so I’d argue that the discussion about clippy lints could be postponed for the day when we actually consider using clippy-action :) And it seems to me like you don’t disagree with the idea, if restricted to splitting the nextest step into two only?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, this PR is only about nextest splitting, so I’d argue that the discussion about clippy lints could be postponed for the day when we actually consider using clippy-action :) And it seems to me like you don’t disagree with the idea, if restricted to splitting the nextest step into two only?

I mean… whatever plan we come up and start following should scale to whatever future ambitions we may have, which is why I think discussing clippy and lychee is also important… In particular one other thing I would (eventually, not necessarily in this PR) love for the setup to have is automatic determination of which just commands to run for the CI.

This could be something similar to the wasmtime’s determine job which generates a job matrix for the successive jobs. In our case we’d parse something along the lines of just --summary and generate a job matrix consisting of all the individual subcommands to run across different builders.

One thing I already like about just setup is that it also allows for maintaining the different flag combinations for different OSes (so we can move --ignore integration-tests on macos down into the justfile.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular one other thing I would (eventually, not necessarily in this PR) love for the setup to have is automatic determination of which just commands to run for the CI.

While I’m philosophically of the same opinion, at the same time I’m wondering whether the time to implement and the effort to maintain such a determination are financially worth the savings in compute time 😅 but it’d be really nice to have that!

- run: cargo nextest run --locked --package integration-tests --cargo-profile dev-release --profile ci ${{ matrix.flags }}
if: matrix.run_integ_tests
env:
RUST_BACKTRACE: short
- run: cargo llvm-cov report --profile dev-release --codecov --output-path integration-tests.json
if: matrix.run_integ_tests
- uses: codecov/codecov-action@eaaf4bedf32dbdc6b720b63067d99c4d77d6047d
if: matrix.run_integ_tests
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: integration-tests.json
nagisa marked this conversation as resolved.
Show resolved Hide resolved
fail_ci_if_error: true
flags: integration-tests,${{ matrix.id }}

protobuf_backward_compat:
name: "Protobuf Backward Compatibility"
runs-on: ubuntu-latest
runs-on: ubuntu-22.04-8core
steps:
- uses: actions/checkout@v4
- uses: bufbuild/buf-setup-action@1158f4fa81bc02e1ff62abcca6d516c9e24c77da
Expand All @@ -77,48 +99,64 @@ jobs:

py_backward_compat:
name: "Backward Compatibility"
needs: build_binary
runs-on: ubuntu-22.04
defaults:
run:
working-directory: ./pytest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: 3.11
cache: pip
- uses: actions/download-artifact@v3
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
crate: cargo-llvm-cov
- uses: Swatinem/rust-cache@a95ba195448af2da9b00fb742d14ffaaf3c21f43
with:
prefix-key: "0" # change this to invalidate CI cache
shared-key: "cargo_nextest-linux"
save-if: "false" # use the cache from nextest, but don’t double-save
- run: pip3 install --user -r pytest/requirements.txt
- run: cargo llvm-cov show-env | tr -d "'" >> "$GITHUB_ENV"
- run: cargo build --locked --profile dev-release -p neard --bin neard
- run: echo "CURRENT_NEARD=$PWD/target/dev-release/neard" >> "$GITHUB_ENV"
- run: cd pytest && python3 tests/sanity/backward_compatible.py
- run: cargo llvm-cov report --profile dev-release --codecov --output-path pytest-backcomp.json
- uses: codecov/codecov-action@eaaf4bedf32dbdc6b720b63067d99c4d77d6047d
with:
name: neard
path: pytest # NB: this does not account for defaults.run.working-directory
- run: echo "CURRENT_NEARD=$PWD/neard" >> "$GITHUB_ENV"
- run: chmod +x "$CURRENT_NEARD"
- run: pip3 install --user -r requirements.txt
- run: python3 tests/sanity/backward_compatible.py
token: ${{ secrets.CODECOV_TOKEN }}
files: pytest-backcomp.json
fail_ci_if_error: true
flags: pytests,backward-compatibility,linux

py_db_migration:
name: "Database Migration"
needs: build_binary
runs-on: ubuntu-22.04
defaults:
run:
working-directory: ./pytest
runs-on: ubuntu-22.04-8core
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: 3.11
cache: pip
- uses: actions/download-artifact@v3
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
name: neard
path: pytest # NB: this does not account for defaults.run.working-directory
- run: echo "CURRENT_NEARD=$PWD/neard" >> "$GITHUB_ENV"
crate: cargo-llvm-cov
- uses: Swatinem/rust-cache@a95ba195448af2da9b00fb742d14ffaaf3c21f43
with:
prefix-key: "0" # change this to invalidate CI cache
shared-key: "cargo_nextest-linux"
save-if: "false" # use the cache from nextest, but don’t double-save
- run: pip3 install --user -r pytest/requirements.txt
- run: cargo llvm-cov show-env | tr -d "'" >> "$GITHUB_ENV"
- run: cargo build --locked --profile dev-release -p neard --bin neard
- run: echo "CURRENT_NEARD=$PWD/target/dev-release/neard" >> "$GITHUB_ENV"
- run: echo "NEAR_ROOT=$PWD" >> "$GITHUB_ENV"
- run: chmod +x "$CURRENT_NEARD"
- run: pip3 install --user -r requirements.txt
- run: python3 tests/sanity/db_migration.py
- run: cd pytest && python3 tests/sanity/db_migration.py
- run: cargo llvm-cov report --profile dev-release --codecov --output-path pytest-dbmigr.json
- uses: codecov/codecov-action@eaaf4bedf32dbdc6b720b63067d99c4d77d6047d
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: pytest-dbmigr.json
fail_ci_if_error: true
flags: pytests,db-migration,linux

py_sanity_checks:
name: "Sanity Checks"
Expand All @@ -132,6 +170,9 @@ jobs:
with:
python-version: 3.11
cache: pip
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
crate: cargo-llvm-cov
- uses: Swatinem/rust-cache@a95ba195448af2da9b00fb742d14ffaaf3c21f43
with:
prefix-key: "0" # change this to invalidate CI cache
Expand All @@ -140,32 +181,51 @@ jobs:
- run: pip3 install --user -r pytest/requirements.txt
# This is the only job that uses `--features nightly` so we build this in-line instead of a
# separate job like done with the regular neard.
- run: cargo llvm-cov show-env | tr -d "'" >> "$GITHUB_ENV"
- run: cargo build --profile dev-release -p neard --bin neard --features nightly
# Note: We're not running spin_up_cluster.py for non-nightly
# because spinning up non-nightly clusters is already covered
# by other steps in the CI, e.g. upgradable.
- run: python3 pytest/tests/sanity/spin_up_cluster.py
env:
NEAR_ROOT: "target/dev-release"
- run: cargo llvm-cov report --profile dev-release --codecov --output-path pytest-sanity.json
- uses: codecov/codecov-action@eaaf4bedf32dbdc6b720b63067d99c4d77d6047d
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: pytest-sanity.json
fail_ci_if_error: true
flags: pytests,sanity-checks,linux-nightly

py_genesis_check:
name: "Genesis Changes"
needs: build_binary
runs-on: ubuntu-22.04
runs-on: ubuntu-22.04-8core
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: 3.11
cache: pip
- uses: actions/download-artifact@v3
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
name: neard
path: target/dev-release
- run: echo "CURRENT_NEARD=$PWD/target/dev-release/neard" >> "$GITHUB_ENV"
- run: chmod +x "$CURRENT_NEARD"
crate: cargo-llvm-cov
- uses: Swatinem/rust-cache@a95ba195448af2da9b00fb742d14ffaaf3c21f43
with:
prefix-key: "0" # change this to invalidate CI cache
shared-key: "cargo_nextest-linux"
save-if: "false" # use the cache from nextest, but don’t double-save
- run: pip3 install --user -r pytest/requirements.txt
- run: cargo llvm-cov show-env | tr -d "'" >> "$GITHUB_ENV"
- run: cargo build --locked --profile dev-release -p neard --bin neard
- run: echo "CURRENT_NEARD=$PWD/target/dev-release/neard" >> "$GITHUB_ENV"
- run: python3 scripts/state/update_res.py check
- run: cargo llvm-cov report --profile dev-release --codecov --output-path pytest-genesischk.json
- uses: codecov/codecov-action@eaaf4bedf32dbdc6b720b63067d99c4d77d6047d
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: pytest-genesischk.json
fail_ci_if_error: true
flags: pytests,genesis-check,linux

py_style_check:
name: "Style"
Expand All @@ -184,25 +244,33 @@ jobs:

py_upgradability:
name: "Upgradability"
needs: build_binary
runs-on: ubuntu-22.04
defaults:
run:
working-directory: ./pytest
runs-on: ubuntu-22.04-8core
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: 3.11
cache: pip
- uses: actions/download-artifact@v3
with:
name: neard
path: pytest # NB: this does not account for defaults.run.working-directory
- run: echo "CURRENT_NEARD=$PWD/neard" >> "$GITHUB_ENV"
- run: chmod +x "$CURRENT_NEARD"
- run: pip3 install --user -r requirements.txt
- run: python3 tests/sanity/upgradable.py
- uses: baptiste0928/cargo-install@21a18ba3bf4a184d1804e8b759930d3471b1c941
with:
crate: cargo-llvm-cov
- uses: Swatinem/rust-cache@a95ba195448af2da9b00fb742d14ffaaf3c21f43
with:
prefix-key: "0" # change this to invalidate CI cache
shared-key: "cargo_nextest-linux"
save-if: "false" # use the cache from nextest, but don’t double-save
- run: pip3 install --user -r pytest/requirements.txt
- run: cargo llvm-cov show-env | tr -d "'" >> "$GITHUB_ENV"
- run: cargo build --locked --profile dev-release -p neard --bin neard
- run: echo "CURRENT_NEARD=$PWD/target/dev-release/neard" >> "$GITHUB_ENV"
- run: cd pytest && python3 tests/sanity/upgradable.py
- run: cargo llvm-cov report --profile dev-release --codecov --output-path pytest-upgradability.json
- uses: codecov/codecov-action@eaaf4bedf32dbdc6b720b63067d99c4d77d6047d
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: pytest-upgradability.json
fail_ci_if_error: true
flags: pytests,upgradability,linux

rpc_error_schema:
name: "RPC Schema"
Expand Down