Enabled code coverage measurement#280

Draft

jxyang wants to merge 2 commits intooe_portfrom

Contributor

jxyang commented May 20, 2020

This enables code coverage with a build config, namely:

make CODE_COVERAGE=true

Once built sgx-lkl this way, the script .azure-pipelines/scripts/measure_code_cov.sh is used to run test cases under tests and the final accumulated code coverage data is posted to sgx-lkl/total_cov.info.


          Enabled code coverage measurement

dc0d890

mingweishih approved these changes

View reviewed changes

mingweishih left a comment

LGTM with just comments on the hard-coded paths.
Since this PR would take dependency on the libgcc/libgcov and the Makefiles of sgx-musl, I assume those will be taken care of before this PR goes in.

Also, the lcov will be the prerequisite of the measurement. CI/CD should handle that as well.

The follow-up will be converting the lcov report to other form so that the devops pipeline can consume. One suggestion would be cobertura, and the conversion can be done with the python tool.

.azure-pipelines/scripts/measure_one_cov.sh Outdated

+                  # Gather all necessary files for lcov
+                  cp -r $SGXLKL_ROOT/src/* $SGXLKL_ROOT/cov
+                  sudo cp -r img/home/xuejun/sgx-lkl/build_musl $SGXLKL_ROOT/cov

mingweishih May 20, 2020

I think the hard-coded path needs to be changed.

.azure-pipelines/scripts/measure_one_cov.sh Outdated

+                  # Gather all necessary files for lcov
+                  cp -r $SGXLKL_ROOT/src/* $SGXLKL_ROOT/cov
+                  sudo cp -r img/home/xuejun/sgx-lkl/build_musl $SGXLKL_ROOT/cov
+                  sudo cp -r img/home/xuejun/sgx-lkl/sgx-lkl-musl $SGXLKL_ROOT/cov

mingweishih May 20, 2020

Same here.

hukoyu approved these changes

View reviewed changes

Collaborator

hukoyu left a comment

A few comments made.

.azure-pipelines/scripts/measure_one_cov.sh Outdated

+              # Get the timeout from the test module
+              DEFAULT_TIMEOUT=300
+              timeout=$(make gettimeout 2> /dev/null)
+              [[ $? != 0 ]] && timeout=$DEFAULT_TIMEOUT

Collaborator

hukoyu May 20, 2020

-ne will be better for number comparison

.azure-pipelines/scripts/measure_one_cov.sh Outdated

+              timeout --kill-after=$(($timeout + 15))  $timeout make run-hw
+              make_exit=$?
+              if [ $make_exit != 0 ]; then

Collaborator

hukoyu May 20, 2020

-ne

.azure-pipelines/scripts/measure_one_cov.sh Outdated

+              timeout --kill-after=$(($timeout + 15))  $timeout make run-sw
+              make_exit=$?
+              if [ $make_exit != 0 ]; then

Collaborator

hukoyu May 20, 2020

-ne

config.mak Show resolved Hide resolved

letmaik reviewed

View reviewed changes

.azure-pipelines/scripts/measure_code_cov.sh

		@@ -0,0 +1,31 @@
		#!/bin/bash

Contributor

letmaik May 24, 2020

I don't think it makes sense that there are new scripts which duplicate the non-coverage test logic. Why not extend the existing scripts and add an extra environment variable to enable it?

Contributor Author

jxyang May 26, 2020 •

edited

Loading

There are key differences between this script and the test runner script.

This script is conditioned on the fact that sgx-lkl was built with CODE_COVERAGE=1
We ideally want to include samples in the code coverage measurement
For each test/sample, we run run-hw and run-sw back to back, and extract the collective coverage data from the image.
The objective is to measure code coverage. So the coverage data for failed test are included in the final aggregated code coverage metrics.

If you compare this script and test_runner, also measure_one_cov with run_test, there aren't many duplicated code. I prefer to keep them separate for easier maintenance.

Contributor

letmaik May 27, 2020

Thanks for explaining. I'm trying to think how this fits into CI. Given that debug/release build modes have slightly different code paths it may make sense to capture code coverage in both modes and merge it. Or we just do it for one mode if the differences are not worth measuring. Either way, it would mean adding at least one new build job with CODE_COVERAGE=1 and corresponding test/coverage job(s).
You're saying that your script runs both hw and sw. This is a bit unfortunate because it prevents running sw on non-SGX machines and in parallel. Would it be possible to run them separately, dump the coverage files and merge them in a follow-up step?

Contributor Author

jxyang May 29, 2020

It's hard to merge coverage data from two VMs. It requires a global vm for aggregation. We will leave the work of integrating into pipeline in another PR.

Contributor

letmaik May 29, 2020

That's actually not that hard to do, you can specify dependencies between jobs and use build artifacts to upload/download job data.


          Address reiew comments and update commit sha to sgx-lkl-musl

2f157ae

jxyang marked this pull request as draft

August 4, 2020 00:46

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet