|
| 1 | +Testing |
| 2 | +======= |
| 3 | + |
| 4 | +Testing is crucial for validating new features and preventing regressions. |
| 5 | +It is important that tests are written for any new features, as well as any |
| 6 | +identified bugs. |
| 7 | +We should have a clear understanding of what each test |
| 8 | +is testing, whether it currently passes and, if not, why it is currently |
| 9 | +failing. |
| 10 | +This page will describe some tools in the codebase which help with developing |
| 11 | +and maintaining test cases. |
| 12 | + |
| 13 | +Writing test cases |
| 14 | +------------------ |
| 15 | +Test cases are written in Scalatest using its |
| 16 | +[`AnyFunSuite`](https://www.scalatest.org/scaladoc/3.1.2/org/scalatest/funsuite/AnyFunSuite.html) style. |
| 17 | +See the AnyFunSuite documentation or existing test cases for syntax and examples. |
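As a rough illustration of the style, a minimal suite might look like the sketch below. The class name and the assertions are hypothetical and are not taken from the Basil codebase; only `AnyFunSuite`, `test`, `assert`, and `assertThrows` come from Scalatest.

```scala
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical suite for illustration only; not part of the Basil codebase.
class ListReverseTests extends AnyFunSuite {

  test("reversing twice gives the original list") {
    val xs = List(1, 2, 3)
    assert(xs.reverse.reverse == xs)
  }

  test("head of an empty list throws") {
    // assertThrows passes only if the body throws the given exception type.
    assertThrows[NoSuchElementException] {
      List.empty[Int].head
    }
  }
}
```

Each `test("...") { ... }` call registers one named test case, and the name is what appears in the runner output.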
| 18 | + |
| 19 | +### Exporting IR structures into test cases |
| 20 | + |
| 21 | +Often, you might have found a particular Basil IR program which demonstrates some bug in the code. |
| 22 | +It is good practice to extract this into a test case, both to validate the fix and ensure the bug doesn't reoccur. |
| 23 | +To help with this, a Basil IR program can be converted to |
| 24 | +a Scala literal by using the [ToScala](https://github.com/UQ-PAC/BASIL/blob/main/src/main/scala/ir/dsl/ToScala.scala) |
| 25 | +trait. |
| 26 | + |
| 27 | +To do this, first `import ir.dsl.given`, then you can use the `.toScala` extension method on programs, procedures, or blocks. |
| 28 | +This gives you a string which is valid Scala code. |
| 29 | +This can be copied and pasted into a unit test. |
| 30 | +When executed, the Scala code will reconstruct that Basil IR structure using the DSL. |
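A minimal sketch of this workflow is shown here. It assumes the IR program type is `ir.Program` and that the value is already in hand (for example, while debugging); the helper name `printAsScala` is hypothetical. It only compiles inside the Basil project.

```scala
import ir.Program      // assumed location of the Basil IR program type
import ir.dsl.given    // brings the .toScala extension method into scope

// Hypothetical helper: print a program as a Scala DSL literal,
// ready to be copied and pasted into a unit test.
def printAsScala(program: Program): Unit =
  println(program.toScala)
```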
| 31 | + |
| 32 | +### Tagging test suites |
| 33 | + |
| 34 | +[Tags](https://www.scalatest.org/scaladoc/3.2.1/org/scalatest/Tag.html) |
| 35 | +are used to categorise test classes based on, roughly, the kind of test (e.g., unit tests or system (end-to-end) tests). |
| 36 | +Each test suite should be tagged with one of the |
| 37 | +[`@test_util.tags.*Test`](https://github.com/UQ-PAC/BASIL/tree/main/src/test/scala/test_util/tags) tags, |
| 38 | +placed on the line before the AnyFunSuite class declaration. |
| 39 | +A test suite may, additionally, be tagged with one or more of the supplementary tags (those not ending in Test). |
| 40 | + |
| 41 | +To run only tests from a specific tag, you can use |
| 42 | +```bash |
| 43 | +./scripts/scalatest.sh -o -n test_util.tags.TagName |
| 44 | +``` |
| 45 | +Note that the tag name must be fully-qualified (i.e., including the package name). |
| 46 | +See [the Scalatest runner docs](https://www.scalatest.org/user_guide/using_the_runner) or the `scalatest.sh` file |
| 47 | +for more options. |
| 48 | + |
| 49 | +### Dynamic tests |
| 50 | + |
| 51 | +Note that the `test("test name")` method can be called anywhere within an `AnyFunSuite` body, including |
| 52 | +within loops or conditionals. |
| 53 | +This allows you to dynamically generate test cases, as in |
| 54 | +Basil's [`SystemTests`](/src/test/scala/SystemTests.scala). |
| 55 | +This should be used sparingly. |
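The pattern can be sketched as follows, with a hypothetical suite and hypothetical data; it is not how `SystemTests` itself is written. Test names must be unique within a suite, so each generated name includes its input.

```scala
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical suite for illustration only.
class SquareTests extends AnyFunSuite {
  // Hypothetical inputs paired with their expected outputs.
  val cases = Seq(0 -> 0, 3 -> 9, -4 -> 16)

  for ((input, expected) <- cases) {
    // One test case is registered per loop iteration.
    test(s"square of $input is $expected") {
      assert(input * input == expected)
    }
  }
}
```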
| 56 | + |
| 57 | + |
| 58 | +Maintaining test cases |
| 59 | +---------------------- |
| 60 | +Over time, test cases might break due to code changes or refactoring. |
| 61 | +Of course, failing tests should be fixed as soon as possible. |
| 62 | +However, this is not always possible; for example, a test case may rely on features not yet implemented. |
| 63 | +It might be reasonable to allow a test to fail for a period of time until the fixes are ready. |
| 64 | +In these cases, tests which are known or expected to fail should be clearly marked, |
| 65 | +and the reason for their failure should be recorded in the code. |
| 66 | + |
| 67 | +Generally, the strategy is that failing tests should still be executed. |
| 68 | +They should be annotated so that they are allowed to fail, but if they start passing, |
| 69 | +that should raise an error until the annotation is removed. |
| 70 | +This allows the test code to be an accurate record of the expected outcome of each test. |
| 71 | + |
| 72 | + |
| 73 | +### pendingUntilFixed (for expected failures) |
| 74 | + |
| 75 | +[`pendingUntilFixed`](https://www.scalatest.org/scaladoc/3.2.3/org/scalatest/Assertions.html#pendingUntilFixed(f:=%3EUnit)(implicitpos:org.scalactic.source.Position):org.scalatest.Assertionwithorg.scalatest.PendingStatement) should be used to mark a block of code (typically containing an assertion) as one which is currently expected to fail. This should be used to record test cases which fail due to not-yet-implemented features or known bugs. If a change is made and the code no longer fails, this will cause the test to fail until the pendingUntilFixed is removed. |
| 76 | + |
| 77 | +It should be used within a `test() { ... }` block like so, with a comment documenting the cause of failure and expected future resolution. |
| 78 | +```scala |
| 79 | +test("1 == 2 sometime soon?") { |
| 80 | + assert(1 == 1, "obviously") |
| 81 | + |
| 82 | + // broken until we fix maths |
| 83 | + pendingUntilFixed { |
| 84 | + assert(1 == 2, "todo") |
| 85 | + } |
| 86 | +} |
| 87 | +``` |
| 88 | + |
| 89 | +- Tests can be marked as ignored by replacing `test()` with `ignore()`. An entire suite can be marked as ignored with the `@org.scalatest.Ignore` annotation. |
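For illustration, ignoring a single test might look like the sketch below; the suite and test names are hypothetical. Unlike `pendingUntilFixed`, an ignored test body is never executed, so it gives no signal if the underlying problem is fixed.

```scala
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical suite for illustration only.
class IgnoreExampleTests extends AnyFunSuite {

  // Registered but never executed; the runner reports it as ignored.
  ignore("disabled until the parser is fixed") {
    assert(1 == 2)
  }

  test("still runs") {
    assert(1 + 1 == 2)
  }
}
```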
| 90 | + |
| 91 | +### TestCustomisation (for dynamically-generated tests) |
| 92 | + |
| 93 | +For some tests, particularly those which are dynamically-generated by a loop, it is not practical to add a `pendingUntilFixed` block |
| 94 | +into the test body. |
| 95 | +For these cases, there is a `trait TestCustomisation` to help with customising dynamically-generated tests |
| 96 | +based on their test case name (for system tests, this includes the file path and compiler/lifter options). |
| 97 | + |
| 98 | +To use this, the test suite class should be made to extend [TestCustomisation](/src/test/scala/test_util/TestCustomisation.scala). |
| 99 | +This defines an abstract method customiseTestsByName which controls the mode of each test case. |
| 100 | +```scala |
| 101 | +@test_util.tags.UnitTest |
| 102 | +class ProcedureSummaryTests extends AnyFunSuite, TestCustomisation { |
| 103 | + |
| 104 | + override def customiseTestsByName(name: String) = { |
| 105 | + name match { |
| 106 | + case "test a" => Mode.NotImplemented("doesn't seem to work yet") |
| 107 | + case _ => Mode.Normal |
| 108 | + } |
| 109 | + } |
| 110 | + |
| 111 | + test("test a") { |
| 112 | + assert(false) |
| 113 | + } |
| 114 | + |
| 115 | + test("test b") { |
| 116 | + assert(true) |
| 117 | + } |
| 118 | +} |
| 119 | +``` |
| 120 | +Test cases can be marked as retry, disabled, not implemented, or temporary failure |
| 121 | +(see TestCustomisation source file for more details). |
| 122 | +This modifies the behaviour of the test case and prints helpful output when running the test. For example: |
| 123 | +```c |
| 124 | +- correct/malloc_with_local3/clang:BAP (pending) |
| 125 | + + NOTE: Test case is customised with: "ExpectFailure(previous failure was: Expected verification success, but got failure. Failing assertion is: assert (load37_1 == R30_in))" |
| 126 | + |
| 127 | + + Failing assertion ./src/test/correct/malloc_with_local3/clang/malloc_with_local3_bap.bpl:264 |
| 128 | + + 261 | load36_1, Gamma_load36_1 := memory_load64_le(mem_9, bvadd64(R31_in, 18446744073709551600bv64)), (gamma_load64(Gamma_mem_9, bvadd64(R31_in, 18446744073709551600bv64)) || L(bvadd64(R31_in, 18446744073709551600bv64))); |
| 129 | + 262 | call rely(); |
| 130 | + 263 | load37_1, Gamma_load37_1 := memory_load64_le(mem_10, bvadd64(R31_in, 18446744073709551608bv64)), (gamma_load64(Gamma_mem_10, bvadd64(R31_in, 18446744073709551608bv64)) || L(bvadd64(R31_in, 18446744073709551608bv64))); |
| 131 | + > 264 | assert (load37_1 == R30_in); //is returning to caller-set R30 |
| 132 | + 265 | goto printCharValue_2260_basil_return; |
| 133 | + 266 | printCharValue_2260_basil_return: |
| 134 | + 267 | assume {:captureState "printCharValue_2260_basil_return"} true; |
| 135 | +``` |
| 136 | + |
| 137 | +```c |
| 138 | +- analysis_differential:malloc_with_local/gcc_O2:GTIRB (pending) |
| 139 | + + NOTE: Test case is customised with: "ExpectFailure(needs printf_chk)" |
| 140 | + |
| 141 | + + STDOUT: "" |
| 142 | +``` |
| 143 | + |
| 144 | +Modes which expect failure (temporary failure and not implemented) will show as "pending" when |
| 145 | +executing scalatest. |
| 146 | +The disabled mode will show as "cancelled". |
| 147 | +Both of these will be output in yellow text if your console is using colour. |
| 148 | + |