richard-r edited this page Feb 5, 2015 · 31 revisions

In this tutorial, we will use qspec to test a piece of code that calculates volume curves, and then rely on those tests to keep the code valid as we change and refactor it to support new requirements. Familiarity with q is assumed, and the individual qspec functions are not covered in depth; please refer to the reference pages for further details.

Basics

To start with, let’s take a look at the code currently being used for calculating the volume curves.


curve:{[syms;start;end];
  v:select vol: sum size by sym, date, time.minute from trade where date within `date$(start;end), sym in syms, time within `time$(start;end);
  tv: exec sum vol by sym from v;
  numDates:exec count distinct date from v;
  `sym`minute xasc select avgBucket: sum[vol]%numDates, pctDaily:sum[vol]%tv[first sym] by sym,minute from v
  }

This is pretty simple and might represent what we would have after a very preliminary pass to get something up and working. The results it produces look like this:


q)curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00]                                                                                                         
sym minute| avgBucket pctDaily    
----------| ----------------------
IBM 09:30 | 1980      0.00466508  
IBM 09:31 | 2316.667  0.005458301 
IBM 09:32 | 2680      0.006314351 
IBM 09:33 | 2390      0.005631082 
IBM 09:34 | 1046.667  0.002466053 
IBM 09:35 | 330       0.0007775134
IBM 09:36 | 2296.667  0.005411179 
IBM 09:37 | 1000      0.002356101 
IBM 09:38 | 403.3333  0.0009502941
IBM 09:39 | 723.3333  0.001704246 
IBM 09:40 | 606.6667  0.001429368 
IBM 09:41 | 310       0.0007303913
IBM 09:42 | 1270      0.002992248 
IBM 09:43 | 2240      0.005277667 
IBM 09:44 | 650       0.001531466 
IBM 09:45 | 1033.333  0.002434638 
IBM 09:46 | 430       0.001013123 
IBM 09:47 | 2400      0.005654643 
IBM 09:48 | 1003.333  0.002363955 
IBM 09:49 | 946.6667  0.002230442

Now, ideally, you would be testing against reasonably realistic data. Since I don’t have a contract with Reuters to republish market data, nor any generously donated sample data sets, I’ll make do with some random data that I cooked up.

If you would like to follow along exactly, here is the code I’m using to make the random data and the commands used to generate it:


gen:{[syms;prices;nums;start;days];
  raze (enlist each flip `sym`date!flip syms cross start + til days) cross' {[num;price]
    invAbs:{(count[x]?1 -1)*x};
    n:`int$num + first invAbs 1?(1?1f)*num; / Adjust num to generate within 0-100% of base num up or down
    freqs:0f, sums .8 .1 .05 .01 .01 .01 .016 .001 .001 .001;
    sizes:100 200 300 400 500 1000 10000 20000 30000 40000 50000;
    ([]time:asc 09:30t + n?16t - 09:30t;price:price + invAbs n?first 1?.1;size:sizes freqs bin n?1f)
    } .' raze days#'enlist each nums,'prices
  }


q ../gen.q -S 200
KDB+ 2.6 2009.12.04 Copyright (C) 1993-2009 Kx Systems
q)`:trade set gen[`IBM`MSFT`AAPL;100 30 200f;1000 2000 10000;2009.11.01;30]
`:trade

Okay, so now that there is some data and some code, we have something that we can test. Let’s start with some pretty basic properties that the code should have. First, the column indicating the percentage of the daily volume should probably add up to 1 for each symbol. Next, if we were to calculate average daily volume and then sum the average bucket column, we should have matching values. Finally, dividing the average bucket values by the average daily volume should yield values equal to the percentage of the daily volume. We will begin by expressing these as @expectations@ inside of a @specification@:


.tst.desc["Volume curves"]{
  should["have percentage values that add to 1"]{};
  should["have the sum of average bucket volumes be equal to the average daily volume"]{};
  should["have the percentage of daily volume match the average bucket column divided by ADV"]{};
  };

Note that we haven’t written any actual tests yet; we have just jotted down what we’re going to test and roughly what behaviors we expect the code to conform to. Even this is enough to get some output from the testing tool, however:


testq test_curve-vA.q --desc
Volume curves::
- should have percentage values that add to 1
- should have the sum of average bucket volumes be equal to the average daily volume
- should have the percentage of daily volume match the average bucket column divided by ADV
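
Before filling in the expectations, it can help to check the three properties by hand on a tiny in-memory table. The following is only a sketch: the table v and its volumes are made up for illustration, and the calculation mirrors the shape of curve for a single symbol rather than calling it.

```q
/ Hypothetical per-minute volumes for one symbol over two dates
v:([]date:2009.11.02 2009.11.02 2009.11.03 2009.11.03;minute:09:30 09:31 09:30 09:31;vol:100 300 200 400)

numDates:exec count distinct date from v
tv:exec sum vol from v
c:select avgBucket:sum[vol]%numDates,pctDaily:sum[vol]%tv by minute from v

/ Average daily volume, computed independently of the curve
adv:exec avg vol from select vol:sum vol by date from v

ok1:1f=exec sum pctDaily from c                            / percentages add to 1
ok2:adv=exec sum avgBucket from c                          / bucket averages add to ADV
ok3:all (exec pctDaily from c)=(exec avgBucket from c)%adv / pctDaily is avgBucket%ADV
```

With these numbers the two buckets average 150 and 350 shares, the ADV is 500, and the percentages are 0.3 and 0.7, so all three flags are true.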

Now, let’s move on to defining one of the tests.


.tst.desc["Volume curves"]{
  should["have percentage values that add to 1"]{
    fixture `trade;
    c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
    (value exec first sum pctDaily by sym from c) musteq 1f;
    c:curve[`IBM`MSFT;2009.11.01T12:00;2009.11.15T15:00];
    (value exec first sum pctDaily by sym from c) musteq 1f;
    };
  should["have the sum of average bucket volumes be equal to the average daily volume"]{};
  should["have the percentage of daily volume match the average bucket column divided by ADV"]{};
  };

We’ve established some simple tests that check that the outputs of the curve function match our expectations. These are pretty much happy-path scenarios, but they’ll be good enough for now. Also note that we have used the @fixture@ function to load the test data that we generated earlier.

Running this test yields the following:


testq curve-v1.q test_curve-vB.q
...

For 1 specifications, 3 expectations were run.
3 passed, 0 failed. 0 errors.


Currently, empty expectations are reported as having passed. For the time being, please take care not to leave a test empty before committing. I apologize for the inconvenience.

Moving on, we’ll fill in the last two tests:


.tst.desc["Volume curves"]{
  before{
    fixture `trade;
    };
  should["have percentage values that add to 1"]{
    c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
    (value exec first sum pctDaily by sym from c) musteq 1f;
    c:curve[`IBM`MSFT;2009.11.01T12:00;2009.11.15T15:00];
    (value exec first sum pctDaily by sym from c) musteq 1f;
    };
  alt{
    before{
      fixture `trade;
      `adv mock exec avg vol by sym from select vol:sum size by sym, date from trade where date within 2009.11.01 2009.11.30,time within 09:30 16:00;
      };
    should["have the sum of average bucket volumes be equal to the average daily volume"]{
      c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
      (exec first sum avgBucket by sym from c)[`MSFT`IBM] musteq adv[`MSFT`IBM];
      };
    should["have the percentage of daily volume match the average bucket column divided by ADV"]{
      c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
      (exec avgBucket%adv[first sym] by sym from c)[key pd] mustmatch pd[key pd:exec pctDaily by sym from c;];
      };
    };
  };

With these tests, we’ve added an @alt@ block, which lets us replace the before function without disturbing the existing before block. In this case, we’ve used the alternate before block to create a dictionary holding the ADV over the period we are running the tests on. These additional tests should give us a reasonable level of confidence that the volume curves are being calculated correctly.

Refactoring

Change 1

Now that the curves have been in use for a while, a new request comes in asking to be able to run the curve for each symbol on a different time period.

The first thing to do is to create a new test for this scenario. This should let us know what breaks when we use our code in an unexpected way.


.tst.desc["Volume curves"]{
  before{
    fixture `trade;
    };
  ...
  should["be able to retrieve curves across different time periods"]{
    mustnotthrow[();(`curve;`IBM`MSFT;2009.11.01T09:30 2009.11.01T09:35;2009.11.30T16:00 2009.11.30T15:55)];
    };
  };

(I have omitted code here that was previously covered, and will do so for the remainder of the tutorial.)

And running it:


testq curve-v1.q test_curve-vD.q
...F

Volume curves::
- should be able to retrieve curves across different time periods:
Failure: Expected ‘(`curve;`IBM`MSFT;2009.11.01T09:30:00.000 2009.11.01T09:35:00.000;2009.11.30T16:00:00.000 2009.11.30T15:55:00.000)’ to not throw an error. Error thrown: ‘length’
1 assertion was run.
Before code:
{fixture `trade;
}
Test code:
{mustnotthrow[();(`curve;`IBM`MSFT;2009.11.01T09:30 2009.11.01T09:35;2009.11.30T16:00 2009.11.30T15:55)];
}

For 1 specifications, 4 expectations were run.
3 passed, 1 failed. 0 errors.


It looks like this fails. Investigating the source code quickly shows that we are applying within to very long vectors (the date and time columns) against short vectors of bounds (the start and end arguments). What is needed is to expand the right-hand vectors to match the length of the left-hand vector. We will accomplish this with a keyed table and retry.


curve:{[syms;start;end];
  kt:([sym:syms];start;end);
  v:select vol: sum size by sym, date, time.minute from trade where date within `date$kt[([]sym)]`start`end, sym in syms, time within `time$kt[([]sym)]`start`end;
  ...
  }


testq curve-v2.q test_curve-vD.q
EEE.

Volume curves::
- should have percentage values that add to 1:
Error: testError 'type
0 assertions were run.
Before code: 
{fixture `trade;
    }
Test code: 
{c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
    (value exec first sum pctDaily by sym from c) musteq 1f;
    c:curve[`IBM`MSFT;2009.11.01T12:00;2009.11.15T15:00];
    (value exec first sum pctDaily by sym from c) musteq 1f;
    }

- should have the sum of average bucket volumes be equal to the average daily volume:
Error: testError 'type
0 assertions were run.
Before code: 
{fixture `trade;
      `adv mock exec avg vol by sym from select vol:sum size by sym, date from trade where date within 2009.11.01 2009.11.30,time within 09:30 16:00;
      }
Test code: 
{c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
      (exec first sum avgBucket by sym from c)[`MSFT`IBM] musteq adv[`MSFT`IBM];
      }

- should have the percentage of daily volume match the average bucket column divided by ADV:
Error: testError 'type
0 assertions were run.
Before code: 
{fixture `trade;
      `adv mock exec avg vol by sym from select vol:sum size by sym, date from trade where date within 2009.11.01 2009.11.30,time within 09:30 16:00;
      }
Test code: 
{c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
      (exec avgBucket%adv[first sym] by sym from c)[key pd] mustmatch pd[key pd:exec pctDaily by sym from c;];
      }


For 1 specifications, 4 expectations were run.
1 passed, 0 failed.  3 errors.

Oops. Looks like we had a regression. This one actually took me a few minutes to figure out while writing this. In short, when creating a keyed table literal, atoms in the non-key columns are not expanded to lists if every non-key column is an atom. For example, you should see the following:


KDB+ 2.6 2009.12.04 Copyright (C) 1993-2009 Kx Systems
m32/ 2()core 4096MB danno monad.local 127.0.0.1 PLAY 2010.03.04 
q)([a:`a`b];c:1;d:3)                                                            
'type
q)([a:`a`b];c:2#1;d:3)                                                          
a| c d
-| ---
a| 1 3
b| 1 3

Let’s fix that and try again.


curve:{[syms;start;end];
  kt:([sym:syms];count[syms]#start;end);
  ...
  }


testq curve-v3.q test_curve-vD.q 
....

For 1 specifications, 4 expectations were run.
4 passed, 0 failed.  0 errors.
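
The length mismatch and the keyed-table lookup that resolves it can also be reproduced in isolation. This is a minimal sketch on a made-up three-row table; the names t, lo, hi, kt and b are illustrative and do not appear in the curve code.

```q
/ Hypothetical toy data, not the tutorial's trade table
t:([]sym:`IBM`MSFT`IBM;date:2009.11.01 2009.11.02 2009.11.03)
lo:2009.11.01 2009.11.02   / one lower bound per requested symbol
hi:2009.11.02 2009.11.30   / one upper bound per requested symbol

/ Two bounds against three rows do not conform, so within signals an error.
/ Protected evaluation captures the error message as a string in r.
r:@[{x within (lo;hi)};t`date;{x}]

/ Looking up each row's bounds by symbol yields three lower and three upper bounds
kt:([sym:`IBM`MSFT];lo;hi)
b:kt[([]sym:t`sym)]`lo`hi
ok:t[`date] within b       / 110b: the third row falls outside IBM's range
```

The lookup through kt is what lets each row be compared against the bounds for its own symbol, which is the behaviour the new requirement asks for.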

Change 2

After finishing this, it turns out that we have interpreted the requirements incorrectly; what is actually desired is to vary the time and date for each symbol in the request. Again, the first thing to do is to add a test for this scenario.


.tst.desc["Volume curves"]{
  before{
    fixture `trade;
    };
  ...
  should["be able to retrieve curves across different date/time periods"]{
    mustnotthrow[();(`curve;`IBM`MSFT;2009.11.02T09:30 2009.11.01T09:35;2009.11.28T16:00 2009.11.25T15:55)];
    };
  };

And run.


.....

For 1 specifications, 5 expectations were run.
5 passed, 0 failed.  0 errors.

Ah, excellent. The code was already suitable for this variation.

Change 3

Again, we’ll imagine that the requirements have changed in more ways. This time, there are two changes requested: the user wishes to have the total time and total number of days that the curve was calculated over returned with the curve, and they would like to be able to generate multiple curves for the same symbol (e.g. MSFT over two separate date ranges).

As an aside, I’d like to mention that it is not strictly necessary to always write your tests before you write your code. Within this tutorial, I have been creating the tests beforehand simply for consistency. In practice, I sometimes find it easier to experiment with code and knock together something basically functional before writing the tests. Use your discretion to decide when you should start adding tests, but keep in mind that it’s much harder to get decent test coverage from scratch than it is to grow your tests along with your code.


.tst.desc["Volume curves"]{
  ...
  should["be able to retrieve two curves for the same symbol across different date/time periods"]{
    c:curve[`IBM`IBM;2009.11.02T09:30 2009.11.01T09:35;2009.11.28T16:00 2009.11.25T15:55];
    count[c] musteq 2;
    };
  should["return the time span the curve was calculated over"]{
    c:curve[`IBM`MSFT;ts:2009.11.01T09:30;te:2009.11.30T16:00];
    (exec span from c) musteq `time$`datetime$te - ts;
    };
  should["return the date range that the curve was calculated over"]{
    c:curve[`IBM`MSFT;ts:2009.11.01T09:30;te:2009.11.30T16:00];
    (exec range from c) musteq `date$`datetime$te - ts;
    };
  };

Writing the tests for these two new features makes it apparent that the format of our result set needs to change and, as expected, the tests fail.


.....FEE

Volume curves::
- should be able to retrieve two curves for the same symbol across different date/time periods:
Failure: Expected 390 to be equal to 2
...
- should return the time span the curve was calculated over:
Error: testError 'span
...
- should return the date range that the curve was calculated over:
Error: testError 'range
...

For 1 specifications, 8 expectations were run.
5 passed, 1 failed.  2 errors.

So we’ll now go and update the code to make the new tests pass. Before we run the tests, though, let’s change the format of the trade fixture from a single file to a partitioned database:


mv trade tradef
mkdir trade
q tradef
q)\cd trade
q){t:` sv (`:.;`$string y;`trade;`);.[t;();:;.Q.en[`:.] select from x where date = y];@[t;`sym;`p#]}[trade] each exec distinct date from trade
`:./2009.11.01/trade/`:./2009.11.02/trade/`:./2009.11.03/trade/`:./2009.11.04..


curve:{[syms;start;end];
  ot:([]sym:syms;count[syms]#start;end);
  datespans:{(`time$(x;y)) +/: (`date$x) + til 1 + (`date$y) - `date$x};
  rt:(flip `sym`start`end!flip raze syms cross' datespans'[count[syms]#start;end]);
  dates:exec distinct `date$start from rt;
  syms:exec distinct sym by `date$start from rt;
  times:{$[1 = count x;first x;flip x]} each boundaries each exec `time$flip (start;end) by sym, date:`date$start from rt;
  v:select vol: sum size by sym, date, time.minute from trade where date in dates, sym in syms[first date], any each time within' times[([]sym;date)];
  ot,'individualCurve[v]'[ot`sym;ot`start;ot`end]
  }

individualCurve:{[v;symbol;s;e];
  v:select from v where date within `date$(s;e), sym = symbol, minute within `minute$(s;e);
  tv: exec sum vol from v;
  numDates:exec count distinct date from v;
  `range`span`payload!(numDates;(`time$e) - (`time$s);() xkey select avgBucket: sum[vol]%numDates, pctDaily:sum[vol]%tv by minute from v)
  }

boundaries:{
  {
    $[any last[x] within y; 
      (-1 _ x),enlist (min (last[x] 0;y 0);max (last[x] 1;y 1));
      x,enlist y
      ]
    }/[enlist x 0;1 _ x:asc x]
  }

A fair bit needed to be added this time, including a helper function that makes it possible to retrieve just the trades required for the curves. What I’ve decided to do here is skip any records that don’t fall within one of the requested curves, while still performing only one access to the on-disk database. It would have been a little simpler to retrieve every record that fell inside the overall time bounds of the inputs, but that has some very bad degenerate cases: if a narrow band is requested over a large number of days and a wide band over a small number of days, lots and lots of unneeded data would be pulled in.
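
To put rough numbers on that degenerate case, here is a back-of-the-envelope sketch. The figures are hypothetical: one request for a 5-minute band over 30 days, and one for a full 390-minute session on a single day.

```q
/ Minute-buckets the two requests actually need: minutes per day * days, summed
needed:(5*30)+390*1   / 540

/ Minute-buckets pulled when selecting the bounding box of both requests
box:390*30            / 11700
```

Under these assumptions the simpler approach would read more than twenty times as much data as the curves require.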

Note that since the format of the output also changed this time, some of the other tests will likely break:


testq curve-v4.q test_curve-vF.q
EEE.....

Volume curves::
- should have percentage values that add to 1:
Error: testError 'pctDaily
...

- should have the sum of average bucket volumes be equal to the average daily volume:
Error: testError 'avgBucket
...

- should have the percentage of daily volume match the average bucket column divided by ADV:
Error: testError 'pctDaily
...

For 1 specifications, 8 expectations were run.
5 passed, 0 failed.  3 errors.

We’ll need to go back in and edit our earlier tests to match the different output format.


.tst.desc["Volume curves"]{
  before{
    fixture `trade;
    };
  should["have percentage values that add to 1"]{
    c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
    (exec {sum x`pctDaily} each payload from c) musteq 1f;
    c:curve[`IBM`MSFT;2009.11.01T12:00;2009.11.15T15:00];
    (exec {sum x`pctDaily} each payload from c) musteq 1f;
    };
  alt{
    before{
      fixture `trade;
      `adv mock exec avg vol by sym from select vol:sum size by sym, date from trade where date within 2009.11.01 2009.11.30,time within 09:30 16:00;
      };
    should["have the sum of average bucket volumes be equal to the average daily volume"]{
      c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
      (exec sum first[payload]`avgBucket by sym from c)[`MSFT`IBM] musteq adv[`MSFT`IBM];
      };
    should["have the percentage of daily volume match the average bucket column divided by ADV"]{
      c:curve[`IBM`MSFT;2009.11.01T09:30;2009.11.30T16:00];
      (exec (first[payload]`avgBucket)%adv[first sym] by sym from c)[key pd] mustmatch pd[key pd:exec first[payload]`pctDaily by sym from c;];
      };
    };
  ...
  };

So, not much had to change with the tests, just a few tweaks to get the data out of the new format. And it passes:


testq curve-v4.q test_curve-vG.q 
........

For 1 specifications, 8 expectations were run.
8 passed, 0 failed.  0 errors.

Finally, let’s add a test or two for that helper function since it turned out to be pretty important and we didn’t think of it as part of the design beforehand while coming up with the new test cases.


...
.tst.desc["A Time Boundary calculator"]{
  should["find the minimal non-intersecting boundaries in a list of pairs"]{
    boundaries[(2 4;1 3;5 6)] mustmatch (1 4;5 6);
    boundaries[(3 4t;3 9t;2 6t)] mustmatch enlist 2 9t;
    boundaries[(1 4t;5 9t;12 16t)] mustmatch (1 4t;5 9t;12 16t);
    };
  };

Since the description of this behaviour doesn’t fit well within the previous specification, I’ve given it its own.


testq curve-v4.q test_curve-vH.q
........F

A Time Boundary calculator::
- should find the minimal non-intersecting boundaries in a list of pairs:
Failure: Expected (02:00:00.000 06:00:00.000;03:00:00.000 09:00:00.000) to match ,02:00:00.000 09:00:00.000
3 assertions were run.
Test code:
{boundaries[(2 4;1 3;5 6)] mustmatch (1 4;5 6);
  boundaries[(3 4t;3 9t;2 6t)] mustmatch enlist 2 9t;
  boundaries[reverse (1 4t;5 9t;12 16t)] mustmatch (1 4t;5 9t;12 16t);
  }


For 2 specifications, 9 expectations were run.
8 passed, 1 failed.  0 errors.

Whoops! Guess I’ve made a logic error in the boundaries function. Taking a look at it, it’s pretty clear that the conditional test line is the culprit, causing 3 4t to be appended rather than integrated with the existing boundary. The fix is simple:


...
boundaries:{
  {
    $[any last[x] within y; 
      (-1 _ x),enlist (min (last[x] 0;y 0);max (last[x] 1;y 1));
      all y within last x;
      x;
      x,enlist y
      ]
    }/[enlist x 0;1 _ x:asc x]
  }

And now the tests pass.


testq curve-v5.q test_curve-vH.q 
.........

For 2 specifications, 9 expectations were run.
9 passed, 0 failed.  0 errors.
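
To see why the extra branch is needed, the two versions of the per-step merge can be pulled out as named functions and run over the failing input. This is a sketch only: stepOld and stepNew are names introduced here, and plain integers stand in for the times used in the test.

```q
/ The original per-step merge from boundaries
stepOld:{$[any last[x] within y;
  (-1 _ x),enlist (min (last[x] 0;y 0);max (last[x] 1;y 1));
  x,enlist y]}

/ The corrected merge, with the extra branch for a fully contained interval
stepNew:{$[any last[x] within y;
  (-1 _ x),enlist (min (last[x] 0;y 0);max (last[x] 1;y 1));
  all y within last x;
  x;
  x,enlist y]}

iv:asc (3 4;3 9;2 6)                / sorted: (2 6;3 4;3 9)
rOld:stepOld/[enlist iv 0;1 _ iv]   / (2 6;3 9)
rNew:stepNew/[enlist iv 0;1 _ iv]   / enlist 2 9
```

In the old version, neither endpoint of 2 6 lies within 3 4, so 3 4 is appended as a new boundary and 3 9 is then merged with it instead of with 2 6. In the new version, 3 4 is recognised as contained in 2 6 and dropped, after which 3 9 extends 2 6 to 2 9. Because the input is sorted by start, an overlapping interval either ends inside the last boundary or extends past its end, so these two branches appear to cover every overlap.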

Conclusion

We’ll stop here. You’ve seen how to get some basic tests written with previously untested business logic, how to iterate with tests across a few specification changes, and how to use mock data with your tests.

I hope you’ve noticed the following things throughout the tutorial. First, we usually try to test only the interface of the code in question: what goes in and what comes out. This is almost always the best way to make sure that your tests are robust to change. It’s not always possible, and when making system calls or interfacing with existing APIs it’s almost surely impossible, so don’t worry too much when you violate that maxim. Second, good unit testing is an incomplete, iterative process; it won’t prevent bugs that you don’t think to look for and it can only test as far as the limits of your code. Third, if you are, in fact, a perfect programmer, you don’t need unit tests. At least some of the mistakes that fall out through testing in this tutorial are mistakes I actually made while writing the volume curve code. I didn’t include all of my mistakes or you’d never finish reading it. Since I am not a perfect programmer, I try to guard my deficiencies with tests.

All the versions of the code files are available here if you’d like to play with them as well.