Quviq QuickCheck: A new automated testing tool
QuickCheck is a clever tool for testing whether software works correctly under a set of rules constraining its behavior (ie: its specification). The rules that the system accepts (written as Erlang programs) generate programs to evaluate or – as in a recent case study – messages to send to remote servers. After a fault is detected, the QuickCheck system will simplify the generated case as much as possible.
From this paper via lambda-the-ultimate.
An interesting example that the authors use is of specifying the relationship between concat (++) and reverse:
prop_reverse () ->
?FORALL({Xs,Ys},
{list(int()),list(int())},
lists:reverse(Xs++Ys) == lists:reverse(Ys) ++ lists:reverse(Xs)).
This says that for any given two lists of integers, you’ll get the same thing if you append them together and then reverse them as if you’d appended the reversal of the second to the reversal of the first. This happens to be a true statement, but if you were to mess up the specification and say (as the authors demonstrate):
prop_reverse () ->
?FORALL({Xs,Ys},
{list(int()),list(int())},
lists:reverse(Xs++Ys) == lists:reverse(Xs) ++ lists:reverse(Ys)).
The QuickCheck system would quickly find a counterexample. As the paper describes, the system might find { Xs = [-2], Ys = [2] } as a counterexample. The neat thing is that the system would then reduce this counterexample to its simplest form, to report { Xs = [0], Ys = [1] } as a failure case. This is useful because it says that the numbers themselves aren’t important, but that the assertion immediately fails when Xs and Ys are different.
The authors don’t go into any detail about how the QuickCheck evaluator itself works (perhaps because they’re going to start a company based on this product). Clearly it’s built on some kind of Prolog-style evaluator. They chose to use Erlang, which might mean that the evaluator uses some kind of random concurrent search strategy, rather than the ordered depth-first search of traditional Prolog (though they don’t claim anything like this in their section on why Erlang was better suited for implementing QuickCheck than Haskell, which was used for the first implementation). The simplification system also sounds like an interesting twist that must happen at the evaluator level. It sounds like they provide a set of primitive ‘generators’ that define a ‘minimization ordering.’ So with the int() generator: 0 < 1 < 2 < ... < n will tell the evaluator to try to backtrack to 0. The list(G) generator might be something like: [] < [min(G)] < [min(G), min(G)] < ...
The 'real life' example, which they give to demonstrate the value of this tool, is also interesting:
The last fault we found arises when a termination is added to a new context; a second termination is added, and then subtracted; a third termination is added and then subtracted; then a fourth termination is added and subtracted. On this final subtract, there is a crash in the proxy. This failing case was particularly interesting because it illustrated the power of shrinking: the original test consisted of over 160 commands, but shrinking reduced it to just these seven. It is also a test that a human tester would hardly be likely to try. The problem was caused by a failure to clean up data-structures correctly on a subtract, leading to an incorrect state after the first subtract, but survivable until the third one.
That sounds pretty promising to me.
PS: One of the authors of this system also wrote a really interesting essay a few years back called Why Functional Programming Matters. It’s a great read for anybody who wonders if this functional stuff is all a bunch of nonsense.


