Experiences from and hints for testing

Compare behaviour of tests
Experience shows that the setup around Test_Some.thy is most convenient. The marker ~ ~ ~ fun xxx, args ... (or ~ ~ ~ fun xxx , args ...) efficiently allows replacing xxx with the function's identifier and copying & pasting to the Find tool (with hypersearch set to /src/Tools/isac).
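On the command line, the same lookup can be sketched with grep; the directory and the function identifier below are invented for illustration only:

```shell
# simulate a tiny slice of the isac source tree (paths/names are placeholders)
mkdir -p /tmp/isac-demo/src/Tools/isac
printf 'fun rewrite_ thy rls t = t\n' > /tmp/isac-demo/src/Tools/isac/rewrite.sml

# search for a function definition, as the Find tool with hypersearch
# on /src/Tools/isac would
grep -rn 'fun rewrite_' /tmp/isac-demo/src/Tools/isac
```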

When comparing, it is most efficient to separate the two versions of the ad-hoc test code into two files which are kept as similar as possible. Comparing can then be done with meld. There are two typical cases for comparing:


 * 1) Compare between similar data: a typical case arose around the introduction of the constant AA in Partial_Fractions.thy. In such a case copy test code from Test_Some.thy to Test_Some_meld.thy and use meld to manage the different terms input to rewriting.
 * 2) Compare between different change sets: such cases occur, for instance, when the introduction of a new feature breaks old tests and the reason for the breakage is not evident. In such a case copy Test_Some.thy to another Test_Some.thy running on another change set.
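A minimal command-line sketch of case 1); the file contents and paths are invented, and non-interactive diff stands in for meld:

```shell
mkdir -p /tmp/isac-tmp
# two versions of the ad-hoc test code, kept as similar as possible
printf 'val t = parse_test ctxt "x + a";\n'  > /tmp/isac-tmp/Test_Some.thy
printf 'val t = parse_test ctxt "x + AA";\n' > /tmp/isac-tmp/Test_Some_meld.thy

# meld /tmp/isac-tmp/Test_Some.thy /tmp/isac-tmp/Test_Some_meld.thy  # interactive
diff /tmp/isac-tmp/Test_Some.thy /tmp/isac-tmp/Test_Some_meld.thy || true
```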

Notes from experience:

 * Don't spoil the repository with ad-hoc code for testing; use tmp/Test_Some.thy and tmp/Test_Some_meld.thy.
 * When working on different change sets (in different repositories) use tmp/Test_Some.thy and tmp/Test_Some_REP.thy.
 * If testing involves several changesets worth being committed, then add a date: tmp/yymmdd-Test_Some.thy, tmp/yymmdd-Test_Some_meld.thy, tmp/yymmdd-Test_Some_REP.thy. File names of this kind are sorted by Linux as required.
 * For locating differences in long calculations (best investigated by "me"), show_pt_tac pt is useful. The relation between the respective output and the sequence of me-steps is best maintained by positions, and sometimes by the nxt-step:

    (*[7, 4], Met*)    val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
    (*[7, 4, 1], Frm*) val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
    (*[7, 4, 1], Res*) val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
    (*[7, 4, 2], Res*) val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
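The date-stamped file names mentioned above can be produced like this (a sketch; the demo directory is invented, tmp/ is the convention from the list):

```shell
mkdir -p /tmp/demo-tmp
touch /tmp/demo-tmp/Test_Some.thy

stamp=$(date +%y%m%d)                       # yymmdd, e.g. 240131
cp /tmp/demo-tmp/Test_Some.thy "/tmp/demo-tmp/${stamp}-Test_Some.thy"

ls /tmp/demo-tmp | sort                     # lexicographic order = chronological order
```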


 * Locating differences in long calculations might also be supported by searching for the pattern ) = me nxt (e.g. with the Find tool).

 * Insertion of ad-hoc test code requires care in naming intermediate values, e.g.:

    (*[7, 4, 1], Frm*) val (p'''',_,f,nxt'''',_,pt'''') = me nxt''' p''' [] pt'''; (*nxt = *)
    (*[7, 4, 1], Res*) val (p'v,_,f,nxt'v,_,pt'v) = me nxt'''' p'''' [] pt''''; (*nxt = *)
    (*[7, 4, 2], Res*) val (p,_,f,nxt,_,pt) = me nxt'v p'v [] pt'v; (*nxt = *)

 * Such care is also required when checking results within ad-hoc test code, e.g.:

    val Updated (cs' as (_, _, (ct''', p'''))) = (*else*) loc_solve_ (mI, m) ptp


 * TODO

Locate differences between two versions
Sometimes it is difficult to locate differences because the ad-hoc test code becomes long, and then there is a danger of searching at the wrong locations in the test code. The following means can help to cope with that danger:


 * 1) Provide a final test at the end of the test code in the correct version.
 * 2) Step into details function by function and provide return values of the functions such that the final test still passes.
 * 3) In case your test involves get_calc, get_pos, upd_calc, upd_ipos, such stepping into details is tricky:
 * 4) Display your test code in three windows: in (1) reset_states, in (2) the (tested) body of the respective function and in (3) the final test to be watched.
 * 5) Before inserting test code from the body of a function, comment out the respective function call above the test code -- this breaks the final test!
 * 6) Insert the body of the function until the code evaluates correctly to the return value.
 * 7) That the return value is evaluated correctly is shown by the final test passing again.
 * 8) Only with the final test passing correctly, meld to the incorrect version (where the final test does not pass).
 * 9) Step into further details of a function (by going into the bodies of further functions) on the correct version while watching the final test -- this requires care in watching the error in the incorrect version at the same time (the most frequent cause of going astray in testing).
 * 10) The goal of testing is then to improve the test code such that the final test in the incorrect version passes as well.
 * 11) Finally transfer the updated code (or data) from test/ to src/.
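The discipline of the steps above can be illustrated with an ordinary shell script as a stand-in for the SML test code: whatever bodies are inlined above, the final test at the end must keep passing:

```shell
# stand-in for ad-hoc test code (the arithmetic replaces the real functions)
a=$((1 + 2))       # step into details: inline function bodies here, one by one
b=$((a * 7))       # each inserted body must reproduce the former return value

# final test: must pass in the correct version before melding to the broken one
[ "$b" -eq 21 ] && echo 'final test passed'
```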


 * 1) TODO

Locate errors in long "me" sequences
Sometimes errors pop up in long sequences as shown above. If the reasons for an error are unclear, it can be hard to locate it -- sometimes the cause of the error is several steps above the step which fails. [Here] is an example. The most efficient procedure is as follows:


 * 1) In the repository, say isabisac/, copy the respective test case into Test_Some.thy.
 * 2) In another repository, say isabisacREP/, hg update to a changeset where this test case works.
 * 3) Copy Test_Some.thy into isabisacREP/.
 * 4) In isabisacREP/:
 * 5) Check that the test really works.
 * 6) Use show_pt_tac pt to get the whole calculation.
 * 7) Use this calculation to add positions to the "me" sequence, see [here].
 * 8) Add nxt and f to crucial steps.
 * 9) Copy Test_Some.thy from isabisacREP/ back to isabisac/.
 * 10) In isabisac/:
 * 11) Use show_pt_tac pt to get the whole calculation.
 * 12) Check differences in the calculations between isabisac/ and isabisacREP/ with a diff tool, say meld. Carefully look for the first difference occurring in the calculation.
 * 13) Start the investigation at the step with this first difference -- and don't be surprised if you have to go back several me-steps (and use p''', nxt''' and pt''' as mentioned above).
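The file traffic of this procedure can be sketched as follows; the repository paths are simulated here, and hg update / meld are left as comments since they need a real repository and a display:

```shell
# simulate the two working copies (real paths would be e.g. ~/repos/isabisac)
mkdir -p /tmp/isabisac/tmp /tmp/isabisacREP/tmp
printf 'val (p,_,f,nxt,_,pt) = me nxt p [] pt;\n' > /tmp/isabisac/tmp/Test_Some.thy

# 2) in isabisacREP/: hg update -r <changeset-where-the-test-works>
# 3) copy the failing test case over
cp /tmp/isabisac/tmp/Test_Some.thy /tmp/isabisacREP/tmp/
# 5)-8) run the test there, add positions, nxt and f to crucial steps, then:
# 9) copy the annotated Test_Some.thy back
cp /tmp/isabisacREP/tmp/Test_Some.thy /tmp/isabisac/tmp/

# 12) compare the two calculations (use meld for interactive inspection)
diff /tmp/isabisac/tmp/Test_Some.thy /tmp/isabisacREP/tmp/Test_Some.thy && echo identical
```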