Experiences from and hints for testing

Compare behaviour of tests
Experience shows that the setup around Test_Some.thy is most convenient. The marker ~ ~ ~ fun xxx, args ... (or ~ ~ ~ fun xxx , args ...) makes it easy to replace xxx with a function's identifier and to copy and paste the result into the Find tool (with hypersearch set to /src/Tools/isac).

When comparing, it is most efficient to separate the two versions of the ad-hoc test code into two files and to keep these files as similar as possible. The comparison can then be done with meld. There are two typical cases for comparing:


 * 1) Compare between similar data: A typical case was the introduction of the constant AA in Partial_Fractions.thy. In such a case, copy the test code from Test_Some.thy to Test_Some_meld.thy and use meld to manage the different terms input to rewriting.
 * 2) Compare between different change sets: Such cases occur, for instance, when the introduction of a new feature breaks old tests and the reason for the breakage is not evident. In such a case, copy Test_Some.thy to another Test_Some.thy running on another change set.

Notes from experience:
 * Don't clutter the repository with ad-hoc test code; use tmp/Test_Some.thy and tmp/Test_Some_meld.thy.
 * When working on different change sets (in different repos), use tmp/Test_Some.thy and tmp/Test_Some_REP.thy.
 * If testing involves several changesets worth committing, add the date: tmp/yymmdd-Test_Some.thy, tmp/yymmdd-Test_Some_meld.thy, tmp/yymmdd-Test_Some_REP.thy. File names of this kind are sorted by Linux as required.
 * For locating differences in long calculations (best investigated with the function me), show_pt_tac pt is useful. The relation between its output and the sequence of me-steps is best maintained by positions, and sometimes by the nxt-step:

 :
 (*[7, 4], Met*)   val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
 (*[7, 4, 1], Frm*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
 (*[7, 4, 1], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
 (*[7, 4, 2], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
 :


 * Locating differences in long calculations might be supported by  the  with  by ) = me nxt

 * Insertion of ad-hoc test code requires care in naming, e.g. ("'" are separated due to conflict with  ' ' 'bold' ' ' ):

 :
 ad-hoc test code
 :
 (*[7, 4, 1], Frm*)val (p' ' ' ',_,f,nxt' ' ' ',_,pt' ' ' ') = me nxt' ' ' p' ' ' [] pt' ' '; (*nxt = *)
 :
 ad-hoc test code
 :
 (*[7, 4, 1], Res*)val (p'v,_,f,nxt'v,_,pt'v) = me nxt' ' ' ' p' ' ' ' [] pt' ' ' '; (*nxt = *)
 :
 ad-hoc test code
 :
 (*[7, 4, 2], Res*)val (p,_,f,nxt,_,pt) = me nxt'v p'v [] pt'v; (*nxt = *)
 :

 * Such care is also required when checking results within ad-hoc test code, e.g.

 val Updated (cs' as (_, _, (ct' ' ', p' ' '))) = (*else*) loc_solve_ (mI,m) ptp


 * TODO
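The naming hint can be tried out in plain Standard ML. The following is only a sketch under stated assumptions: step is a hypothetical stand-in for me and works on an integer position and a list of strings instead of ISAC's calculation tree. It shows that primed identifiers keep every intermediate state available for ad-hoc checks, whereas rebinding p and pt at each step would hide the earlier states. In real ML source the primes are written without separating spaces.

```sml
(* Hypothetical stand-in for ISAC's "me": advances a position and
   appends one line to a toy calculation. Not the real signature. *)
fun step (p : int) (pt : string list) : int * string list =
  (p + 1, pt @ ["step " ^ Int.toString (p + 1)]);

val (p, pt) = (0, [] : string list);

(* Primed identifiers keep every intermediate state accessible. *)
val (p', pt') = step p pt;
val (p'', pt'') = step p' pt';
val (p''', pt''') = step p'' pt'';

(* Ad-hoc check on an intermediate state: possible only because
   p'' and pt'' were not shadowed by the last step. *)
val _ =
  if p'' = 2 andalso pt'' = ["step 1", "step 2"]
  then () else raise Fail "intermediate state changed";

(* Final check on the last state. *)
val _ =
  if p''' = 3 andalso List.length pt''' = 3
  then () else raise Fail "final state changed";
```

Had every step been bound to plain p and pt, the check on the second state would have to be placed exactly between the second and third step; with primed names it can be placed anywhere after them.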

Test involving non-functional states
If your test involves get_calc, get_pos, upd_calc or upd_ipos, stepping into details is tricky:
 * 1) Display your test code in three windows: in (1) reset_states ; in (2) the (tested) body of the respective function; and in (3) a final test to be watched.
 * 2) reset_states ; must be evaluated after each edit in (2) in order to keep the final test passing.

So tests of this kind must be kept to a minimum.
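Why the reset has to be re-evaluated can be seen in a toy model. The sketch below is not ISAC's implementation: the reference states and the simplified reset_states, upd_calc and get_calc are hypothetical stand-ins built on a single ML reference. Evaluating the tested body twice without a reset changes what the final test sees.

```sml
(* Hypothetical global state: a list of (calculation id, formulas).
   ISAC's real state is more complex; this only models the effect. *)
val states : (int * string list) list ref = ref [];

fun reset_states () = states := [];

(* Look up the formulas of calculation cI; empty if unknown. *)
fun get_calc (cI : int) : string list =
  (case List.find (fn (i, _) => i = cI) (! states) of
     SOME (_, fs) => fs
   | NONE => []);

(* Append a formula to calculation cI, creating it if necessary. *)
fun upd_calc (cI : int) (form : string) =
  let
    val old = get_calc cI
    val others = List.filter (fn (i, _) => i <> cI) (! states)
  in states := (cI, old @ [form]) :: others end;

(* (2) the tested body: a state-changing step *)
fun tested_body () = upd_calc 1 "x + 1 = 2";

(* (3) the final test: expects exactly one formula *)
fun final_test () = (get_calc 1 = ["x + 1 = 2"]);

(* Evaluating the body twice without a reset breaks the final test ... *)
val _ = reset_states ();
val _ = tested_body ();
val _ = tested_body ();
val broken = final_test ();   (* false: the formula is stored twice *)

(* ... while resetting before the evaluation keeps it passing. *)
val _ = reset_states ();
val _ = tested_body ();
val passing = final_test ();  (* true *)
```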

Locate differences between two versions
Sometimes it is difficult to locate differences because the ad-hoc test code becomes long, and there is a danger of searching at the wrong locations in the test code. The following means can help to cope with that danger:


 * 1) Provide a final test at the end of the test code of the correct version.
 * 2) Step into details function by function and provide the return values of the functions such that the final test is passed correctly.
 * 3) If your test involves get_calc, get_pos, upd_calc or upd_ipos, stepping into details is tricky:
 * 4) Display your test code in three windows, as described for tests involving non-functional states.
 * 5) Before inserting test code from the body of a function, comment out the respective function call above the test code -- this breaks the final test!
 * 6) Insert the body of the function until the code evaluates correctly to the return value.
 * 7) That the return value is evaluated correctly is shown by the final test passing.
 * 8) Only once the final test is passed correctly, create analogous code in the error-version, i.e. copy the analogous (different) code from src/ (here the final test is not passed correctly).
 * 9) Step into further details of a function (by going to the body of further functions) such that the final test keeps working:
 * 10) On the correct version introduce p' ' ' , pt' ' '  etc. according to the naming hint above.
 * 11) Insert (the beginning of) the respective function's body (at the location where you expect the error).
 * 12) In case you reach the end of the body, insert "~ ~ ~ to xxx return val:"; val ( ... ) = ( ... );
 * 13) Replace the pt' ' '  etc. by the original identifiers, such that the test code is as close as possible to the code in src/. This step is tricky, depending on the code!
 * 14) Iterate by going back to the previous step until you have located the error.
 * So the goal of testing is then to improve the test code such that the final test in the incorrect version passes.
 * 15) Finally transfer the updated code (or data) from test/ to src/.


 * 1) TODO

Locate errors in long "me" sequences
Sometimes errors pop up in long sequences as shown above. If the reasons for the error are unclear, it can be hard to locate the error -- sometimes the cause of the error is several steps above the one which fails. [Here] is an example. The most efficient procedure is as follows:


 * 1) In the repository, say isabisac/, copy the respective test case into Test_Some.thy.
 * 2) In another repository, say isabisacREP/, hg update to a changeset where this test case works.
 * 3) Copy Test_Some.thy into isabisacREP/.
 * 4) In isabisacREP/:
 * 5) Check if the test really works.
 * 6) Use show_pt_tac pt to get the whole calculation.
 * 7) Use this calculation to add positions to the "me" sequence, see [here].
 * 8) Add nxt and f to crucial steps.
 * 9) Copy Test_Some.thy from isabisacREP/ back to isabisac/
 * 10) In isabisac/:
 * 11) Use show_pt_tac pt to get the whole calculation.
 * 12) Check differences in calculations between isabisac/ and isabisacREP/ by use of a diff-tool, say meld. Carefully look for the first difference occurring in the calculation.
 * 13) Start the investigation at the step with this first difference -- and don't be surprised if you have to go back several me-steps (and use p' ' ' ,nxt' ' '  and pt' ' '  as mentioned above).
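Looking for the first difference (step 12) can also be done programmatically once both calculations are available as lists of lines. The following is a minimal sketch under that assumption; first_diff is a hypothetical helper and not part of ISAC, and the second calculation is invented for illustration.

```sml
(* Return the 1-based index and the two differing lines at the first
   position where the calculations diverge; NONE if they are equal.
   A missing line (one calculation is shorter) is reported as "". *)
fun first_diff (xs : string list) (ys : string list) =
  let
    fun go _ [] [] = NONE
      | go i (x :: _) [] = SOME (i, x, "")
      | go i [] (y :: _) = SOME (i, "", y)
      | go i (x :: xs') (y :: ys') =
          if x = y then go (i + 1) xs' ys' else SOME (i, x, y)
  in go 1 xs ys end;

(* Toy data in the style of a printed calculation; the line of the
   failing version is invented. *)
val calc_ok =
  ["1.   x + 1 = 2", "2.   x + 1 + -1 * 2 = 0", "3.1.   -1 + x = 0"];
val calc_bad =
  ["1.   x + 1 = 2", "2.   x + 1 - 2 = 0", "3.1.   -1 + x = 0"];

val diff = first_diff calc_ok calc_bad;
(* diff = SOME (2, "2.   x + 1 + -1 * 2 = 0", "2.   x + 1 - 2 = 0") *)
```

Only the first difference is reported on purpose: later differences are usually consequences of the first one.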

Prepare a test environment
1. Take a well-tried test, provide a final test and determine a relevant step:

 val fmz = ["equality (x+1=(2::real))", "solveFor x","solutions L"];
 val (dI',pI',mI') =
   ("Test", ["sqroot-test","univariate","equation","test"],
    ["Test","squ-equ-test-subpbl1"]);
 val (p,_,f,nxt,_,pt) = CalcTreeTEST [(fmz, (dI',pI',mI'))];
 val (p,_,f,nxt,_,pt) = me nxt p [] pt;
 :
 (*// relevant call --\\*)
 (*[1], Res* )val (p,_,f,nxt,_,pt') =*) me nxt p [] pt; (*nxt = Rewrite_Set "Test_simplify"*)
 (* ^^^^^^^^^^^^^^^^^--- is replaced by ...... *)
 :
 (* here go down to respective function call *)
 (*\\-- end of modified "fun me" -//*)
 (*[2], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt;
 :
 (*[], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt;

 (* final test ...*)
 if p = ([], Res) andalso f2str f = "[x = 1]" andalso fst nxt = "End_Proof'"
   andalso pr_ctree pr_short pt =
   ".    - pblobj -\n" ^
   "1.   x + 1 = 2\n" ^
   "2.   x + 1 + -1 * 2 = 0\n" ^
   "3.    - pblobj -\n" ^
   "3.1.   -1 + x = 0\n" ^
   "3.2.   x = 0 + -1 * -1\n" ^
   "4.   [x = 1]\n"
 then () else error "re-build: fun locate_input_tactic changed";

2. Go down to the respective function call, top down, always pairing "~ ~ ~ fun yyy, args:"; val (..bbb..) = (..aaa..) with "~ ~ ~ from yyy to xxx return:"; val (..ddd..) = (..ccc..); we show the state after some iterations, some levels deeper, at fun locate_input_tactic:

 (*val (msg, cs') =*) Solve.solve m (pt, pos);
 "~ ~ ~ fun solve, args:"; val ((_, m), (pt, po as (p, p_))) = (m, (pt, pos));
 :
 : (* body of solve down to the call of locate_input_tactic *)
 :
 › ML ‹
 (*//--- old version ...*)
 val xxx as LucinNEW.Steps ((_, ss as (tac_, _, pt', p', c') :: _)) = (*case*)
   locate_input_tactic (thy', srls) m (pt, (p, p_)) (sc, d) is (*of*);
 (*... old version ...//*)
 › ML ‹
 (*//--- old version ...*)
 "~ ~ ~ from locate_input_tactic to solve return:";
 val (Steps (_, ss as (_, _, pt', p', c') :: _) ) = xxx;
 (*... old version ---//*)
 :
 : (* body of solve down to return value *)
 :
 ("ok", (map step2taci ss, c', (pt', p')));

3. Copy & paste the body of locate_input_tactic until you get the relevant return value; this then replaces the test call and keeps the final test working:

 (*val (msg, cs') =*) Solve.solve m (pt, pos);
 "~ ~ ~ fun solve, args:"; val ((_, m), (pt, po as (p, p_))) = (m, (pt, pos));
 :
 : (* body of solve down to the call of locate_input_tactic *)
 :
 › ML ‹
 (*//--- old version ...*)
 (*****val xxx as LucinNEW.Steps ((_, ss as (tac_, _, pt', p', c') :: _)) =*****) (*case*)
   locate_input_tactic (thy', srls) m (pt, (p, p_)) (sc, d) is (*of*);
 (*... old version ...//*)
 › ML ‹
 :
 : (* body of locate_input_tactic down to return value: *)
 (Steps (Selem.ScrState is, ss));
 › ML ‹
 (*//--- old version ...*)
 "~ ~ ~ from locate_input_tactic to solve return:";
 val (Steps (_, ss as (_, _, pt', p', c') :: _) ) = (Steps (Selem.ScrState is, ss));
 (*... old version ---//*)
 :
 : (* body of solve down to return value *)
 :
 ("ok", (map step2taci ss, c', (pt', p')));
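The pairing of argument markers and return markers in steps 2 and 3 can be tried out on a toy example in plain Standard ML. This is only a sketch: the functions inner and outer are hypothetical and unrelated to ISAC, and the marker strings merely mimic the convention described above.

```sml
(* Toy code "in src/": outer calls inner. *)
fun inner (k : int) (xs : int list) = map (fn x => x * k) xs;
fun outer (k : int, xs : int list) =
  let val ys = inner k xs
  in List.foldl (op +) 0 ys end;

(* Test code: the original calls stay visible, their bindings are
   commented out and replaced by the inlined bodies below. *)
val (k, xs) = (2, [1, 2, 3]);
(*val res =*) outer (k, xs);
"~ ~ ~ fun outer, args:"; val (k, xs) = (k, xs);
(*val ys =*) inner k xs;
"~ ~ ~ fun inner, args:"; val (k, xs) = (k, xs);
(* body of inner down to its return value *)
val inner_ret = map (fn x => x * k) xs;
"~ ~ ~ from inner to outer return:"; val ys = inner_ret;
(* body of outer down to its return value *)
val res = List.foldl (op +) 0 ys;

(* final test *)
val _ = if res = 12 then () else raise Fail "outer changed";
```

Because every inlined body ends by binding the same identifiers the original call would have bound, the final test keeps passing while the code between the markers is open for inspection.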