Experiences from and hints for testing

Compare behaviour of tests
Experience shows that the setup around Test_Some.thy is the most convenient. The marker ~ ~ ~ fun xxx, args ... (or ~ ~ ~ fun xxx , args ...) makes it easy to replace xxx with the function's identifier and to copy and paste the result into the Find tool (with hypersearch set to /src/Tools/isac).

When comparing, it is most efficient to put the two versions of the ad-hoc test code into two separate files and to keep these files as similar as possible. The comparison can then be done with meld. There are two typical cases:


 * 1) Compare between similar data: A typical case arose with the introduction of the constant AA in Partial_Fractions.thy. In such a case copy the test code from Test_Some.thy to Test_Some_meld.thy and use meld to manage the different terms input to rewriting.
 * 2) Compare between different change sets: Such cases occur, for instance, when the introduction of a new feature breaks old tests and the reason for the break is not evident. In such a case copy Test_Some.thy to another Test_Some.thy running on another change set.

Notes from experience:
 * Don't spoil the repository with ad-hoc test code; use tmp/Test_Some.thy and tmp/Test_Some_meld.thy.
 * When working on different change sets (in different repos), use tmp/Test_Some.thy and tmp/Test_Some_REP.thy.
 * If testing involves several changesets worth committing, add the date: tmp/yymmdd-Test_Some.thy, tmp/yymmdd-Test_Some_meld.thy, tmp/yymmdd-Test_Some_REP.thy. File names of this kind are sorted by Linux as required.
 * For locating differences in long calculations (best investigated by me), show_pt_tac pt is useful. The relation between the respective output and the sequence of me-steps is best maintained by positions, and sometimes by the nxt-step:

    :
    (*[7, 4], Met*)  val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
    (*[7, 4, 1], Frm*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
    (*[7, 4, 1], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
    (*[7, 4, 2], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = *)
    :


 * Locating differences in long calculations might be supported by  the  with  by ) = me nxt

 * Insertion of ad-hoc test code requires care in naming, e.g. (the "'" are separated due to a conflict with  ' ' 'bold' ' ' ):

    :
    ad-hoc test code
    :
    (*[7, 4, 1], Frm*)val (p' ' ' ',_,f,nxt' ' ' ',_,pt' ' ' ') = me nxt' ' ' p' ' ' [] pt' ' '; (*nxt = *)
    :
    ad-hoc test code
    :
    (*[7, 4, 1], Res*)val (p'v,_,f,nxt'v,_,pt'v) = me nxt' ' ' ' p' ' ' ' [] pt' ' ' '; (*nxt = *)
    :
    ad-hoc test code
    :
    (*[7, 4, 2], Res*)val (p,_,f,nxt,_,pt) = me nxt'v p'v [] pt'v; (*nxt = *)
    :

 * Such care is also required when checking results within ad-hoc test code, e.g.

    val Updated (cs' as (_, _, (ct' ' ', p' ' '))) = (*else*) loc_solve_ (mI,m) ptp


 * TODO
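
The naming hint above can be tried out in isolation. The following is a minimal sketch in plain Standard ML, not Isac code: the function step is a made-up stand-in for me, and an integer and a list are made-up stand-ins for the position and the calculation tree. It only shows that primed identifiers let ad-hoc code run between two steps without shadowing p and pt of the main sequence.

```sml
(* Made-up stand-in for one calculation step; NOT Isac's "me". *)
fun step (n, trace) = (n + 1, trace @ [n]);

val (p, pt) = (0, []);                      (* start of the main sequence *)
val (p, pt) = step (p, pt);                 (* main step 1 *)

(* ad-hoc test code: primed names leave p and pt untouched *)
val (p''', pt''') = step (p, pt);
val (p'''', pt'''') = step (p''', pt''');

(* the main sequence continues from the unprimed state *)
val (p, pt) = step (p, pt);                 (* main step 2 *)
```

Had the ad-hoc lines rebound p and pt instead, main step 2 would have started from the wrong state.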

Test involving non-functional states
If your test involves get_calc, get_pos, upd_calc or upd_ipos, stepping into details is tricky:
 * 1) Display your test code in three windows: in (1) reset_states ; in (2) the (tested) body of the respective function and in (3) a final test to be watched.
 * 2) reset_states ; must be evaluated after each edit in (2) in order to keep the final test passing.

So tests of this kind must be kept to a minimum.
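
Why the reset matters can be seen in a toy model. The sketch below is plain Standard ML and assumes a single integer reference as state; the names toy_reset_states, toy_upd_ipos and toy_get_pos are hypothetical and only mimic the roles of reset_states, upd_ipos and get_pos, whose real types in Isac are different.

```sml
(* Toy state, assumed for illustration only. *)
val toy_state : int ref = ref 0;
fun toy_reset_states () = toy_state := 0;
fun toy_upd_ipos () = (toy_state := ! toy_state + 1; ! toy_state);
fun toy_get_pos () = ! toy_state;

(* window (2): the tested body, which updates the state twice *)
fun tested_body () = (toy_upd_ipos (); toy_upd_ipos (); ());

(* window (3): the final test *)
fun final_test () = (toy_get_pos () = 2);

(* windows (1), (2), (3) evaluated in order: the final test passes *)
val _ = toy_reset_states ();
val _ = tested_body ();
val first_run = final_test ();

(* (2) re-evaluated after an edit WITHOUT (1): the state has accumulated *)
val _ = tested_body ();
val stale_run = final_test ();

(* (1) evaluated first, then (2): the final test passes again *)
val _ = toy_reset_states ();
val _ = tested_body ();
val clean_run = final_test ();
```

Here stale_run is false only because the state was not reset, not because the tested body is wrong; this is the kind of false alarm the three-window setup is meant to avoid.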

Locate differences between two versions
Sometimes it is difficult to locate differences because the ad-hoc test code becomes long. There is then a danger of searching in the wrong locations of the test code. To cope with that danger, the following means can be helpful:


 * 1) Provide a final test at the end of the test code of the correct version.
 * 2) Step into details function by function and provide the return values of the functions such that the final test passes.
 * 3) If your test involves get_calc, get_pos, upd_calc or upd_ipos, such stepping into details is tricky:
 * 4) Display your test code in three windows, as described for tests involving non-functional states.
 * 5) Before inserting test code from the body of a function, comment out the respective function call above the test code -- this breaks the final test!
 * 6) Insert the body of the function until the code evaluates correctly to the return value.
 * 7) The final test passing shows that the return value is evaluated correctly.
 * 8) Only once the final test passes, create analogous code in the error version, i.e. copy the analogous (different) code from src/ (here the final test does not pass).
 * 9) Step into further details of a function (by going to the bodies of further functions) such that the final test keeps working:
 * 10) In the correct version introduce p' ' ' , pt' ' '  etc. according to the naming hint above.
 * 11) Insert (the beginning of) the respective function's body (at the location where you expect the error).
 * 12) In case you reach the end of the body, insert "~ ~ ~ to xxx return val:"; val ( ... ) = ( ... );
 * 13) Replace pt' ' '  etc. by the original identifiers, such that the test code is as close as possible to the code in src/. This step is tricky, depending on the code!
 * 14) Iterate by going back to the previous step until you have located the error.
 * So the goal of testing is then to improve the test code such that the final test in the incorrect version passes.
 * 1) Finally transfer the updated code (or data) from test/ to src/.


 * 1) TODO

Locate errors in long "me" sequences
Sometimes errors pop up in long sequences as shown above. If the reasons for the error are unclear, it can be hard to locate the error -- sometimes the cause of the error is several steps above the one which fails. [Here] is an example. The most efficient procedure is as follows:


 * 1) In the repository, say isabisac/, copy the respective test case into Test_Some.thy.
 * 2) In another repository, say isabisacREP/, hg update to a changeset, where this test case works.
 * 3) Copy Test_Some.thy into isabisacREP/
 * 4) In isabisacREP/
 * 5) Check if the test really works.
 * 6) Use show_pt_tac pt to get the whole calculation.
 * 7) Use this calculation to add positions to the "me" sequence, see [here].
 * 8) Add nxt and f to crucial steps.
 * 9) Copy Test_Some.thy from isabisacREP/ back to isabisac/
 * 10) In isabisac/
 * 11) Use show_pt_tac pt to get the whole calculation.
 * 12) Check differences in calculations between isabisac/ and isabisacREP/ by use of a diff-tool, say meld. Carefully look for the first difference occurring in the calculation.
 * 13) Start the investigation at the step with this first difference -- and don't be surprised if you have to go back several me-steps (and use xxxxx, xxxxx_x etc. as mentioned in the previous section).
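
Looking for the first difference between the two calculations is what meld does interactively; the same idea can be sketched in a few lines of Standard ML. The helper names lines and first_diff are hypothetical, and the "broken" calculation below is made up for illustration (only the "good" one is the calculation quoted elsewhere on this page).

```sml
(* Split printed output into lines; empty lines are dropped. *)
fun lines s = String.tokens (fn c => c = #"\n") s;

(* 1-based index of the first differing line together with both lines,
   or NONE if the two calculations are identical. *)
fun first_diff (xs : string list, ys : string list) =
  let
    fun go (_, [], []) = NONE
      | go (i, x :: _, []) = SOME (i, x, "")
      | go (i, [], y :: _) = SOME (i, "", y)
      | go (i, x :: xs', y :: ys') =
          if x = y then go (i + 1, xs', ys') else SOME (i, x, y)
  in go (1, xs, ys) end;

val good = lines (".   - pblobj -\n" ^ "1.  x + 1 = 2\n" ^
  "2.  x + 1 + -1 * 2 = 0\n" ^ "3.   - pblobj -\n" ^ "3.1.  -1 + x = 0\n" ^
  "3.2.  x = 0 + -1 * -1\n" ^ "4.  [x = 1]\n");

(* made-up output of a broken version *)
val broken = lines (".   - pblobj -\n" ^ "1.  x + 1 = 2\n" ^
  "2.  x + 1 + -1 * 2 = 0\n" ^ "3.   - pblobj -\n" ^ "3.1.  x + -1 = 0\n" ^
  "3.2.  x = 1\n" ^ "4.  [x = 1]\n");

val d = first_diff (good, broken);
```

In this toy data the final result "[x = 1]" is the same in both versions, yet the calculations already diverge at line 5; this is why the investigation should start at the first difference and not at the step that finally fails.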

Replace a fun's arguments + body + 1 relevant call with new code
Test_Some.thy is prepared for this kind of development.

Prepare a test environment
 * (5.1.1) Take a well-tried test-case of a relevant part of a calculation, provide a final test and determine a relevant step in the test-case:

    val fmz = ["equality (x+1=(2::real))", "solveFor x","solutions L"];
    val (dI',pI',mI') = ("Test", ["sqroot-test","univariate","equation","test"],
      ["Test","squ-equ-test-subpbl1"]);
    val (p,_,f,nxt,_,pt) = CalcTreeTEST [(fmz, (dI',pI',mI'))];
    val (p,_,f,nxt,_,pt) = me nxt p [] pt;
    :
    (*// relevant call --\\*)
    (*[1], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = Rewrite_Set "Test_simplify"*)
    (* --- *)
    :
    : (* here go down to respective function call *)
    :
    (*\\-- end of modified "fun me" -//*)
    (*[2], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt;
    :
    (*[], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt;
    (* final test ...*)
    if p = ([], Res) andalso f2str f = "[x = 1]" andalso fst nxt = "End_Proof'"
      andalso pr_ctree pr_short pt =
      ".   - pblobj -\n" ^
      "1.  x + 1 = 2\n" ^
      "2.  x + 1 + -1 * 2 = 0\n" ^
      "3.   - pblobj -\n" ^
      "3.1.  -1 + x = 0\n" ^
      "3.2.  x = 0 + -1 * -1\n" ^
      "4.  [x = 1]\n"
    then () else error "re-build: fun locate_input_tactic changed";

 * (5.1.2) Go down to the respective function call, top down, always pairing "~ ~ ~ fun yyy, args:"; val (..bbb..) = (..aaa..) with "~ ~ ~ from yyy to xxx return:"; val (..ddd..) = (..ccc..). At each step down, while inserting test code for a function, use the old function definition in order to keep the final test going:

    (*[1], Res*)val xxxx (***(p,_,f,nxt,_,pt)***) = me nxt p [] pt; (*nxt = Rewrite_Set "Test_simplify"*)
    (* --- *)
    "~ ~ ~ fun me, args:"; val ((_, tac), p, _(*NEW remove*), pt) = (nxt, p, [], pt);
    :
    : (* here insert the function body of fun me *)
    :
    "~ ~ ~ from me to TOOPLEVEL return:"; val (nxt, p, pt) = xxxx;
    (*\\-- end of modified "fun me" -//*)
    (*[1], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = Rewrite_Set "Test_simplify"*)
    :

 * (5.1.3) Activate the test code by removing the call as soon as the body of the function is complete down to the return value:

    (*[1], Res*)(***val (p,_,f,nxt,_,pt) =***) me nxt p [] pt; (*nxt = Rewrite_Set "Test_simplify"*)
    (* --- *)
    "~ ~ ~ fun me, args:"; val ((_, tac), p, _(*NEW remove*), pt) = (nxt, p, [], pt);
    :
    : (* here insert the function body of fun me *)
    :
    (p, [] : NEW, TESTg_form (pt, p), (Tac.tac2IDstr tac, tac), Selem.Sundef, pt); (*return value*)
    "~ ~ ~ from me to TOOPLEVEL return:";
    val (nxt, p, pt) = (p, [] : NEW, TESTg_form (pt, p), (Tac.tac2IDstr tac, tac), Selem.Sundef, pt);
    (*\\-- end of modified "fun me" -//*)
    (*[1], Res*)val (p,_,f,nxt,_,pt) = me nxt p [] pt; (*nxt = Rewrite_Set "Test_simplify"*)
    :

 * (5.1.4) The final state before updating fun locate_input_tactic is as follows (only the test code of this function is bypassed by xxxx):

    "~ ~ ~ fun solve, args:"; val ((_, m), (pt, po as (p, p_))) = (m, (pt, pos));
    :
    : (* body of solve down to the call of locate_input_tactic *)
    :
    › ML ‹
    (*OLD..*)val xxxx as LucinNEW.Steps ((_, ss as (tac_, _, pt', p', c') :: _)) =
      (*case*) LucinNEW.locate_input_tactic (thy', srls) m (pt, (p, p_)) (sc, d) is (*of*);
    "//OLD~ ~ ~ fun locate_input_tactic, args:";
    val ((thy', srls), m, (pt, p),
         (scr as Rule.Prog sc, d), (Selem.ScrState (E,l,a,v,S,b), ctxt)) =
      ((thy', srls), m, (pt, (p, p_)), (sc, d), is);
    › text ‹
    :
    : (*here comes OLD body of locate_input_tactic*)
    :
    › ML ‹
    "\\OLD~ ~ ~ from locate_input_tactic to solve return:";
    val (LucinNEW.Steps (_, ss as (_, _, pt', p', c') :: _) ) = xxxx (***(Steps Selem.ScrState is, ss));***)
    :
    : (* OLD body of solve down to return value *)
    :
    (*OLD solve..*)("ok", (map step2taci ss, c', (pt', p')));
    "~ ~ ~ from solve to loc_solve_ return:";
    val ((msg, cs' : calcstate')) = (("ok", (map step2taci ss, c', (pt', p'))));
    :


 * (5.1.5) Check that all bypassing calls (by xxxx) up to fun me are removed --- otherwise the test code for fun locate_input_tactic is not fully activated!
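
The two stages described in the steps above (first bypass the unfolded body by xxxx, then activate it by removing the bypass) can be sketched on a toy example in plain Standard ML. The functions inner and outer are made up and merely stand in for a callee and its caller such as locate_input_tactic and solve; nothing here is Isac code.

```sml
(* Toy functions, assumed for illustration only. *)
fun inner (a, b) = a * b + 1;
fun outer x = inner (x, 2) + 10;

(* Stage 1: the call is kept; xxxx bypasses the unfolded body. *)
val xxxx = outer 3;
"~ ~ ~ fun outer, args:"; val x = 3;
val by_body = inner (x, 2) + 10;      (* body of outer, copied from its definition *)
"~ ~ ~ from outer to toplevel return:"; val res = xxxx;
val stage1_ok = (res = 17);           (* final test, still fed by the bypass value *)

(* Stage 2: the bypass is removed; the unfolded bodies feed the final test. *)
"~ ~ ~ fun outer, args:"; val x = 3;
"~ ~ ~ fun inner, args:"; val (a, b) = (x, 2);
val r_inner = a * b + 1;              (* body of inner *)
"~ ~ ~ from inner to outer return:"; val r_outer = r_inner + 10;
"~ ~ ~ from outer to toplevel return:"; val res = r_outer;
val stage2_ok = (res = 17);           (* final test, now fed by the test code *)
```

In stage 1 an error in the copied body would go unnoticed by the final test, because res comes from xxxx; only in stage 2 does the final test actually check the unfolded code.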

Develop new code for a specific function
Take your time and complete the test environment, i.e. detail the function to be updated and the relevant call down to the respective function bodies; the time spent pays off with less trouble later on.

 * (5.2.1) Go back to step (2) in the previous section in order to keep the final test running. Using the old function locate_input_tactic keeps the final test running while implementing the respective function body. Keep the old function body as text, so that you can compare the old and the new version.

    :
    › ML ‹
    (*//OLD..*)val xxxx as Steps ((_, ss as (tac_, _, pt', p', c') :: _)) =(*..OLD..*)
      (*case*) locate_input_tactic (thy', srls) m (pt, (p, p_)) (sc, d) is (*of*); (*..OLD*)
    › text ‹
    "OLD~ ~ ~ fun locate_input_tactic, args:";
    val ((thy', srls), m, (pt, p), (scr as Rule.Prog sc, d), (Selem.ScrState (E,l,a,v,S,b), ctxt)) =
      ((thy', srls), m, (pt, (p, p_)), (sc, d), is);
    :
    : (* body of locate_input_tactic *)
    (*OLD..*)(Steps (Selem.ScrState is, ss)); (* return value *)
    › ML ‹
    "\\OLD~ ~ ~ from locate_input_tactic to solve return:";
    val (Steps (_, ss as (_, _, pt', p', c') :: _) ) = xxxx (***(Steps Selem.ScrState is, ss))***);
    :
    : (* body of solve down to return value *)

 * (5.2.2) Add ~ ~ ~ fun locate_input_tactic according to the new signature (which should not be too different from the old one ..) and, if required, add code creating the actual arguments for the calling function solve. Note that the final test keeps running according to Pt.(4):

    "OLD~ ~ ~ fun locate_input_tactic, args:";
    val ((thy', srls), m, (pt, p), (scr as Rule.Prog sc, d), (Selem.ScrState (E,l,a,v,S,b), ctxt)) =
      ((thy', srls), m, (pt, (p, p_)), (sc, d), is);
    :
    : (* body of locate_input_tactic *)
    (*OLD..*)(Steps (Selem.ScrState is, ss)); (* return value *)
    › ML ‹
    : (* if required, code creating the actual arguments *)
    "//NEW~ ~ ~ fun locate_input_tactic, args:";
    val (progr, cstate, istate, ctxt, tac) = (sc, (pt, (p, p_)), fst is, snd is, m)
    :
    : (* NEW body of locate_input_tactic *)
    (*\\NEW..*)Safe_Step ((ctree, pos'), istate, get_ctxt ctree pos'); (* return value *)
    › ML ‹
    "\\NEW~ ~ ~ from locate_input_tactic to solve return:";
    val (Safe_Step (cstate, istate, ctxt)) = (Safe_Step ((ctree, pos'), istate, get_ctxt ctree pos'));
    › ML ‹
    "\\OLD~ ~ ~ from locate_input_tactic to solve return:";
    val (Steps (_, ss as (_, _, pt', p', c') :: _) ) = xxxx (***(Steps Selem.ScrState is, ss))***);

 * (5.2.3) Embed the new body of locate_input_tactic into the calling function solve. While updating the code of solve, use the old function (i.e. the old solve) to keep the final test running:

    "~ ~ ~ fun solve, args:"; val ((_, m), (pt, po as (p, p_))) = (m, (pt, pos));
    :
    : (* if required, code creating the actual arguments *)
    "//NEW~ ~ ~ fun locate_input_tactic, args:";
    val (progr, cstate, istate, ctxt, tac) = (sc, (pt, (p, p_)), fst is, snd is, m)
    :
    : (* NEW body of locate_input_tactic *)
    (*\\NEW..*)Safe_Step ((ctree, pos'), istate, get_ctxt ctree pos'); (* return value *)
    › ML ‹
    "\\NEW~ ~ ~ from locate_input_tactic to solve return:";
    val (TODO) = (Safe_Step ((ctree, pos'), istate, get_ctxt ctree pos'));
    › ML ‹
    "1~ ~ ~ from solve to loc_solve_ return:";
    val ((msg, cs')) = xxxxx (***(("ok", (map step2taci ss, c', (pt', p'))))***);
    :
    › ML ‹


 * (5.2.4) Transfer the new code for the function body to a new definition of fun locate_input_tactic, which is located in Test_Some.thy above the test-case with the final test, which should not break. This code can be tested without breaking the final test, as soon as the NEW fun locate_input_tactic is embedded in the calling fun solve.
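
The idea of keeping the old function alive while the new one is developed can be sketched in plain Standard ML. Everything below is a toy under stated assumptions: the datatypes Old_Steps and New_Step and the functions locate_old, locate_new, solve_old and solve_new are hypothetical and only mimic a change of return type like the one from Steps to Safe_Step.

```sml
(* Hypothetical old and new return types of the function under development. *)
datatype old_result = Old_Steps of int list;
datatype new_result = New_Step of int;

fun locate_old n = Old_Steps [n + 1];     (* old definition, kept unchanged *)
fun locate_new n = New_Step (n + 1);      (* new definition, placed above the test *)

(* The caller in its old form keeps the final test running ... *)
fun solve_old n =
  case locate_old n of
    Old_Steps (m :: _) => m * 2
  | Old_Steps [] => raise Fail "no step";

(* ... while the caller in its new form embeds the new function. *)
fun solve_new n =
  case locate_new n of New_Step m => m * 2;

(* final test: the old path must not break, the new path must agree with it *)
val final_old = solve_old 3;
val final_new = solve_new 3;
val final_ok = (final_old = 8 andalso final_new = final_old);
```

As long as both paths exist side by side, a failing final test points to the new code rather than to an unrelated change.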

Embed the new function into a specific calling function
 * (5.3.1) Develop new code for embedding the call of fun locate_input_tactic; you keep the final test running by use of the old function (i.e. loc_solve_) calling the caller (i.e. solve):

    "~ ~ ~ fun solve, args:"; val ((_, m), (pt, po as (p, p_))) = (m, (pt, pos));
    :
    : (* body of solve down to the call of locate_input_tactic *)
    :
    : (* NEW body of locate_input_tactic *)
    :
    : (* NEW body of solve handling return value of locate_input_tactic *)
    :
    "~ ~ ~ from solve to loc_solve_ return:";
    val ((msg, cs' : calcstate')) = xxxxx (*** )(("ok", cs' (*** )(map step2taci ss, c', (pt', p'))( ***)));
    :

 * (5.3.2) Check the new code by removing the old function call:

    :
    (***val xxxxx as (msg, cs') = Solve.***)solve m (pt, pos);
    "~ ~ ~ fun solve, args:"; val ((_, m), (pt, po as (p, p_))) = (m, (pt, pos));
    :
    : (* body of solve down to the call of locate_input_tactic *)
    :
    : (* NEW body of locate_input_tactic *)
    :
    : (* NEW body of solve handling return value of locate_input_tactic *)
    :
    (*NEW return from solve..*)val cs' = ([(tac_2tac tac, tac, (e_pos', (istate, ctxt)))], [(*ctree NOT cut*)], cstate) : calcstate'
    "~ ~ ~ from solve to loc_solve_ return:";
    val ((msg, cs' : calcstate')) (***= xxxxx ( ***) (("ok", cs' (*** )(map step2taci ss, c', (pt', p'))( ***)));
    :

 * (5.3.3) Now the new definition of fun locate_input_tactic can be checked:

    (*NEW..*) val xxx =  locate_input_tactic sc (pt, po) (fst is) (snd is) m;
    :
    › ML ‹
    "//NEW~ ~ ~ fun locate_input_tactic, args:";
    val (progr, cstate, istate, ctxt, tac) = (sc, (pt, po), fst is, snd is, m)
    :
    : (* NEW body of locate_input_tactic *)
    :
    (*NEW..*)Safe_Step ((ctree, pos'), istate, get_ctxt ctree pos'); (*return value*)
    "\\NEW~ ~ ~ from locate_input_tactic to solve return:";
    val (Safe_Step (cstate, istate, ctxt)) = xxx  (***Safe_Step ((ctree, pos'), istate, get_ctxt ctree pos'***);
    :


 * (5.3.4) Transfer the new code handling the return value from the test-case to a new definition of the calling function (i.e. solve); this new definition must be between fun locate_input_tactic and the test-case. The final test should not break.


 * (5.3.5) Check the new definition of fun solve analogously to fun solve (see Pt.(10)) in the test-case.

Extend the special case to general functionality
The section "Develop new code for a specific function" covered only one special case, which needs generalisation:


 * (5.4.1) Extend the fun's code to all possible return values in the *.sml file.


 * (5.4.2) Update all calls of the fun in src/.


 * (5.4.3) Now Build_Isac.thy should work without errors.


 * (5.4.4) Transfer both functions, fun locate_input_tactic and solve from Test_Some.thy to the respective *.sml files and add identifiers of signatures as required (all modules are opened in Test_Some.thy).


 * (5.4.5) Adapt all calls of fun locate_input_tactic in /src/ to the new signature.


 * (5.4.6) Now the code is ready to start with Test_Isac.thy.
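
What "extend to all possible return values" means can be sketched in plain Standard ML. The datatype toy_result and both handler functions are hypothetical and chosen for illustration only; they do not reproduce the return type of locate_input_tactic.

```sml
(* Hypothetical return type with three possible values. *)
datatype toy_result = Found of int | Ambiguous of int list | Not_Found;

(* Special case only, as developed against one test-case; the compiler
   warns about a non-exhaustive match here. *)
fun handle_special (Found n) = "step " ^ Int.toString n;

(* General version: every possible return value is handled. *)
fun handle_all (Found n) = "step " ^ Int.toString n
  | handle_all (Ambiguous ns) = "ambiguous: " ^ Int.toString (length ns) ^ " candidates"
  | handle_all Not_Found = "not found";

(* The special case raises Match on any other return value. *)
val special_fails = (handle_special Not_Found; false) handle Match => true;
val results = map handle_all [Found 3, Ambiguous [1, 2], Not_Found];
```

The compiler warning on handle_special is a cheap indicator of which return values the special case still leaves open.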

Adapt all Isac tests to the new code
New code as described above requires specific attention in Isac's tests, i.a. all files called by Test_Isac.thy.
 * (5.5.1) Copy the test-case from Test_Some.thy to an appropriate file in /test/.


 * (5.5.2) Observe Test_Isac_Short.thy and look at tests, which do not detail the new code, for instance in ~ ~ ~ fun locate_input_tactic: there is still the old code, which still works the old way. Absence of errors in the other tests indicates good functionality of the new code.


 * (5.5.3) In case of errors in tests which do not detail the new code, look at these errors first; they indicate deficient functionality of the new code, which needs repair first.


 * (5.5.4) After clarification of the error cases (2) turn to all occurrences of ~ ~ ~ fun locate_input_tactic and ~ ~ ~ fun solve. All these occurrences must get the code from the new function bodies.


 * (5.5.5) TODO


 * (5.5.6) Shift the code of the test-case in Test_Some.thy to another appropriate file in /test/


 * (5.5.7) TODO