[plt-scheme] redex: leveraging generated checks and coverage

From: Eric Tanter (etanter at dcc.uchile.cl)
Date: Thu Sep 17 17:09:45 EDT 2009

Thanks a lot Casey, this is all very useful. Some more questions/ 
comments:

>> * What kind of interesting properties can be checked with
>> check-reduction-relation?
>
> check-reduction-relation is most useful for testing that reduction
> preserves some property. For example, you might use it to test the
> following: if an expression e has no free variables, and e reduces to
> an expression e', then e' has no free variables.
>
> check-reduction-relation often works better than redex-check when some
> of your relation's cases are unlikely to apply to a random term. For
> example, (redex-check L e (stays-closed? (term e))) probably won't
> generate many tests that exercise a call-by-value beta rule, because
> matching its left-hand side E[((λ (x ..._1) e) v ..._1)] requires an
> expression in which
>
> 1. there is an application in the evaluation context's hole,
> 2. the application's function position is a lambda,
> 3. the application supplies the correct number of arguments, and
> 4. all the arguments are already values.
>
> (check-reduction-relation R stays-closed?) will get better coverage
> because it extracts the beta rule's left-hand side from the definition
> of stays-closed? and does something like this
>
> (redex-check L (name e (in-hole E ((λ (x ..._1) e) v ..._1)))

[not sure what that 'name' here is]

> (stays-closed? (term e)))

ok, so is it right then that stays-closed? must internally call apply- 
reduction-relation,
or does check-reduction-relation work as follows:
1. generate a term (matching the left handside of a rule)
2. take a reduction step
3. apply the property test

?

>> for instance, I would imagine useful to check that all well-formed  
>> program
>> reduces to either a value, or a well-defined error. Ie. something  
>> like:
>>
>> (check-reduction-relation red
>> (λ (t)
>>  (redex-match lang ((store (l_1 v_1) ...) v_2)
>>               (apply-reduction-relation* red t)))
>>
>
> For this property you're probably better off with redex-check.
> check-reduction-relation generates tests from the relation's left-hand
> sides. This guarantees that every program it tries takes at least one
> step, which probably isn't what you want here, because it guarantees
> you won't detect programs that are stuck from the get-go.
>
>> Except that means I have to write rules each and every strange  
>> program that
>> can be produced.
>> Still, is that good practice? or is that over-specification?
>
> If you want to test this property, I see two options.
>
> 1. Reduce these strange programs to an explicit state that represents
> unspecified behavior. This is what the R6RS formal semantics does.
>
> 2. Write your predicate so that it detects strange programs and lets
> them off the hook without testing that they produce values.

That's exactly what I did in the meantime, and it seems to work.

I have a reduction relation that reduces mal-formed terms (eg. term  
where the locations in the store are not unique) to bad terms, and I  
have a bad-term predicate.

Then I check as follows:

;; deterministic
(check-reduction-relation
  red
  (λ (t) (let ((res (apply-reduction-relation* red t)))
           (or (eq? 1 (length res))
               (bad-term res))))

(I do have to limit the number of attempts otherwise I blow memory...)


>> * I was surprised that using a property that does not call reduction,
>> check-reduction-relation was able to determine some ambiguity in a
>> metafunction (same clause gave possibly different results given the  
>> same
>> term)
>> a) how did it get to the metafunction? (my property function was  
>> just a
>> print function)
>
> My guess is that one of the relation's cases is guarded by an
> ambiguous metafunction.
>
> [...]
>> b) what kind of testing does it do exactly? (to determine the  
>> above, it must
>> have run it several times with the same term, at least)
>
> This test doesn't actually have anything to do with what
> check-reduction-relation tests. Redex always checks that your
> metafunctions are actually (partial) functions.

Well, prior to playing with check-reduction-relation, I built my test  
suites and so on, and redex never complained about these ambiguities.  
So what do you mean by "Redex always checks..."? when is that check  
actually done?

>> * At times, I get:
>> check-reduction-relation: unable to generate pattern <pattern-here>  
>> in <x>
>> attempts
>> what am I supposed to learn from that? (from a language design  
>> point of
>> view)
>
> My guess is that you have a side-condition somewhere that Redex is
> having trouble satisfying randomly.
> [..]

right, that may definitely be the case

> (define-language M
>  (q (side-condition number_1 (= (term number_1) 12345431))))
>
>> (redex-check M q #t)
> redex-check: unable to generate pattern number_1 in 100 attempts
>
> This error message should really include the side-condition part of
> the pattern since it's that part, not number_1, that's actually
> causing the difficulty. I'll fix that.

That'd be better indeed.

> I'm looking into some approaches that will make Redex's test case
> generation better at satisfying arbitrary side-conditions, but that
> will take some work.
>
> For the time being, maybe this message should be a warning instead of
> an error. What do you think, Robby?

maybe useful to have a silent/warning mode.

>> * Finally, I really enjoy the coverage report for a reduction  
>> relation.
>> There does not seem to be a similar functionality for  
>> metafunctions, though.
>> It would be nice if I could get confidence that my test cases, in  
>> addition
>> to testing all reduction rules, actually cover each possible path  
>> in the
>> metafunctions (ie. telling me, for each metafunction, how many  
>> times each
>> clause has matched).
>
> I'll add a metafunction-coverage parameter that works like relation- 
> coverage.

Cool! I look forward to that!


Cheers,

-- Éric

Posted on the users mailing list.