[plt-scheme] redex: leveraging generated checks and coverage

From: Casey Klein (clklein at eecs.northwestern.edu)
Date: Wed Sep 16 20:59:07 EDT 2009

On Wed, Sep 16, 2009 at 5:42 PM, Eric Tanter <etanter at dcc.uchile.cl> wrote:
> Hi,
> I would like to understand better how I can most interestingly use the
> generative checks supported by PLT Redex. This is a fairly messy email, with
> several related points. I'd appreciate any comment/feedback on them.
> * What kind of interesting properties can be checked with
> check-reduction-relation?

check-reduction-relation is most useful for testing that reduction
preserves some property. For example, you might use it to test the
following: if an expression e has no free variables, and e reduces to
an expression e', then e' has no free variables.

check-reduction-relation often works better than redex-check when some
of your relation's cases are unlikely to apply to a random term. For
example, (redex-check L e (stays-closed? (term e))) probably won't
generate many tests that exercise a call-by-value beta rule, because
matching its left-hand side E[((λ (x ..._1) e) v ..._1)] requires an
expression in which

1. there is an application in the evaluation context's hole,
2. the application's function position is a lambda,
3. the application supplies the correct number of arguments, and
4. all the arguments are already values.

(check-reduction-relation R stays-closed?) will get better coverage
because it extracts the beta rule's left-hand side from the definition
of stays-closed? and does something like this

(redex-check L (name e (in-hole E ((λ (x ..._1) e) v ..._1)))
(stays-closed? (term e)))

> for instance, I would imagine useful to check that all well-formed program
> reduces to either a value, or a well-defined error. Ie. something like:
> (check-reduction-relation red
> (λ (t)
>  (redex-match lang ((store (l_1 v_1) ...) v_2)
>               (apply-reduction-relation* red t)))

For this property you're probably better off with redex-check.
check-reduction-relation generates tests from the relation's left-hand
sides. This guarantees that every program it tries takes at least one
step, which probably isn't what you want here, because it guarantees
you won't detect programs that are stuck from the get-go.

> Except that means I have to write rules each and every strange program that
> can be produced.
> Still, is that good practice? or is that over-specification?

If you want to test this property, I see two options.

1. Reduce these strange programs to an explicit state that represents
unspecified behavior. This is what the R6RS formal semantics does.

2. Write your predicate so that it detects strange programs and lets
them off the hook without testing that they produce values.

> * I was surprised that using a property that does not call reduction,
> check-reduction-relation was able to determine some ambiguity in a
> metafunction (same clause gave possibly different results given the same
> term)
> a) how did it get to the metafunction? (my property function was just a
> print function)

My guess is that one of the relation's cases is guarded by an
ambiguous metafunction.

(define-language L
  (n number))

(define R
    (n ...)
    (n ...)
    (side-condition (zero? (term (ambiguous n ...)))))))

(define-metafunction L
  [(ambiguous) 0]
  [(ambiguous n_0 ... n_i n_i+1 ...) n_i])

> (check-reduction-relation R (curry printf "~s\n"))
ambiguous: clause #1 (counting from 0) matched (ambiguous 0 1 3) 3
different ways and returned different results

> b) what kind of testing does it do exactly? (to determine the above, it must
> have run it several times with the same term, at least)

This test doesn't actually have anything to do with what
check-reduction-relation tests. Redex always checks that your
metafunctions are actually (partial) functions.

> (term (ambiguous 0 1 3))
ambiguous: clause #1 (counting from 0) matched (ambiguous 0 1 3) 3
different ways and returned different results

> All in all, I guess I don't understand completely what is tested here
> (assuming the property function does not do anything)
> * At times, I get:
> check-reduction-relation: unable to generate pattern <pattern-here> in <x>
> attempts
> what am I supposed to learn from that? (from a language design point of
> view)

My guess is that you have a side-condition somewhere that Redex is
having trouble satisfying randomly.

(define-language M
  (q (side-condition number_1 (= (term number_1) 12345431))))

> (redex-check M q #t)
redex-check: unable to generate pattern number_1 in 100 attempts

This error message should really include the side-condition part of
the pattern since it's that part, not number_1, that's actually
causing the difficulty. I'll fix that.

I'm looking into some approaches that will make Redex's test case
generation better at satisfying arbitrary side-conditions, but that
will take some work.

For the time being, maybe this message should be a warning instead of
an error. What do you think, Robby?

> * Finally, I really enjoy the coverage report for a reduction relation.
> There does not seem to be a similar functionality for metafunctions, though.
> It would be nice if I could get confidence that my test cases, in addition
> to testing all reduction rules, actually cover each possible path in the
> metafunctions (ie. telling me, for each metafunction, how many times each
> clause has matched).

I'll add a metafunction-coverage parameter that works like relation-coverage.

Let me know if you have more questions, comments, or requests!

Posted on the users mailing list.