[plt-scheme] redex: leveraging generated checks and coverage
Thanks a lot Casey, this is all very useful. Some more questions/
comments:
>> * What kind of interesting properties can be checked with
>> check-reduction-relation?
>
> check-reduction-relation is most useful for testing that reduction
> preserves some property. For example, you might use it to test the
> following: if an expression e has no free variables, and e reduces to
> an expression e', then e' has no free variables.
>
> check-reduction-relation often works better than redex-check when some
> of your relation's cases are unlikely to apply to a random term. For
> example, (redex-check L e (stays-closed? (term e))) probably won't
> generate many tests that exercise a call-by-value beta rule, because
> matching its left-hand side E[((λ (x ..._1) e) v ..._1)] requires an
> expression in which
>
> 1. there is an application in the evaluation context's hole,
> 2. the application's function position is a lambda,
> 3. the application supplies the correct number of arguments, and
> 4. all the arguments are already values.
>
> (check-reduction-relation R stays-closed?) will get better coverage
> because it extracts the beta rule's left-hand side from the definition
> of stays-closed? and does something like this
>
> (redex-check L (name e (in-hole E ((λ (x ..._1) e) v ..._1)))
[not sure what that 'name' here is]
> (stays-closed? (term e)))
ok, so is it right then that stays-closed? must internally call apply-
reduction-relation,
or does check-reduction-relation work as follows:
1. generate a term (matching the left handside of a rule)
2. take a reduction step
3. apply the property test
?
>> for instance, I would imagine useful to check that all well-formed
>> program
>> reduces to either a value, or a well-defined error. Ie. something
>> like:
>>
>> (check-reduction-relation red
>> (λ (t)
>> (redex-match lang ((store (l_1 v_1) ...) v_2)
>> (apply-reduction-relation* red t)))
>>
>
> For this property you're probably better off with redex-check.
> check-reduction-relation generates tests from the relation's left-hand
> sides. This guarantees that every program it tries takes at least one
> step, which probably isn't what you want here, because it guarantees
> you won't detect programs that are stuck from the get-go.
>
>> Except that means I have to write rules each and every strange
>> program that
>> can be produced.
>> Still, is that good practice? or is that over-specification?
>
> If you want to test this property, I see two options.
>
> 1. Reduce these strange programs to an explicit state that represents
> unspecified behavior. This is what the R6RS formal semantics does.
>
> 2. Write your predicate so that it detects strange programs and lets
> them off the hook without testing that they produce values.
That's exactly what I did in the meantime, and it seems to work.
I have a reduction relation that reduces mal-formed terms (eg. term
where the locations in the store are not unique) to bad terms, and I
have a bad-term predicate.
Then I check as follows:
;; deterministic
(check-reduction-relation
red
(λ (t) (let ((res (apply-reduction-relation* red t)))
(or (eq? 1 (length res))
(bad-term res))))
(I do have to limit the number of attempts otherwise I blow memory...)
>> * I was surprised that using a property that does not call reduction,
>> check-reduction-relation was able to determine some ambiguity in a
>> metafunction (same clause gave possibly different results given the
>> same
>> term)
>> a) how did it get to the metafunction? (my property function was
>> just a
>> print function)
>
> My guess is that one of the relation's cases is guarded by an
> ambiguous metafunction.
>
> [...]
>> b) what kind of testing does it do exactly? (to determine the
>> above, it must
>> have run it several times with the same term, at least)
>
> This test doesn't actually have anything to do with what
> check-reduction-relation tests. Redex always checks that your
> metafunctions are actually (partial) functions.
Well, prior to playing with check-reduction-relation, I built my test
suites and so on, and redex never complained about these ambiguities.
So what do you mean by "Redex always checks..."? when is that check
actually done?
>> * At times, I get:
>> check-reduction-relation: unable to generate pattern <pattern-here>
>> in <x>
>> attempts
>> what am I supposed to learn from that? (from a language design
>> point of
>> view)
>
> My guess is that you have a side-condition somewhere that Redex is
> having trouble satisfying randomly.
> [..]
right, that may definitely be the case
> (define-language M
> (q (side-condition number_1 (= (term number_1) 12345431))))
>
>> (redex-check M q #t)
> redex-check: unable to generate pattern number_1 in 100 attempts
>
> This error message should really include the side-condition part of
> the pattern since it's that part, not number_1, that's actually
> causing the difficulty. I'll fix that.
That'd be better indeed.
> I'm looking into some approaches that will make Redex's test case
> generation better at satisfying arbitrary side-conditions, but that
> will take some work.
>
> For the time being, maybe this message should be a warning instead of
> an error. What do you think, Robby?
maybe useful to have a silent/warning mode.
>> * Finally, I really enjoy the coverage report for a reduction
>> relation.
>> There does not seem to be a similar functionality for
>> metafunctions, though.
>> It would be nice if I could get confidence that my test cases, in
>> addition
>> to testing all reduction rules, actually cover each possible path
>> in the
>> metafunctions (ie. telling me, for each metafunction, how many
>> times each
>> clause has matched).
>
> I'll add a metafunction-coverage parameter that works like relation-
> coverage.
Cool! I look forward to that!
Cheers,
-- Éric