[racket] tests/eli-tester feedback (Was: Racket unit testing)

From: Eli Barzilay (eli at barzilay.org)
Date: Sun Feb 13 12:16:59 EST 2011

50 minutes ago, Robby Findler wrote:
> On Sun, Feb 13, 2011 at 9:45 AM, Stefan Schmiedl <s at xss.de> wrote:
> > On Sun, 13 Feb 2011 07:27:35 -0600
> > Robby Findler <robby at eecs.northwestern.edu> wrote:
> >>
> >> And there is also a third unit test framework that Eli wrote that
> >> takes the position that it should be minimal, punting things like
> >> test suites into Racket itself (by using functions, say). I'm not
> >> sure if that last one is included in the documentation.

Not documented, and not distributed by default (since it still lives
as just a file in the `tests' collection).  I do have a bunch of
revisions that I've put into it to improve things considerably, so
once I get that and have a resolution to some of the issues (like the
one below, and also the fact that it's currently "too careful" in that
it catches too many errors) I'll make it into a proper library.


> > I went and looked around a bit. Is this what you're referring to?
> >
> > (require tests/eli-tester)
> >
> > (test
> >  #t
> >  (< 1 2)
> >  (+ 1 2) => 3
> >  (quotient/remainder 10 3) => (values 3 1)
> >  (car '()) =error> "expects argument of type")
> >
> > Very compact and avoids the problem of "what comes first" that I
> > usually have with other frameworks :-)
> 
> Yes, that's the one, although Eli has promised that the first and
> second subexpressions in the above won't be valid syntax in a
> hopefully soon-to-come revision. (The problem being that if you
> forget to put an => in, then you turn one failing test case into two
> passing ones.)

Just to clarify the two problems: the first is if you want to write:

  (test (fib 10) => 55)

but instead you write:

  (test (fib 10) 55)

then you end up with a bogusly successful test suite that only checks
that (fib 10) and 55 are both expressions that evaluate to a non-#f
result.  So as things currently look, either the `=>' is going to be
required, or there's a single expression to test for a non-#f result,
and you'll use a nested `test' expression for those non-#f things.
This still makes things less convenient for using arbitrary
predicates, but not much.  For example, a `fib' test suite that can
currently look like this:

  (test (exact-nonnegative-integer? (fib 10))
        (fib 10) => 55)

would instead be written as:

  (test (test (exact-nonnegative-integer? (fib 10)))
        (fib 10) => 55)

An alternative would be to flip things around: `=>' gets a special
meaning only when it's used as (test E1 => E2), and everything else is
a simple non-#f test, so the above becomes:

  (test (exact-nonnegative-integer? (fib 10))
        (test (fib 10) => 55))

and the last `test' expression is similar to just an `equal?' except
that it produces a more readable error message.
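
A rough sketch of the stricter form, written as a plain `syntax-rules'
macro.  The name `strict-test' and the `fib' definition are made up
for illustration; this is not the code in tests/eli-tester, only one
way the "`=>' required, or a single expression" rule could look:

```racket
#lang racket/base

;; Hypothetical stand-in for the function under test.
(define (fib n)
  (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2)))))

;; Hypothetical macro: accepts either `E1 => E2' or a single expression.
;; Anything else, such as (strict-test (fib 10) 55), matches no pattern
;; and is rejected at compile time instead of silently passing.
(define-syntax strict-test
  (syntax-rules (=>)
    [(_ actual => expected) (equal? actual expected)]
    [(_ expr)               (and expr #t)]))

(strict-test (fib 10) => 55)                        ; comparison test
(strict-test (exact-nonnegative-integer? (fib 10))) ; predicate test
```

With this shape, forgetting the arrow becomes a "bad syntax" error
rather than two passing tests.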

The second issue is that currently it catches almost *all* errors,
including syntax errors.  A good example of a bogus result that Robby
got once is:

  (test (+ x 1) => x)

where `x' is unbound -- it will catch the syntax error on both sides,
and since the error messages match, it will conclude that the test is
successful.  The solution here is to avoid making it catch errors
unless you use an explicit `=error>' arrow, and even then to catch
only runtime errors and add another arrow for syntax errors (perhaps
`=syntax=error>').
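
To see how a catch-everything comparison goes wrong, here is a small
sketch.  The helpers `value-or-message', `naive-same?' and `broken-x'
are invented for this example and are not taken from the library; a
runtime error stands in for the unbound `x', since the mechanism is
the same:

```racket
#lang racket/base

;; Hypothetical helper: run a thunk, returning its value, or the error
;; message if it raises.  This mimics a tester that is "too careful".
(define (value-or-message thunk)
  (with-handlers ([exn:fail? exn-message])
    (thunk)))

;; Naive comparison: both sides go through the same catch-all.
(define (naive-same? actual-thunk expected-thunk)
  (equal? (value-or-message actual-thunk)
          (value-or-message expected-thunk)))

;; Stand-in for an unbound `x': every use fails with the same message.
(define (broken-x) (error 'x "undefined"))

;; Both sides fail identically, so the comparison reports success.
(naive-same? (lambda () (+ (broken-x) 1))
             (lambda () (broken-x)))
```

Catching errors only on the left side of an explicit `=error>' arrow
would avoid this, because an error on the expected side would then
propagate instead of being compared.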

Finally, there's another question that is still unclear.  I want to
make it easily extensible, and I'm not sure how to do that.  Two
things that this should support are:

a. Add new arrow types.  For example, add a new `=output>' arrow that
   verifies the output of the tested expression.

b. Add some way to wrap all evaluations, so, for example, you can
   create a sandbox and have all evaluations happen inside it, so you
   can do things like:
     (test (some-loop) =error> "out of memory")
   which are only possible if the expressions are running inside a
   sandbox.
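
As an illustration of item (a), the check behind an `=output>' arrow
could be as small as the following.  The function name
`output-matches?' is hypothetical, and how it would be hooked up to an
arrow is left open here:

```racket
#lang racket/base
(require racket/port) ; for with-output-to-string

;; Hypothetical check behind an `=output>' arrow: run the thunk with
;; the current output port redirected to a string, then compare the
;; captured text with the expected string.
(define (output-matches? thunk expected)
  (equal? (with-output-to-string thunk) expected))

(output-matches? (lambda () (display "hello") (newline)) "hello\n")
```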

The revisions that I have do the first of these -- you can define new
kinds of arrows as macros (and it's still a tiny library), but I'm not
sure about the second.  Just having a wrapper is easy (and perhaps
I'll do just that), but when you deal with these things it might be
desirable to sometimes jump out of the sandbox and check things
outside, and that might be inconvenient with such a simple wrapping
facility.
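
One possible shape for such a wrapper is a parameter that holds a
function applied to every test thunk.  Everything named here
(`current-test-wrapper', `run-test', `counting-wrapper') is an
assumption for the sake of the sketch, and the counting wrapper is
only a stand-in for a real sandbox:

```racket
#lang racket/base

;; Hypothetical parameter: a function that receives each test thunk
;; and decides how to run it.  The default just calls the thunk.
(define current-test-wrapper
  (make-parameter (lambda (thunk) (thunk))))

;; Every evaluation in the tester would go through this function.
(define (run-test thunk)
  ((current-test-wrapper) thunk))

;; Stand-in for a sandbox: counts evaluations, then runs the thunk.
(define wrapped-count 0)
(define (counting-wrapper thunk)
  (set! wrapped-count (add1 wrapped-count))
  (thunk))

(parameterize ([current-test-wrapper counting-wrapper])
  (run-test (lambda () (+ 1 2)))            ; runs "inside" the wrapper
  ;; Jumping back out means restoring a plain wrapper for a while.
  (parameterize ([current-test-wrapper (lambda (thunk) (thunk))])
    (run-test (lambda () (* 2 3)))))        ; runs "outside" again
```

The nested `parameterize' shows the inconvenience: leaving the wrapper
for a single check needs explicit bookkeeping at the use site.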

In any case, any feedback on those questions will be good -- feel free
to mail me directly with how you'd prefer this to look.

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!



Posted on the users mailing list.