[racket-dev] [racket] tests/eli-tester feedback (Was: Racket unit testing)

From: Eli Barzilay (eli at barzilay.org)
Date: Fri Feb 18 16:12:42 EST 2011

25 minutes ago, Ryan Culpepper wrote:
> On 02/18/2011 07:30 AM, Eli Barzilay wrote:
> > 50 minutes ago, Ryan Culpepper wrote:
> >> On 02/15/2011 07:28 AM, Eli Barzilay wrote:
> >>> And finally, there's the litmus test for existing code:
> >>>
> >>> * Ryan: is something like this enough to implement the GUI layer?
> >>
> >> Not well, I think. The Test-Result type in Noel's racktest code is
> >> too simple and inflexible. It represents the minimal essence of
> >> testing, but it would be awkward to extend to richer testing
> >> systems. Here's my counter-proposal for representing the results of
> >> tests:
> >> [...]
> >
> > I can't make sense of it, besides a vague "waaaay too heavy" feeling
> > for something that should be core-ishly minimalistic.
> Simplicity is no good if it gets in the way of representing information 
> that needs to be represented.

[But the flip side is that complexity is no good if you end up with
something that doesn't fit any system, where each one is filling in
fields that it doesn't "want to".]

> > In an attempt to follow it, I did this:
> >
> >    TestResult = header
> >                 execution
> >                 status
> >
> > but your TestHeader is used only there,
> Not necessarily. A testing framework that distinguishes test
> construction from test creation might create the header when the
> test is constructed. SchemeUnit used to work that way, and RackUnit
> is able to, although less gracefully than before.

I don't follow this -- what's the difference between "construction"
and "creation"?

> (See also my final remark, about "test started" notifications.)

Yes, I know that this might imply some division for a sub-struct, I'm
focusing on just the kind of information that is required.

> > so it could be folded in:
> >
> >    TestResult = name      (U String #f)
> >                 suite     (Listof String)
> >                 info      Dictionary
> >                 execution
> >                 status
> >
> > TestExecution is also used only once so it can also be folded in --
> > but since it's just a generic dictionary, it can be dropped.
> I think it's a bad idea to collapse the two dictionaries, because
> they represent different information. Especially since the set of
> keys is open-ended, it is helpful to separate information about the
> test from information about its execution.

(Same here -- I did the collapse to synthesize what it is that you're
actually requiring, so I treated all dictionaries as "other stuff",
which makes them trivially collapsible...)

> > * What happens when there's no specific expected value to compare?
> >   For example, run two pieces of code 10 times each and check
> >   that the average runtime of the first is below the runtime of
> >   the second.  This could be phrased in terms of an expected
> >   value, but in a superficial way, and will prevent useful
> >   information from being expressed (since the information would
> >   have to be reduced to two numbers).
> You can include whatever information you want. That's why it's a
> dictionary, rather than a fixed set of fields. The real question is
> how a test result displayer will know how to interpret the fields
> correctly.  I think a useful default is to show all attributes with
> keys that are interned symbols or strings. Custom attributes would
> only work for test result displayers that know about them.

The question is if some attributes are known enough to get a special
treatment, and then the whole dictionary thing becomes a burden of
html-like specification rather than an "everything works" advantage.
What I'd like to see is something along the lines of:

    String x String dictionary of field-name and field contents
    or a single string for the result

This avoids such mess as specifying when I use a string for the
printed form of some value (as you suggested in "Then convert it to a
string and keep the string") vs when it's a proper value.  It also
avoids making semi-formal fields that become de facto requirements.
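
To make the shape above concrete, here is a minimal sketch of it.  All
names (`result-detail?`, `detail->lines`) are made up for illustration
and are not part of any existing library, and the dictionary is
assumed to be a plain association list of string pairs:

```racket
#lang racket/base
;; Hypothetical sketch, not an existing API.
;; A result detail is either a single string, or an association list
;; mapping field-name strings to field-content strings.
(define (result-detail? v)
  (or (string? v)
      (and (list? v)
           (for/and ([p (in-list v)])
             (and (pair? p) (string? (car p)) (string? (cdr p)))))))

;; A displayer needs no per-field knowledge: every field is shown
;; verbatim as "name: contents", and a bare string is shown as-is.
(define (detail->lines v)
  (if (string? v)
      (list v)
      (for/list ([p (in-list v)])
        (string-append (car p) ": " (cdr p)))))
```

With this, `(detail->lines '(("expected" . "3") ("actual" . "4")))`
produces `'("expected: 3" "actual: 4")`, and a symbol or number as
field contents is rejected by `result-detail?` rather than being
interpreted in some displayer-specific way.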

> > * This solidifies the list-of-strings as a representation of the
> >   test hierarchy.  But perhaps there is no way to avoid this -- if
> >   it's made into a proper hierarchy of objects it will probably
> >   complicate things in a way that requires the listener to get
> >   "update" events that tells it how the structure changed.
> I was actually going to propose something more complicated for the
> hierarchy, but I figured it was better to leave that for later. I'm
> certainly open to changing this part.

The dynamic aspect makes it look fine as is, I think.  It just seems
redundant to start describing tests accurately to have sections that
have the same name but are really separate.

> > * I'm not sure about the error result.  It seems to me that this is a
> >    meta issue that you're dealing with when you develop the test suite,
> >    and as such it should be something that you'd deal with in the usual
> >    ways =>  throw an exception.  It's the tools that should be in charge
> >    of catching such an exception and deal with it -- which means that
> >    - in my tester's case, it'll defer to racket as usual, meaning that
> >      you'd just get an error.
> >    - in rackunit's case you'd probably get some report listing the
> >      erroneous tests, instead of propagating the error.
> >    - and in your gui case you'd catch exceptions and show them as error
> >      results.
> Are you saying you think a status should only be success or failure?
> If so, I disagree. I can see roughly how that would work, but I
> think it's useful to distinguish between failure and error at the
> reporting level.

It is -- but the question is whether *that* kind of reporting belongs
in the core specification of these values or not.  Making it be there
seems wrong to me in the same way that exceptions are never really
used for anything other than throwing them.  (Except perhaps a few
weird cases that I'm sure will lead to flames, say add "almost"s or

> >> And that's not quite the end of it. The rackunit gui creates an
> >> entry for a test case as soon as it starts running, so the user
> >> can see what test case is hanging and interrupt it if they
> >> choose. That requires additional communication between test
> >> execution and test display.
> >
> > Yes, that would be part of the protocol for the listener -- and it
> > makes sense to allow tests to invoke it to let it know that a test
> > has started.
> Like maybe sending it just the test-header struct? The part that
> represents the information known about the test before it executes,
> packaged up as one value?
> Although, if we're going to standardize this part it would also be
> nice to have a way of indicating that a suite has started, too.

Yeah -- and that's something that I liked in Noel's list of strings,
it means that you treat test suites in the same way as tests, which
IMO means that it will lead to nice uniformities in other places (like
a gui interface).
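
A rough sketch of that uniformity, assuming a listener that only ever
sees a list-of-strings path plus a 'started or 'finished tag.  The
names `notify!`, `run-at` and `events` are hypothetical, and a real
listener would be a callback rather than a global list:

```racket
#lang racket/base
;; Hypothetical listener sketch; all names here are illustrative.
;; A path is a list of strings: enclosing suite names, outermost
;; first, ending in the name of the test or suite itself.
(define events '()) ; most recent event first

(define (notify! kind path [detail #f])
  (set! events (cons (list kind path detail) events)))

;; Run thunk under the given path, reporting 'started before it runs
;; and 'finished (with its result) afterwards.  Suites and individual
;; tests go through exactly the same call.
(define (run-at path thunk)
  (notify! 'started path)
  (define result (thunk))
  (notify! 'finished path result)
  result)

;; A suite "arith" containing one test "addition".
(run-at '("arith")
        (lambda ()
          (run-at '("arith" "addition")
                  (lambda () (if (= (+ 1 2) 3) "ok" "failed")))))
```

The listener gets a 'started event for ("arith") and then for
("arith" "addition") before either finishes, so a gui could show the
entry that is currently running (or hanging) without a separate
notion of "suite started".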

          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

Posted on the dev mailing list.