[plt-scheme] Re: Expressivity gap between formal methods and PL's

From: wooks (wookiz at hotmail.com)
Date: Sat May 2 10:22:07 EDT 2009

On May 2, 1:47 pm, Shriram Krishnamurthi <s... at cs.brown.edu> wrote:
> Dear wooks,
> Having gone back and re-read the thread, I can see why Paulo has
> difficulty telling what you had in mind, because it keeps slipping and
> sliding.  You've switched from talking about testing to implementing
> back to testing again.  With such slipperiness of expression, you
> either badly need a spec language or you will have great difficulty
> with one, or both. (-:

My theme has consistently been the optimal exploitation of a formal
specification in a software testing context.

Your slipping and sliding is my contrarian view.......(all in good
humour mind)

> If what you want to do is generate tests from formal specifications,
> that's a simpler problem with much better-developed solutions.

I don't want to generate tests from a formal specification.  I thought
that was pretty clear from my immediately preceding post.
I understand it's seems the perfectly natural and obvious thing to
want to do, but my contrarian viewpoint is that I don't want to do it.

> For a
> concrete example, look at Marinov and Khurshid's "TestEra" (which too
> is based on Alloy), and follow pointers backward and forward.

I had a quick scan of the paper - and have not changed my view.

It looks like it is targeted at unit testing which is not the problem
I'm interested in solving.
Black box testing at the method level as they describe in section 8.6
doesn't cut it - it's just a unit test done without relying on the
structure of the code.

They talk about extending TestEra's analysis to report structural code
coverage ....

<Hops on the code coverage soapbox muttering under his breath that
more researchers should read www.exampler.com/testing-com/writings/coverage.pdf>

rm :: symbol -> [symbol] -> [symbol]
;; Program to remove the symbol x from a list (if present) and return
the remaining list
(define rm
   (lambda (x a-list)
         [(empty? a-list] empty]
         [(eq? x (first a-list)) (rest a-list)]
         [else (rm x (rest a-list))])))

The number of tests required for 100% code coverage is 3 (1 per cond
clause), wholly inadequate to test the program properly.
So what useful knowledge about a module does a 100% code coverage
metric give me?
Suppose my metric was 62% - should I worry. Maybe not. The module may
have 38% dead code but there is no automated way of detecting that.

</Hops of soapbox>

I'm interested in using formal specifications to generate test
oracles. Not for testing methods (thats unit testing and not
interesting) but for testing systems.

Posted on the users mailing list.