[racket] testing student programs

From: Nadeem Abdul Hamid (nadeem at acm.org)
Date: Sat Oct 16 15:00:21 EDT 2010

I don't think it should be that difficult once you get an evaluator
set up. I've done something like what you want, taking some ideas from
the handin server code. What I came up with (no fancy macros) is you
define a test specification for an assignment like this (in a #lang
racket file):

;; this is my implemented solution of an exercise,
;; to test against students':
(define (total-profit x)
 ...blah blah blah....)

(define the-assignment
 (assignment "Homework 2"
             '(htdp intermediate)
              (problem "2a. Movie Theater Profit"
                         (proc total-profit 1)
                         (type (total-profit 5) "a number" ,number?)
                         (test (total-profit 0) -20)
                         (test (total-profit 4) ,(total-profit 4))  ; <-- note unquote
                         (test (total-profit 10) ,(total-profit 10))
   .... more problems ...

then run the program and it loads and tests students' files against
the specification to produce an output text file like this:

ASSIGNMENT: Homework 2
 Language: (htdp intermediate)
 Passed 46 out of 48 tests.

PROBLEM: 2a. Movie Theater Profit
 Passed 8 out of 8 tests.
  PASS: File name matches 'hw2-movie.rkt'?
  PASS: File evaluated without error?
  PASS: File ran without timeout?
  PASS: Is 'total-profit' defined as a function of 1 parameter?
  PASS: Does (total-profit 5) produce a number?
  PASS: Does (total-profit 0) produce an expected result?
  PASS: Does (total-profit 4) produce an expected result?
  PASS: Does (total-profit 10) produce an expected result?

... etc. ...
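
The unquote in the specification above is presumably there so that the
call itself stays as data (to be evaluated later against the student's
file) while the expected value is computed right away by the reference
solution. A minimal sketch of just that quasiquote behavior follows. It
uses the volume-of-solid function from the sample student program quoted
below as the reference solution, since the body of total-profit is not
shown; the names a-test, call-to-send, and expected are made up for
illustration and are not taken from the actual checker code.

```racket
#lang racket

;; Reference solution (same definition as the sample student program).
(define (volume-of-solid length width height)
  (* length width height))

;; Quasiquote keeps the call as data; unquote runs the reference
;; solution immediately and splices in its result.
(define a-test
  `(test (volume-of-solid 3 4 5) ,(volume-of-solid 3 4 5)))

;; a-test is now the list '(test (volume-of-solid 3 4 5) 60)
(define call-to-send (second a-test)) ; '(volume-of-solid 3 4 5)
(define expected (third a-test))      ; 60
```

Without the unquote, the third element would be the unevaluated list
'(volume-of-solid 3 4 5) rather than the number 60.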

From my experience, BSL files are somewhat of a pain to test in this
way, because definitions are processed as syntax, so I came up with a
hack to override the language of BSL files and load them in ISL mode.

I'll attach a few files of mine that hopefully you can adapt for your
needs.
   *(I've actually sent the files separately
     to Todd; if anyone else wants them, I'll be
     happy to send them individually.)
eval2.rkt is the stuff for setting up an evaluator (given
source code as a stream of bytes); checker2.rkt is the stuff for
checking an evaluator against assignment specifications such as the
one above; and csc-auto-check.rkt is the main script for checking
student subdirectories, which also provides a simple GUI for
choosing the assignment spec and the student directory to check.
I've also attached a complete homework assignment specification file.

There are probably some rough edges here and there in this code, and
certainly more could be done to make it more usable, make test
specifications less tedious to write, etc., but it works to some
degree. I tried to get coverage tests working and partly succeeded,
but not completely, so that is disabled in the code now, which
causes some failures in the test suite. The checker does provide
simple timeout functionality (in case student code has an infinite
loop) and handles the case when an input file has syntax errors: all
tests automatically fail, and a flag in the output indicates that
the file did not evaluate without errors.
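
For readers without the attached files, here is a rough sketch of the
general idea using racket/sandbox. It is not the code from eval2.rkt or
checker2.rkt. The helper names load-student and passes? are hypothetical,
the 2-second / 20 MB limits are arbitrary, and the student source is a
stand-in written in racket/base (rather than a teaching language) so that
the example has no extra dependencies.

```racket
#lang racket
(require racket/sandbox)

;; Stand-in for a student file; a real checker would read it from disk.
(define student-source
  (string-append
   "#lang racket/base\n"
   "(define (volume-of-solid length width height)\n"
   "  (* length width height))\n"))

;; Hypothetical helper: returns a sandboxed evaluator, or #f if the
;; file cannot be loaded (syntax error, run-time error, or timeout).
(define (load-student source)
  (with-handlers ([exn:fail? (lambda (e) #f)])
    ;; Assumed limits: 2 seconds and 20 MB per evaluation.
    (parameterize ([sandbox-eval-limits '(2 20)])
      (make-module-evaluator source))))

;; Hypothetical helper: #t only if expr, evaluated inside the sandbox,
;; produces a value equal? to expected. Errors and timeouts (which are
;; raised as exn:fail:resource, a subtype of exn:fail) count as failures.
(define (passes? ev expr expected)
  (and ev
       (with-handlers ([exn:fail? (lambda (e) #f)])
         (equal? (ev expr) expected))))

(define ev (load-student student-source))

;; The score is accumulated out here in full Racket; only the quoted
;; calls are evaluated in the student's sandboxed module.
(define score
  (for/sum ([ok? (list (passes? ev '(volume-of-solid 3 4 5) 60)
                       (passes? ev '(volume-of-solid 10 5 4) 200)
                       ;; an infinite loop: should hit the time limit
                       (passes? ev '(let loop () (loop)) 0))])
    (if ok? 1 0)))
```

With the stand-in source above, score should come out as 2: the two
volume checks pass and the looping expression is cut off by the time
limit. If the file fails to load, ev is #f and every test fails, which
matches the "all tests automatically fail" behavior described above.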



On Sat, Oct 16, 2010 at 2:16 PM, Todd O'Bryan <toddobryan at gmail.com> wrote:
> I know this has come up on the list before, and I've reread those
> threads but am a little confused.
> Here's a sample student program file:
> ------------------------------------------------------------------------------------------
> ; volume-of-solid: number number number -> number
> ; given the length, width, and height of a rectangular prism,
> ;   produces the volume
> (define (volume-of-solid length width height)
>  (* length width height))
> (check-expect (volume-of-solid 2 3 4) 24)
> (check-expect (volume-of-solid 3 5 7) 105)
> --------------------------------------------------------------------------------------------
> I'd like to test this file with something like:
> ---------------------------------------------------------------------------------------------
> #lang racket
> (define score 0)
> (when (= (volume-of-solid 3 4 5) 60)
>  (set! score (add1 score)))
> (when (= (volume-of-solid 10 5 4) 200)
>  (set! score (add1 score)))
> ---------------------------------------------------------------------------------------------
> but I can't figure out how to do it safely.
> It seems like if I use make-module-evaluator, I'm stuck in the context
> of the original student program--that is, Beginning Student Language,
> without the ability to accumulate a score or use constructs that
> aren't defined in BSL. If I provide the student functions in full
> Racket, I don't get the safety of the sandbox, and the student code
> could do something not nice to my test system.
> Obviously, there's a place here for a really nice macro-based testing
> harness that checks for errors in each student function call, lets you
> assign points for each test, etc., but I have to figure out how to get
> the definitions I want to test safely into a context that lets me
> write the code to evaluate them.
> Thanks in advance,
> Todd

Posted on the users mailing list.