[racket-dev] PLAI - Manual management of closures

From: Jay McCarthy (jay.mccarthy at gmail.com)
Date: Wed Jan 4 21:55:09 EST 2012

I've just pushed a new version of the GC language for PLAI that
requires students to manually manage their closures, rather than
treating them as "flat" values.

The code is tested (the entire normal GC test suite is ported) but I
have not yet pushed the documentation (to discourage early use and
confusion from students before we've figured out the best way to
present the issues.)

*** Why ***

In the first language, closures are treated as "flat" values, except
that they aren't really flat so students had to use the
'procedure-roots' function to discover any locations that the closure
held---i.e. its free identifiers. I have always found this really
cheap and I find that students are very confused by when they need to
use this function because it comes out of left field.

In this language, there are a few new functions the student must
implement:

gc:closure : code-ptr (vectorof location) -> location
gc:closure-code-ptr : location -> code-ptr
gc:closure-env-ref : location integer -> location
gc:closure? : location -> boolean

*** Implications ***

Closures represent a totally different kind of data structure for
students. Flats are fixed size with no pointers inside. Conses are
fixed sized with pointers inside. Closures are variable size with
pointers inside. A representation on the heap is obvious to me:

'closure code-ptr n free-var_0 ... free-var_n

but I have no idea how this will affect how students perform on the
assignment.

One benefit is that lots of lame M&S allocation routines simply won't
work and students will have to have something more robust.

*** Issues ***

There is a bit of trouble in the implementation with regards to
primitive functions like add1, sub1. Here are the approaches I thought
to try:

1. Syntactically restrict them to first-order uses.

   (map add1 l) must become (map (lambda (x) (add1 x)) l)

   Pros: No influence on collector/mutator

   Cons: Hacky?, Weird language.

2. Allocate all of them with 'native' code pointers and no free-vars
   at startup

   Pros: Not hacky, each primitive only counts once

   Cons: Every mutator has a very large minimum heap size

3. Allocate them at each use with 'native' code pointers ...

   Pros: A little hacky, Small minimum heap sizes, Pay for what you need

   Cons: Multiple uses count multiple times, Hard to predict minimum
   heap sizes

4. Allocate each a single time but only if they are used

   Pros: Same as 3

   Cons: None of 3, but the implementation is not obvious

Currently I have gone with 3 after doing 1 and finding it
distasteful. Are there options I haven't thought of? What should I
choose? Anyone know how to do 4?

Jay

--
Jay McCarthy <jay.mccarthy at gmail.com>
Assistant Professor / Brigham Young University
http://faculty.cs.byu.edu/~jay

"The glory of God is Intelligence" - D&C 93


Posted on the dev mailing list.