[racket] Typed Racket reactions on developing polymorphic typed array library, and questions

From: Neil Toronto (neil.toronto at gmail.com)
Date: Mon Aug 23 18:12:52 EDT 2010

Sam Tobin-Hochstadt wrote:
> On Mon, Aug 23, 2010 at 12:31 AM, Neil Toronto <neil.toronto at gmail.com> wrote:
> 
>> This email is feedback on using Typed Racket to develop a new library (i.e.
>> it isn't originally untyped code) that's about 3500 lines long, and some
>> questions.
> 
> Thanks for the feedback!

You're very welcome. BTW, it's actually only 1100 lines of code. There 
was a mredit recovery file (I think that's what they are) in the 
directory when I counted earlier, messing up the total.

>> (foralln/flvectorn:
>>  ([x : Float  (<-flvectorn xs)]
>>  [y : Float  (<-flvectorn ys)])
>>  (for/fold: : Float ([dm : Float  0.0]) ([pt : vec2  (in-list pts)])
>>            (+ dm (delta-mass-from pt (vec2 x y))))))
>>
>> Here, "flvectorn" is an N-dimensional flvector, and "<-flvectorn" is special
>> syntax like "in-list" (i.e. forall recognizes it as creating an array backed
>> by an flvectorn and inlines unsafe-flvectorn-ref accordingly).
> 
> Is this the same as the (brand-new) `for/flvector' and `in-flvector'?

It works similarly, but it's not the same. Those use the sequence 
abstraction; this uses a multidimensional random-access abstraction 
called "arrayn". Also, "forall" and "foralln" semantics are meant to be 
concurrent, so in principle they can always parallelize the work. 
Further, arrays are read/write when their backing models are read/write 
(e.g. vectors).

At first I tried to use the sequence abstraction. But Sequenceof is a 
polymorphic type, so I couldn't get require/typed to import in-vector et 
al. I don't want to use sequences now, though. Arrays should be 
read/write. Destructive update can be useful in interactive programs 
like games. Avoiding allocation reduces GC collects, which reduces 
stutters and hitches.

> In general, `define:' and `let:' require return types since they
> define (potentially) recursive functions, and therefore a type needs
> to be given to the function, which wouldn't be known without
> typechecking the body, which could reference the function.  This is
> also why `for/list:' needs a return type annotation - it's expanding
> into a `let loop' underneath.

Ahhh, okay. My loops destructively update a newly allocated vector (or 
other array model) and iterate over one or more Nonnegative-Fixnum 
indexes. The loops they expand to always have type Nonnegative-Fixnum or 
similar.

It seems the macro system's ignorance of the type system is making 
things difficult. I'm guessing this kind of thing will come up 
frequently in TR programs, and in just a few different ways. It might be 
helpful to catalog them.

>>  - Polymorphic structs being proper subtypes of their parent structs would
>> be really nice. As it is, I can't figure out how to cleanly have a
>> "(vector-array T)" subtype of "(array T)" that gives private array functions
>> access to the vector backing an array. If I had that, I could collapse
>> compositions of certain array ops at runtime and remove some duplicate
>> bounds checks.
> 
> I'm not sure exactly what you mean here.  Can you give an example?

Yep. Here's the array type:

     (define-type Index Nonnegative-Fixnum)
     (define-struct: (T) array ([length : Index]
                                [unsafe-ref : (Index -> T)]
                                [unsafe-set! : (Index T -> Void)]))

(I'm not using struct: because I'm still on a build where it has 
problems with struct-out.) Here's an operation that could be sped up for 
vector-backed arrays. It's a lazy version of map for arrays:

     (: array-map (All (A B) ((A -> B) (array A) -> (array B))))
     (define (array-map f arr)
       (define unsafe-ref (array-unsafe-ref arr))
       (array (array-length arr)
              (lambda: ([i : Index]) (f (unsafe-ref i)))
              (lambda: ([i : Index] [v : B]) (error ...))))

The opportunity is in (f (unsafe-ref i)). When arr is backed by a 
vector, unsafe-ref is a closure (over the vector) that just does an 
unsafe-vector*-ref. If array-map knew that, it could return an array 
with an unsafe-ref that does an unsafe-vector*-ref itself instead of 
deferring to arr's unsafe-ref. But this:

     (define-struct: (T) (vector-array array) ([vec : (Vectorof T)]))

has a predicate with type (Any -> Boolean : (vector-array Any)), so this 
attempt at a faster unsafe-ref won't typecheck:

     (if (vector-array? arr)
         (let ([vec  (vector-array-vec arr)])
           (lambda: ([i : Index]) (f (unsafe-vector*-ref vec i))))
         (let ([unsafe-ref  (array-unsafe-ref arr)])
           (lambda: ([i : Index]) (f (unsafe-ref i)))))

In the true branch, vec has type (Vectorof Any), so the lambda has the 
wrong type.

There are examples involving bounds checks and other things more 
expensive than just extra applications, but they're more complicated. If 
there's an efficient workaround for now, I'd love to know what it is.

> I'll add type annotations for `future' and `touch'.   (lambda () 1)
> should be the same in TR as in regular Racket, so I'm surprised the
> JIT sees them differently.

So was I. Thanks!

>>  - Pie-in-the-sky: notice that "<-flvectorn" above is really just a type
>> annotation that the forall macro understands. It would be nice if
>> compositional macros could get the types of their arguments - I could have
>> forall map over any array-like thing without requiring a declaration of what
>> it is.
>>
>> *snip*
>
> For the particular `<-flvector' case, right now the optimizer makes
> this:
> 
> (for ([i some-vector]) ...)
> 
> just as fast as this:
> 
> (for ([i (in-vector some-vector)]) ...)
> 
> so perhaps the same can be done for `in-flvector' and other such operations.

Neet. I'll look at the latest optimizer and see what I can learn from it.

Thanks!
Neil T



Posted on the users mailing list.