[racket] Typed Racket reactions on developing polymorphic typed array library, and questions
Sam Tobin-Hochstadt wrote:
> On Mon, Aug 23, 2010 at 12:31 AM, Neil Toronto <neil.toronto at gmail.com> wrote:
>
>> This email is feedback on using Typed Racket to develop a new library (i.e.
>> it isn't originally untyped code) that's about 3500 lines long, and some
>> questions.
>
> Thanks for the feedback!
You're very welcome. BTW, it's actually only 1100 lines of code. There
was a mredit recovery file (I think that's what they are) in the
directory when I counted earlier, messing up the total.
>> (foralln/flvectorn:
>> ([x : Float (<-flvectorn xs)]
>> [y : Float (<-flvectorn ys)])
>> (for/fold: : Float ([dm : Float 0.0]) ([pt : vec2 (in-list pts)])
>> (+ dm (delta-mass-from pt (vec2 x y))))))
>>
>> Here, "flvectorn" is an N-dimensional flvector, and "<-flvectorn" is special
>> syntax like "in-list" (i.e. forall recognizes it as creating an array backed
>> by an flvectorn and inlines unsafe-flvectorn-ref accordingly).
>
> Is this the same as the (brand-new) `for/flvector' and `in-flvector'?
It works similarly, but it's not the same. Those use the sequence
abstraction; this uses a multidimensional random-access abstraction
called "arrayn". Also, "forall" and "foralln" semantics are meant to be
concurrent, so in principle they can always parallelize the work.
Further, arrays are read/write when their backing models are read/write
(e.g. vectors).
At first I tried to use the sequence abstraction. But Sequenceof is a
polymorphic type, so I couldn't get require/typed to import in-vector et
al. I don't want to use sequences now, though. Arrays should be
read/write. Destructive update can be useful in interactive programs
like games. Avoiding allocation reduces GC collects, which reduces
stutters and hitches.
> In general, `define:' and `let:' require return types since they
> define (potentially) recursive functions, and therefore a type needs
> to be given to the function, which wouldn't be known without
> typechecking the body, which could reference the function. This is
> also why `for/list:' needs a return type annotation - it's expanding
> into a `let loop' underneath.
Ahhh, okay. My loops destructively update a newly allocated vector (or
other array model) and iterate over one or more Nonnegative-Fixnum
indexes. The loops they expand to always have type Nonnegative-Fixnum or
similar.
It seems the macro system's ignorance of the type system is making
things difficult. I'm guessing this kind of thing will come up
frequently in TR programs, and in just a few different ways. It might be
helpful to catalog them.
>> - Polymorphic structs being proper subtypes of their parent structs would
>> be really nice. As it is, I can't figure out how to cleanly have a
>> "(vector-array T)" subtype of "(array T)" that gives private array functions
>> access to the vector backing an array. If I had that, I could collapse
>> compositions of certain array ops at runtime and remove some duplicate
>> bounds checks.
>
> I'm not sure exactly what you mean here. Can you give an example?
Yep. Here's the array type:
(define-type Index Nonnegative-Fixnum)
(define-struct: (T) array ([length : Index]
[unsafe-ref : (Index -> T)]
[unsafe-set! : (Index T -> Void)]))
(I'm not using struct: because I'm still on a build where it has
problems with struct-out.) Here's an operation that could be sped up for
vector-backed arrays. It's a lazy version of map for arrays:
(: array-map (All (A B) ((A -> B) (array A) -> (array B))))
(define (array-map f arr)
(define unsafe-ref (array-unsafe-ref arr))
(array (array-length arr)
(lambda: ([i : Index]) (f (unsafe-ref i)))
(lambda: ([i : Index] [v : B]) (error ...))))
The opportunity is in (f (unsafe-ref i)). When arr is backed by a
vector, unsafe-ref is a closure (over the vector) that just does an
unsafe-vector*-ref. If array-map knew that, it could return an array
with an unsafe-ref that does an unsafe-vector*-ref itself instead of
deferring to arr's unsafe-ref. But this:
(define-struct: (T) (vector-array array) ([vec : (Vectorof T)]))
has a predicate with type (Any -> Boolean : (vector-array Any)), so this
attempt at a faster unsafe-ref won't typecheck:
(if (vector-array? arr)
(let ([vec (vector-array-vec arr)])
(lambda: ([i : Index]) (f (unsafe-vector*-ref vec i))))
(let ([unsafe-ref (array-unsafe-ref arr)])
(lambda: ([i : Index]) (f (unsafe-ref i)))))
In the true branch, vec has type (Vectorof Any), so the lambda has the
wrong type.
There are examples involving bounds checks and other things more
expensive than just extra applications, but they're more complicated. If
there's an efficient workaround for now, I'd love to know what it is.
> I'll add type annotations for `future' and `touch'. (lambda () 1)
> should be the same in TR as in regular Racket, so I'm surprised the
> JIT sees them differently.
So was I. Thanks!
>> - Pie-in-the-sky: notice that "<-flvectorn" above is really just a type
>> annotation that the forall macro understands. It would be nice if
>> compositional macros could get the types of their arguments - I could have
>> forall map over any array-like thing without requiring a declaration of what
>> it is.
>>
>> *snip*
>
> For the particular `<-flvector' case, right now the optimizer makes
> this:
>
> (for ([i some-vector]) ...)
>
> just as fast as this:
>
> (for ([i (in-vector some-vector)]) ...)
>
> so perhaps the same can be done for `in-flvector' and other such operations.
Neet. I'll look at the latest optimizer and see what I can learn from it.
Thanks!
Neil T