<div dir="ltr">By "selection" I mean applying a selector a struct value. Sorry, a bit too terse, eh....<div> </div><div>As for the result, that looks like some great things for us to put into benchmarks (or maybe the math library itself should be what we should look at).</div> <div> </div><div>I think that the reason the profiler isn't helping you is because the time you care about is being spent in the implementation of the primitives and that's visible only at the C level. If you can run gprof (or Apple's tools), you could probably see more of what's going on. Overall, I feel like some of what you're asking has to do with what TR is doing and some with what's going on inside the procedure chaperones and so I'm not sure the contract library itself is a place where fixes can happen. (But I didn't try to look into what exactly gets put into the chaperones by the contract library to be sure of that -- maybe there is something dumb happening there.) Robby On Fri, Dec 28, 2012 at 12:03 PM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com">neil.toronto@gmail.com</a>> wrote: > > On 12/27/2012 06:21 PM, Robby Findler wrote: >> >> One other place that we realized pretty quickly where we were wrong >> about the boundaries and inefficiencies is struct declarations. It isn't >> uncommon to put contracts on structs and put them in their own module >> (scribble does this). Currently the contract system can't tell that a >> struct contract is actually on a struct and thus avoid checking when >> doing a selection, but maybe that's somewhere that could help. > > > What's a selection? Is it something I could understand by reading a certain paper? > > Because using arrays for examples is pretty abstract and carries a lot of baggage, I've made some independent examples that demonstrate the same inefficiencies. > > The first uses Typed Racket to generate contracts. The type ((Vectorof Index) -> A) corresponds with the type (Array A), and is in fact the type of an array's procedure in `math/array'. > > In the following code, `make-array' makes an unbounded-ish, constant "array". The `array-touch' function retrieves every element from 0 to n-1, and throws them away. > > > #lang racket > > (module typed-defs typed/racket/base > (require racket/fixnum racket/unsafe/ops) > (provide make-array array-touch) > > (: make-array (All (A) (A -> ((Vectorof Index) -> A)))) > (define (make-array e) (lambda (js) e)) > > (: array-touch (All (A) (((Vectorof Index) -> A) Index -> Void))) > (define (array-touch arr-proc n) > (define: js : (Vectorof Index) (vector 0)) > (let: loop : Void ([i : Nonnegative-Fixnum 0]) > (when (i . fx< . n) ; proves i : Index > (unsafe-vector-set! js 0 i) > (arr-proc js) > (loop (fx+ i 1))))) > > ;; Takes 20ms to call `arr-proc' 2 million times > (define arr-proc (make-array 10)) > (time (for ([_ (in-range 200)]) > (array-touch arr-proc 10000)))) > > (require 'typed-defs) > > ;; Takes 3300ms > (define arr-proc (make-array 10)) > (time (for ([_ (in-range 200)]) > (array-touch arr-proc 10000))) > > > Summing up the numbers: > * inside loop: 20ms > * outside loop: 3300ms > > These numbers correspond with what I get using arrays in untyped code, so this example probably isolates the main performance problems. > > Notice that `array-touch' mutates the indexes vector it passes to `arr-proc'. The array library also does this in its nested loops over elements to reduce allocation to one object (the mutated indexes vector). Still, it's worth checking out what would happen if immutable vectors were passed instead. If I'm reading "racket/contract/private/vector.rkt" correctly, this could be faster because immutable vectors are checked immediately instead of getting wrapped in a chaperone. > > Changing (arr-proc js) to (arr-proc (vector->immutable-vector js)): > * inside loop: 100ms > * outside loop: 2950ms > > The "inside loop" timing is what I expected from my earlier experiments that motivated using mutable indexes. The "outside loop" timing suggests the chaperone isn't the biggest performance problem. (Of course, this example doesn't even look at the chaperoned value.) > > Changing (Vectorof Index) to (Vectorof Any): no change > > I think this is because the chaperone checks the contents of `js' only when `js' is indexed - which never happens in this example. > > Changing (Vectorof Index) to Any: > * inside loop: 20ms (no change) > * outside loop: 2200ms > > This suggests that putting any kind of contract on indexes is a big performance hit, but not the biggest. > > There's not a lot left to investigate here without diving into TR's internals, so here's an untyped version that uses contracts: > > > #lang racket > > (module untyped-defs racket/base > (require racket/fixnum racket/unsafe/ops > racket/contract (only-in typed/racket/base index?)) > (provide > (contract-out > [make-array (-> any/c (-> (vectorof index?) any/c))] > [array-touch (-> (-> (vectorof index?) any/c) index? void?)])) > > (define (make-array e) (lambda (js) e)) > > (define (array-touch arr-proc n) > (define js (vector 0)) > (let loop ([i 0]) > (when (i . unsafe-fx< . n) > (unsafe-vector-set! js 0 i) > (arr-proc js) > (loop (unsafe-fx+ i 1))))) > > (define arr-proc (make-array 10)) > (time (for ([_ (in-range 200)]) > (array-touch arr-proc 10000)))) > > (require 'untyped-defs) > > (define arr-proc (make-array 10)) > (time (for ([_ (in-range 200)]) > (array-touch arr-proc 10000))) > > > The timings for the loops in this version are identical to those from the typed version: 20ms inside vs. 3300ms outside. But this is surprising: > > Changing (vectorof index?) to any/c: > * inside loop: 20ms > * outside loop: 1700ms > > Doing the same thing in the typed version reduced the "outside loop" time to 2200ms, not 1700ms. It must not be the same thing after all. What's going on? > > Changing the "array" contract to (-> (vectorof index?) any): > * inside loop: 20ms > * outside loop: 2340ms > > This suggests that checking for multiple values is pretty slow. > > Changing the "array" contract to (-> any/c any): > * inside loop: 20ms > * outside loop: 20ms > > As expected; the procedure can be checked without wrapping. > > FWIW, I would have used the profiler to do this, but it keeps being wrong about where time is spent. It's as if the contract wrappers don't exist. Is there a way to change that? > > Neil ⊥ > </div></div>