[racket] Performance of delay/force vs lambda/apply
Two days ago, Adolfo Pérez Álvarez wrote:
> Hi everyone,
>
> While playing around with delay/force I got some unexpected results.
> It seems that the performance of delay/force is far worse than that
> of an equivalent construction based on lambda/apply.
>
> For example, this implementation of the tak function:
> [...]
> takes around 6 seconds to calculate (g 242501) on my machine:
> > (time (g 242501))
> cpu time: 5963 real time: 6005 gc time: 3980
> 242502
The numbers that I'm getting for this (each is the median of 7 runs)
are:
delay: cpu time: 3850 real time: 3874 gc time: 1997
lazy: cpu time: 2240 real time: 2253 gc time: 1363
delay/name: cpu time: 790 real time: 794 gc time: 470
naive: cpu time: 945 real time: 951 gc time: 477
thunks: cpu time: 392 real time: 393 gc time: 155
Note that the overall speedup for thunks is much bigger -- that's
probably due to better inlining on the recent git head.
The thing is that the default `delay' deals with a bunch of stuff that
makes things slower compared to a naive implementation. One of them
is dealing with multiple values -- `lazy' doesn't do that which makes
it a bit faster. This is a substantial performance hit, but the
biggest one which both `delay' and `lazy' suffer from is catching
errors and caching them as a result. For example, try forcing this a
few times:
(define foo (let ([c 0]) (delay (begin (set! c (+ c 1))
(/ 1 (- c 1))))))
with `delay' or `lazy' and compare with the naive implementation.
The cost is because each promise has its computation inside a
`with-handlers' (roughly), and I don't know of a good way to make it
faster. That's not because I didn't try... I would love to have it
run faster, since just that thing is responsible for a 2x slowup --
here are numbers from the same code without the exception catching:
no-exn-delay: cpu time: 2520 real time: 2532 gc time: 1211
no-exn-lazy: cpu time: 1407 real time: 1415 gc time: 789
and in lazy racket this becomes a major issue...
> Simplifying even further the definition of mdelay/mforce:
>
> (define-syntax-rule (mdelay expr) (thunk expr))
> (define-syntax-rule (mforce expr) (expr))
>
> got me impressive results (comparable to a C implementation of the
> same function):
> > (time (g 242501))
> cpu time: 597 real time: 602 gc time: 319
> 242502
>
> Given that the compiler is so good optimizing function creation and
> application, ¿Is there any reason to use delay/force over
> lambda/apply? ¿Am I missing something here?
One thing that will remain consistent is that this kind of
implementation will always be faster since there's enough baggage in
delay/force that it's unlikely to be something that racket could
optimize through. The simple thunks approach, OTOH, are things can
are likely to be optimized.
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!