[racket] Why begin0 doesn't expands to begin ?

From: Gustavo Massaccesi (gustavo at oma.org.ar)
Date: Mon Dec 17 08:39:11 EST 2012

 I saw the source code (for example eval.c). I expected some stranger
optimization for begin0. The only important thing for this discussion
is how it stores multiple values.

I did some benchmarks (There are lies, dammed lies, statistics and
micro-benchmarks.)

I analyze a very simple program, with a lot of set!'s, so it has some
sideefects and makes the optimizer unhappy.
I calculate the time of 100.000.000 iterations and I repeat that 10
times to get the average and standard variation. All the times are
expressed in seconds.
In all these cases the result is 1, and it is finally stored a global variable.

***** I have five versions:
* begin: It uses "begin" instead of "begin0", it is a "different"
program, but it is useful for comparison.
* begin0: It use "begin0" to remember the intermediate result.
* let-temp: It expands the "begin0" and uses a temporal local variable
to remember the intermediate result.
* let-temp-list: It uses a temporal list  to remember the intermediate
result. (In the multiple values cases, this list would store all the
intermediate results.)
* let-temp-vector: It uses a temporal vector to remember the
intermediate result. (In the multiple values case, perhaps it's better
to use a vector than a list.)

[Perhaps it's easier to understand the code than the pseudocode. See below.]

***** Results. (cpu time in seconds +- standard variation)
 Begin:           3,7 +- 0,1
begin0:           5,0 +- 0,1
let-temp-var:     3,7 +- 0,2
let-temp-list:    3,8 +- 0,3
let-temp-vector: 17,5 +- 0,9

***** My interpretation of the results: (I hope it is correct.)
* The "begin" and "let-temp-var" versions have the same duration.
* The "begin0" is 20% slower than the " let-temp-var" version. I think
that here is a missing optimization opportunity.
* The "let-temp-list" and "let-temp-var" versions have the same
duration. Amazing! I suspect that the optimizer is eliminating away
the intermediate list.
* The "let-temp-vector" is much slower than the "let-temp-var"
versions, as expected.

***** Results for the 2-values versions (replaces 1 by (values 1 2) ):
Begin:            5,2 +- 0,0
begin0:          31,9 +- 1,6
let-temp-var:     5,3 +- 0,2
let-temp-list:   60,2 +- 4,0
let-temp-vector: 66,0 +- 1,2

* Now the "begin0" version is much slower than the "let-temp-var" version.
* Now the " let-temp-list" and "let-temp-vector" versions are almost
equivalent. I suspect that the optimizer is doesn't handle this case
in a special way.
*(I expect that for large number of values a vector is better, but ...)

Gustavo

;-------------------

#lang racket
(define xx 0)
(define xxa 0)
(define xxb 0)
(define yy 0)

(define-syntax-rule [tentimes body ...]
  (begin
    (begin body ...)
    (begin body ...)
    (begin body ...)
    (begin body ...)
    (begin body ...)
    (begin body ...)
    (begin body ...)
    (begin body ...)
    (begin body ...)
    (begin body ...)))
[tentimes


 ;;;;;;;;;;;;;;;;;;;;;;;;
 ;;;;;;;; One value cases

 ;"begin"
 (time (for ([i (in-range 0 100000000)])
         (set! xx
               (begin
                 (set! yy 5)
                 1))))
 ;"begin0"
 (time (for ([i (in-range 0 100000000)])
         (set! xx
               (begin0
                 1
                 (set! yy 5)))))

 ;"let-temp-var"
 (time (for ([i (in-range 0 100000000)])
         (set! xx
               (let ([retval 1])
                 (set! yy 5)
                 retval))))

 ;"let-temp-list"
 (time (for ([i (in-range 0 100000000)])
         (set! xx
               (let ([retval (call-with-values (lambda () 1) list)])
                 (set! yy 5)
                 (apply values retval)))))

 ;"let-temp-vector"
 (time (for ([i (in-range 0 100000000)])
         (set! xx
               (let ([retval (call-with-values (lambda () 1) vector)])
                 (set! yy 5)
                 (vector->values retval)))))




 ;;;;;;;;;;;;;;;;;;;;;;;;
 ;;;;;;;; One value cases


 ;"begin"
 (time (for ([i (in-range 0 100000000)])
         (set!-values (xxa xxb)
                      (begin
                        (set! yy 5)
                        (values 1 2)))))

 ;"begin0"
 (time (for ([i (in-range 0 100000000)])
         (set!-values (xxa xxb)
                      (begin0
                        (values 1 2)
                        (set! yy 5)))))


 ;"let-temp-var"
 (time (for ([i (in-range 0 100000000)])
         (set!-values (xxa xxb)
                      (let-values ([(retvala retvalb) (values 1 2)])
                        (set! yy 5)
                        (values retvala retvalb)))))

 ;"let-temp-list"
 (time (for ([i (in-range 0 100000000)])
         (set!-values (xxa xxb)
                      (let ([retval (call-with-values (lambda ()
(values 1 2)) list)])
                        (set! yy 5)
                        (apply values retval)))))

 ;"let-temp-vector"
 (time (for ([i (in-range 0 100000000)])
         (set!-values (xxa xxb)
                      (let ([retval (call-with-values (lambda ()
(values 1 2)) vector)])
                        (set! yy 5)
                        (vector->values retval)))))






 "---"
 ]
xx
xxa
xxb
yy

On Sat, Dec 15, 2012 at 3:57 AM, Eric Dobson <eric.n.dobson at gmail.com> wrote:
> begin0 is for exactly the case you ignore, multiple values. You cannot know
> the number of values the first expression will return statically so if you
> did a translation to begin it would need to allocate them in a list on the
> heap and then use (apply values ...) later.
>
> With begin0 as a primitive you just evaluate the expression, putting its
> return values on the racket stack (the racket bytecode machinery is stack
> based). You then continue with the rest of the expressions, ignoring their
> return values, and then continue with the rest of the program with the
> return values already in place.
>
> Someone who understands the lower levels better than me might be able to
> give a better answer.
>
>
> On Fri, Dec 14, 2012 at 7:06 PM, Gustavo Massaccesi <gustavo at oma.org.ar>
> wrote:
>>
>> The "begin" and "begin0" forms appear in the racket programs, in the
>> fully expanded programs and in the bytecode. I expected that in some
>> intermediate expansion step the "begin0" form expands to a "begin"
>> form.
>>
>> For example something like: (This version doesn't work for multiple
>> return values.)
>>
>> (define-syntax-rule (my-begin0 first rest ...)
>>   (let ([temp first])
>>      rest ...
>>     temp))
>>
>> With this expansion the number of different kinds of operations in the
>> bytecode /virtual machine is smaller, so it would be easier to
>> maintain and optimize.
>>
>> My question is: Is this a bad idea because there is a special
>> optimization for the "begin0" form?
>>
>> Gustavo
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users
>
>

Posted on the users mailing list.