[racket] Small and short-lived data structures

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Thu Oct 24 13:40:49 EDT 2013

1. micro-benchmarks tell you nothing about this problem. 

2. running something 10 times tells you little more. 

3. running it on one architecture in one ENV tells you little more. 

We have an internal reading list on performance evaluation but I think
it is now appropriate to list this: 

   http://www.ccs.neu.edu/racket/Performance/

[I know Sam knows all this because he's my student, and I simply 
expressed my frustration that he'd write a message that is so totally
inconsistent with his knowledge.]

-- Matthias




On Oct 24, 2013, at 6:44 AM, Gustavo Massaccesi <gustavo at oma.org.ar> wrote:

> Strange results. I did some benchmarks.
> The 'mcons solution use always the same mcon to return the results of
> all the function calls. It’s different from the 'cons solution that
> returns a different con in each function call. I made a new version:
> loop4 (see code bellow).
> 
> I ran the program 10 times, with different values of K = 10^5, 10^6,
> 10^7 and 10^8. This table has the cpu time: average ± stdev, ordered
> by K=1E8.
> 
> Method           K=1E5        K=1E6          K=1E7           K=1E8
> '0-direct         7.8±8.2    137.5± 9.7    1434.4±35.2    13940.4±59.2
> '3-mcons-one      9.5±8.2    151.4±14.8    1424.9± 9.7    14262.9±56.1
> '4-mcons-diff    10.9±7.5    151.7±13.1    1442.2±10.4    14474.8±44.5
> '1-cons          12.5±6.6    145.3± 7.6    1509.3±39.7    14582.7±72.1
> '2-values         9.4±8.1    167.2± 7.7    1610.9±11.5    16117.4±47.8
> 
> With K=1E8 the results differences are significant. The faster
> versions are the '0-direct and '3-mcons-one (old version). They don’t
> allocate a new con or mcon in each cicle. The slowest versions is
> '2-values, in my experience values are very slow in Racket.
> 
> I’m still surprised that '4-mcons-diff is faster than '1-cons. The
> code is identical and the only difference is that one uses mcon and
> the other uses con.
> 
> Gustavo
> 
> ;-------------------------
> (begin-encourage-inline
> (define (f4 x y)
>   (mcons (+ x y) (- x y)))
> 
> (define (g4 x)
>   (+ (unsafe-mcar x) (unsafe-mcdr x))))
> 
> (define (loop4 n)
>  (for/sum ([i (in-range n)])
>    (g4 (f4 i (unsafe-fx+ 1 i)))))
> ;-------------------------
> 
> On Mon, Oct 21, 2013 at 3:43 PM, Sam Tobin-Hochstadt
> <samth at cs.indiana.edu> wrote:
>> I encourage you to find some evidence showing that Konrad's assumption
>> is mistaken, then.
>> 
>> Sam
>> 
>> On Mon, Oct 21, 2013 at 2:38 PM, Matthias Felleisen
>> <matthias at ccs.neu.edu> wrote:
>>> 
>>> WIth respect, I think this shows nothing :-)
>>> 
>>> 
>>> 
>>> On Oct 21, 2013, at 2:35 PM, Sam Tobin-Hochstadt <samth at cs.indiana.edu> wrote:
>>> 
>>>> On Mon, Oct 21, 2013 at 12:47 PM, Matthias Felleisen
>>>> <matthias at ccs.neu.edu> wrote:
>>>>> 
>>>>> In general, it is a mistaken assumption however that even in a
>>>>> 'managed' language -- such as Java or Racket -- mutation is still
>>>>> faster than de/allocation of short-lived data structures. If you
>>>>> choose to go with 'managed' language, you need to adapt your
>>>>> programming style to the new context.
>>>> 
>>>> Sadly, this assumption is often right for Racket today, even though we
>>>> have a pretty good allocator.  Here's a simple benchmark:
>>>> https://gist.github.com/samth/7088570
>>>> 
>>>> This just passes multiple arguments between two functions in three
>>>> ways: with a cons, with multiple values, and by mutating an mcons
>>>> (plus a direct version with two local variables).  In all cases, it's
>>>> as fast as I can make it, using unsafe operations where appropriate.
>>>> 
>>>> Here are some representative timings:
>>>> 
>>>> [samth at huor:~/sw/plt/racket/collects/racket/match (master) plt] r /tmp/bench.rkt
>>>> 'direct
>>>> cpu time: 40 real time: 44 gc time: 0
>>>> 'cons
>>>> cpu time: 72 real time: 73 gc time: 12
>>>> 'values
>>>> cpu time: 192 real time: 191 gc time: 0
>>>> 'mcons
>>>> cpu time: 48 real time: 51 gc time: 0
>>>> 
>>>> Direct is always fastest, multiple values are very slow, and cons is
>>>> always slower than mcons.  Even if we ignore GC time, the functional
>>>> version is still always slower.
>>>> 
>>>> Sam
>>> 
>> ____________________
>>  Racket Users list:
>>  http://lists.racket-lang.org/users



Posted on the users mailing list.