[plt-scheme] V301.5 Speed Up

From: Williams, M. Douglas (M.DOUGLAS.WILLIAMS at saic.com)
Date: Fri Feb 10 18:10:03 EST 2006

I timed (or attempted to time) the simulation collection for two different
models.  These results are based on just 10 runs each, so please take them
with the caveats mentioned by several of the previous responders.  However,
I would certainly bank on simulations developed using the simulation
collection running significantly faster in V301.5 than in V301.

The first (model-2a.ss) is a combined discrete and continuous simulation and
is both computationally intensive (the continuous part) and continuation
intensive (the discrete part).  There is about a 2x speedup: real times drop
from ~15000 ms (V301) to ~7500 ms (V301.5).

The second model (open-loop.ss) is a pure discrete event simulation.  This
showed even more improvement.  There is about a 4.7x speedup: real times drop
from ~73000 ms (V301) to ~15500 ms (V301.5).  The garbage collection seems to
be the killer in this one for V301.  [You can also see the problem with the
negative cpu times here.  I tried several times and never got 10 runs that
were all 'good'.  The real time and gc time seem to be okay.]
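
For reference, the speedup figures are just ratios of the approximate real
times quoted above (time-apply reports milliseconds).  A minimal sketch in
plain Scheme; the name speedup is made up for illustration and is not part of
the simulation or science collections:

```scheme
;; Speedup = old time / new time.  The inputs are the rounded real times
;; (in milliseconds) quoted in the text, not the exact means.
(define (speedup old-ms new-ms)
  (/ old-ms new-ms))

(define model-2a-speedup (speedup 15000 7500))
(define open-loop-speedup (speedup 73000 15500))

(display "model-2a.ss speedup:  ")
(display (exact->inexact model-2a-speedup))
(newline)
(display "open-loop.ss speedup: ")
(display (exact->inexact open-loop-speedup))
(newline)
```

With these rounded inputs the ratios come out at roughly 2.0 and 4.7.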

Does anyone know if the time-apply bug always produces negative numbers when
it breaks, or are those just the obvious ones?
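
Until that is answered, one stopgap is to drop the obviously bad samples
before computing statistics.  The following is only a sketch in plain Scheme:
keep, sample-mean and sample-variance are hypothetical helpers written out
here, not functions from the science collection, and the n - 1 divisor is an
assumption that appears to match the variances reported below.

```scheme
;; Keep only the elements of lst that satisfy pred (written out so the
;; sketch does not depend on any particular list library).
(define (keep pred lst)
  (cond ((null? lst) '())
        ((pred (car lst)) (cons (car lst) (keep pred (cdr lst))))
        (else (keep pred (cdr lst)))))

(define (sample-mean lst)
  (/ (apply + lst) (length lst)))

;; Sample variance with an n - 1 divisor (assumed convention).
(define (sample-variance lst)
  (let ((m (sample-mean lst)))
    (/ (apply + (map (lambda (x) (* (- x m) (- x m))) lst))
       (- (length lst) 1))))

;; CPU times for open-loop.ss under V301, copied from the data below.
(define cpu-times
  '(-355100 74146 73926 73666 73236 72905 -356852 72464 72053 71853))

;; Discard samples that cannot be right: a CPU time is never negative.
(define good-cpu-times
  (keep (lambda (ms) (>= ms 0)) cpu-times))

(display "samples kept: ")
(display (length good-cpu-times))
(newline)
(display "mean:         ")
(display (exact->inexact (sample-mean good-cpu-times)))
(newline)
(display "variance:     ")
(display (exact->inexact (sample-variance good-cpu-times)))
(newline)
```

This only catches samples that are visibly impossible.  If the bug can also
produce a wrong but positive value, that sample would pass the filter
unnoticed, which is exactly why the question above matters.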

Example code from open-loop.ss

;; Run the simulation once and force a collection before the timed runs.
(run-simulation 100 1000)
(collect-garbage)

(require (planet "statistics.ss" ("williams" "science.plt")))
(let* ((n 10)
       (cpu-times (make-vector n))
       (real-times (make-vector n))
       (gc-times (make-vector n)))
  ;; Time n runs, collecting garbage before each one.  time-apply returns
  ;; the result list plus cpu, real, and gc times in milliseconds.
  (do ((i 0 (+ i 1)))
      ((= i n) (void))
    (collect-garbage)
    (let-values (((result cpu real gc)
                  (time-apply run-simulation '(100 1000))))
      (vector-set! cpu-times i cpu)
      (vector-set! real-times i real)
      (vector-set! gc-times i gc)))
  ;; Print the individual runs.
  (do ((i 0 (+ i 1)))
      ((= i n) (void))
    (printf "run ~a: cpu = ~a real = ~a gc = ~a~n"
            i (vector-ref cpu-times i)
            (vector-ref real-times i)
            (vector-ref gc-times i)))
  ;; Print summary statistics (mean and variance come from statistics.ss).
  (printf "CPU:   mean = ~a, variance = ~a~n"
          (mean cpu-times) (variance cpu-times))
  (printf "Real:  mean = ~a, variance = ~a~n"
          (mean real-times) (variance real-times))
  (printf "GC:    mean = ~a, variance = ~a~n"
          (mean gc-times) (variance gc-times)))
  
Data

model-2a.ss
V301
run 0: cpu = 7971 real = 8013 gc = 881
run 1: cpu = 7100 real = 7162 gc = 620
run 2: cpu = 7141 real = 7152 gc = 612
run 3: cpu = 8092 real = 8113 gc = 881
run 4: cpu = 7591 real = 7723 gc = 620
run 5: cpu = 7541 real = 7562 gc = 621
run 6: cpu = 7962 real = 8013 gc = 903
run 7: cpu = 6830 real = 6841 gc = 311
run 8: cpu = 7420 real = 7432 gc = 621
run 9: cpu = 7180 real = 7251 gc = 320
CPU:   mean = 7482.8, variance = 182148.17777777778
Real:  mean = 7526.2, variance = 187229.95555555553
GC:    mean = 638.9999999999999, variance = 44252.0

V301.5
run 0: cpu = 17696 real = 17829 gc = 4016
run 1: cpu = 15051 real = 15064 gc = 2185
run 2: cpu = 14661 real = 14843 gc = 1794
run 3: cpu = 15583 real = 15585 gc = 1412
run 4: cpu = 14821 real = 14944 gc = 1032
run 5: cpu = 15793 real = 15815 gc = 1452
run 6: cpu = 15793 real = 15805 gc = 1062
run 7: cpu = 14381 real = 14393 gc = 852
run 8: cpu = 14792 real = 14923 gc = 852
run 9: cpu = 14841 real = 14853 gc = 852
CPU:   mean = 15341.199999999999, variance = 922566.4
Real:  mean = 15405.4, variance = 938010.2666666667
GC:    mean = 1550.9, variance = 948959.6555555558
----------
open-loop.ss
V301
run 0: cpu = -355100 real = 74959 gc = 54767
run 1: cpu = 74146 real = 74579 gc = 54699
run 2: cpu = 73926 real = 74489 gc = 54420
run 3: cpu = 73666 real = 73876 gc = 54037
run 4: cpu = 73236 real = 73502 gc = 53714
run 5: cpu = 72905 real = 73023 gc = 53377
run 6: cpu = -356852 real = 73172 gc = 53214
run 7: cpu = 72464 real = 72733 gc = 52865
run 8: cpu = 72053 real = 72363 gc = 52465
run 9: cpu = 71853 real = 72233 gc = 52205
CPU:   mean = -12770.300000000001, variance = 32720231896.233334
Real:  mean = 73492.9, variance = 914819.8777777779
GC:    mean = 53576.299999999996, variance = 823737.5666666668
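
One pattern in the two negative cpu values above may be worth checking.  This
is purely an observation about the numbers, not a confirmed diagnosis: if the
underlying counter were a 32-bit value in 100 ns ticks (an assumption), it
would wrap every 2^32 / 10000 milliseconds, and adding that amount back puts
both values in line with their neighbours.  The names wrap-ms and unwrap are
made up for this sketch.

```scheme
;; Assumed wrap period: 2^32 ticks of 100 ns each, expressed in milliseconds.
(define wrap-ms (/ (expt 2 32) 10000))

;; Add the assumed wrap period back to a negative reading and round to the
;; nearest millisecond; non-negative readings pass through unchanged.
(define (unwrap cpu-ms)
  (if (< cpu-ms 0)
      (round (+ cpu-ms wrap-ms))
      cpu-ms))

(display "run 0: ")
(display (unwrap -355100))
(newline)
(display "run 6: ")
(display (unwrap -356852))
(newline)
```

Under that assumption run 0 becomes 74397 and run 6 becomes 72645, each a few
hundred milliseconds below the corresponding real time, like the other runs.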

Doug

> -----Original Message-----
> From: plt-scheme-bounces at list.cs.brown.edu
> [mailto:plt-scheme-bounces at list.cs.brown.edu] On Behalf Of Williams, M. Douglas
> Sent: Friday, February 10, 2006 11:12 AM
> To: Noel Welsh; Gregory Woodhouse
> Cc: PLT Scheme
> Subject: RE: [plt-scheme] V301.5 Speed Up
> 
> I've just been running a few tests.  It seems that time-apply sometimes
> returns a negative number for the CPU time.  That does cause havoc with
> the variance (and the mean).
> 
> Doug
> 
> 
> -----Original Message-----
> From: plt-scheme-bounces at list.cs.brown.edu on behalf of Noel Welsh
> Sent: Fri 2/10/2006 9:17 AM
> To: Gregory Woodhouse
> Cc: PLT Scheme
> Subject: Re: [plt-scheme] V301.5 Speed Up
> 
> --- Gregory Woodhouse <gregory.woodhouse at sbcglobal.net> wrote:
> 
> > It would be nice to be able to run a test 1000 times,
> > saving the data for statistical analysis.
> 
> I've just written code to do this (run code 50 times,
> perform test for significance).  It requires a hacked
> version of the science collection so it won't work till the
> next version of the science collection is out.  If anyone
> wants it, email me off list.
> 
> Anyway, some observations:
> 
>   - GC time is really long compared to run time (for the
> silly little benchmarks I tried)
> 
>   - unexpectedly, the variance of my measurements was
> crazy!  When I made benchmarks (just loops adding up
> numbers) long enough to measure the time reliably I got
> results like this:
> 
> The code:
> 
>   (let* ((test1 (lambda ()
>                   (for ((i 0 10000) (sum 0))
>                        (+ 1000 sum))))
>          (test2 (lambda ()
>                   (for ((i 0 10000000) (sum 0))
>                        (+ 1 sum))))
>          (s1 (measure test1))
>          (s2 (measure test2)))
>     (let-values (((faster? p) (faster s1 s2)))
>       (printf "p ~a\n" p)
>       (printf "s1 mean: ~a var: ~a\n" (mean s1) (variance s1))
>       (printf "s2 mean: ~a var: ~a\n" (mean s2) (variance s2))
>       (assert-true faster?)))
> 
> The output:
> 
> p 1.0
> s1 mean: 6.799999999999996 var: 22.204081632653068
> s2 mean: 13492.2 var: 18376.693877551028
> 
> P is the value returned by the t-test (the probability the
> means differ by chance).  Incidentally the assumptions for
> the t-test are almost certainly violated in this case.
> 
> Anyway, I really can't explain the variance being that
> large.  Here's how I collect the data:
> 
>   ;; measure : ((any ...) -> any  any ...) -> (vector-of number)
>   (define (measure proc . args)
>     (define (prepare)
>       (for! (i 0 3)
>             (collect-garbage)))
>     (list->vector
>      (for ((i 0 50) (times null))
>           (prepare)
>           (let-values (((results cpu-time real-time gc-time)
>                         (time-apply proc args)))
>             (cons cpu-time times)))))
> 
> Hope that's of interest to someone!
> 
> N.
> 
> Email: noelwelsh <at> yahoo <dot> com   noel <at> untyped <dot> com
> AIM: noelhwelsh
> Blogs: http://monospaced.blogspot.com/  http://www.untyped.com/untyping/
> 
> _________________________________________________
>   For list-related administrative tasks:
>   http://list.cs.brown.edu/mailman/listinfo/plt-scheme


Posted on the users mailing list.