[plt-scheme] Getting started with Scheme: How can I improve my code for generating prime numbers?

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Sun Aug 2 15:21:39 EDT 2009

Thanks :-)  I had just gotten yeah far:

(define (primes.v1 n)
   (define len (quotient (- (+ n 1) 3) 2))
   (define sieve (make-vector len #t))
   (define root (sqrt n))
   sieve)

:-)


On Aug 2, 2009, at 3:18 PM, Eli Barzilay wrote:

> On Aug  1, SiWi wrote:
>> Hello PLT Scheme community,
>> this is my first post here, so please excuse any mistakes.  I'm 16
>> years old and so far my programming experience has mainly been in
>> Python.
>> To learn about functional programming, I just started with Scheme.
>> For training I tried to convert some prime number generator code
>> from Python to Scheme.
>> It's using a sieve algorithm.
>> The problem is that my Scheme code is about 1000-10000 times slower
>> than my Python code.
>> I would greatly appreciate it if you could give me some hints on how
>> to improve my code.
>
> You have several problems in your code, which are a result of
> translating the Python code into a more Scheme-ish version.
>
> A few things that I've noticed can all be seen in this expression that
> you wrote:
>
>   (append final (list (+ (* 2 i) 3)))
>
> * "Python lists" are really not the same as "Scheme lists".  In
>   Scheme, this refers to a linked list, where each item is a "cons
>   cell" which holds a value and a pointer to the next item in the
>   list.  This makes certain operations much faster.  For example,
>   adding a value at the front of an existing list is a constant time
>   operation (since it has a pointer to the input list).  It is also
>   very natural for a whole bunch of problems that programmers run
>   into, and it is easy to iterate over directly.
>
>   In Python, the lists are really more like arrays.  They are faster
>   for things like random-access of some value (you just look at the
>   given offset in the array), but if you want to add an item at the
>   front of the "list", you need to move everything down.  It is also
>   inconvenient to iterate on (you can't do it directly, you need to
>   use an index as a counter), and it is natural in cases where you
>   know that you need a chunk of values, usually of some fixed size
>   (some languages allow these arrays to vary dynamically, or even be
>   able to use any value for an index).
>
> * In your python code, you just create one such vector, then change,
>   and finally collect the results into the output.  In Scheme, you
>   keep creating new lists -- which is very expensive.  Think about it
>   this way: your `list-bool' function takes the current sieve as an
>   input, *copies* it to a new one while changing some bits to true.
>   This is obviously much slower.
>
> * Finally, I didn't talk about adding an item at the end of a list.
>   With lists, this (append some-list (list new-value)) is very
>   expensive because you need to copy the whole list to have a
>   different last value.  If you do this 10000 times, then each of
>   these steps creates a new copy of the list so far, which is a lot of
>   work for the GC.  But this was not really necessary -- looking at
>   your python code, you only add an item at the front of the list.
>
> Just to measure the speed difference, I added this line to your code:
>
>   print(len(primes(100000000)))
>
> And then I did a direct translation of your code to PLT Scheme:
>
>   #lang scheme/base
>   (define (primes n)
>     (define root  (expt n 0.5))
>     (define len   (/ (- (+ n 1) 3) 2))
>     (define sieve (make-vector len #t))
>     (for ([i (in-range len)])
>       (define x (+ (* i 2) 3))
>       (unless (> x root)
>         (when (vector-ref sieve i)
>           (for ([j (in-range (+ (/ (- (* x x) x) 2) i) len x)])
>             (vector-set! sieve j #f)))))
>     (cons 2 (for/list ([x (in-range len)] #:when (vector-ref sieve x))
>               (+ (* x 2) 3))))
>   (length (primes 100000000))
>
> This might be fitting on some cases, but it's mostly not an example of
> typical Scheme code.  Anyway, the timings I'm getting on my machine
> are
>
>   python:   28.64s user 2.94s system 99% cpu 31.891 total
>   mzscheme: 10.03s user 0.53s system 99% cpu 10.562 total
>
> --  
>           ((lambda (x) (x x)) (lambda (x) (x x)))          Eli  
> Barzilay:
>                     http://barzilay.org/                   Maze is  
> Life!
> _________________________________________________
>   For list-related administrative tasks:
>   http://list.cs.brown.edu/mailman/listinfo/plt-scheme



Posted on the users mailing list.