[plt-scheme] Getting started with Scheme: How can I improve my code for generating prime numbers?
On Aug 1, SiWi wrote:
> Hello PLT Scheme community,
> this is my first post here, so please excuse any mistakes. I'm 16
> years old and so far my programming experience has mainly been in
> Python.
> To learn about functional programming, I just started with Scheme.
> For training I tried to convert some prime number generator code
> from Python to Scheme.
> It's using a sieve algorithm.
> The problem is that my Scheme code is about 1000-10000 times slower
> than my Python code.
> I would greatly appreciate it if you could give me some hints on how
> to improve my code.
You have several problems in your code, which are a result of
translating the Python code into a more Scheme-ish version.
A few things that I've noticed can all be seen in this expression that
you wrote:
(append final (list (+ (* 2 i) 3)))
* "Python lists" are really not the same as "Scheme lists". In
Scheme, this refers to a linked list, where each item is a "cons
cell" which holds a value and a pointer to the next item in the
list. This makes certain operations much faster. For example,
adding a value at the front of an existing list is a constant time
operation (since it has a pointer to the input list). It is also
very natural for a whole bunch of problems that programmers run
into, and it is easy to iterate over directly.
In Python, the lists are really more like arrays. They are faster
for things like random-access of some value (you just look at the
given offset in the array), but if you want to add an item at the
front of the "list", you need to move everything down. It is also
inconvenient to iterate on (you can't do it directly, you need to
use an index as a counter), and it is natural in cases where you
know that you need a chunk of values, usually of some fixed size
(some languages allow these arrays to vary dynamically, or even be
able to use any value for an index).
* In your python code, you just create one such vector, then change,
and finally collect the results into the output. In Scheme, you
keep creating new lists -- which is very expensive. Think about it
this way: your `list-bool' function takes the current sieve as an
input, *copies* it to a new one while changing some bits to true.
This is obviously much slower.
* Finally, I didn't talk about adding an item at the end of a list.
With lists, this (append some-list (list new-value)) is very
expensive because you need to copy the whole list to have a
different last value. If you do this 10000 times, then each of
these steps creates a new copy of the list so far, which is a lot of
work for the GC. But this was not really necessary -- looking at
your python code, you only add an item at the front of the list.
Just to measure the speed difference, I added this line to your code:
print(len(primes(100000000)))
And then I did a direct translation of your code to PLT Scheme:
#lang scheme/base
(define (primes n)
(define root (expt n 0.5))
(define len (/ (- (+ n 1) 3) 2))
(define sieve (make-vector len #t))
(for ([i (in-range len)])
(define x (+ (* i 2) 3))
(unless (> x root)
(when (vector-ref sieve i)
(for ([j (in-range (+ (/ (- (* x x) x) 2) i) len x)])
(vector-set! sieve j #f)))))
(cons 2 (for/list ([x (in-range len)] #:when (vector-ref sieve x))
(+ (* x 2) 3))))
(length (primes 100000000))
This might be fitting on some cases, but it's mostly not an example of
typical Scheme code. Anyway, the timings I'm getting on my machine
are
python: 28.64s user 2.94s system 99% cpu 31.891 total
mzscheme: 10.03s user 0.53s system 99% cpu 10.562 total
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!