[racket] math/matrix

From: Eric Dobson (eric.n.dobson at gmail.com)
Date: Sun May 11 17:26:45 EDT 2014

Where is the time spent in the algorithm? I assume that most of it is
in the matrix manipulation work not the orchestration of finding a
pivot and reducing based on that. I.e. `elim-rows!` is the expensive
part. Given that you only specialized the outer part of the loop, I
wouldn't expect huge performance changes.

On Sun, May 11, 2014 at 2:13 PM, Jens Axel Søgaard
<jensaxel at soegaard.net> wrote:
> I tried restricting the matrix-solve and matrix-gauss-elim to (Matrix Flonum).
> I can't observe a change in the timings.
>
> #lang typed/racket
> (require math/matrix
>          math/array
>          math/private/matrix/utils
>          math/private/vector/vector-mutate
>          math/private/unsafe
>          (only-in racket/unsafe/ops unsafe-fl/)
>          racket/fixnum
>          racket/list)
>
> (define-type Pivoting (U 'first 'partial))
>
> (: flonum-matrix-gauss-elim
>    (case-> ((Matrix Flonum) -> (Values (Matrix Flonum) (Listof Index)))
>            ((Matrix Flonum) Any -> (Values (Matrix Flonum) (Listof Index)))
>            ((Matrix Flonum) Any Any -> (Values (Matrix Flonum) (Listof Index)))
>            ((Matrix Flonum) Any Any Pivoting -> (Values (Matrix
> Flonum) (Listof Index)))))
> (define (flonum-matrix-gauss-elim M [jordan? #f] [unitize-pivot? #f]
> [pivoting 'partial])
>   (define-values (m n) (matrix-shape M))
>   (define rows (matrix->vector* M))
>   (let loop ([#{i : Nonnegative-Fixnum} 0]
>              [#{j : Nonnegative-Fixnum} 0]
>              [#{without-pivot : (Listof Index)} empty])
>     (cond
>       [(j . fx>= . n)
>        (values (vector*->matrix rows)
>                (reverse without-pivot))]
>       [(i . fx>= . m)
>        (values (vector*->matrix rows)
>                ;; None of the rest of the columns can have pivots
>                (let loop ([#{j : Nonnegative-Fixnum} j] [without-pivot
> without-pivot])
>                  (cond [(j . fx< . n)  (loop (fx+ j 1) (cons j without-pivot))]
>                        [else  (reverse without-pivot)])))]
>       [else
>        (define-values (p pivot)
>          (case pivoting
>            [(partial)  (find-partial-pivot rows m i j)]
>            [(first)    (find-first-pivot rows m i j)]))
>        (cond
>          [(zero? pivot)  (loop i (fx+ j 1) (cons j without-pivot))]
>          [else
>           ;; Swap pivot row with current
>           (vector-swap! rows i p)
>           ;; Possibly unitize the new current row
>           (let ([pivot  (if unitize-pivot?
>                             (begin (vector-scale! (unsafe-vector-ref rows i)
>                                                   (unsafe-fl/ 1. pivot))
>                                    (unsafe-fl/ pivot pivot))
>                             pivot)])
>             (elim-rows! rows m i j pivot (if jordan? 0 (fx+ i 1)))
>             (loop (fx+ i 1) (fx+ j 1) without-pivot))])])))
>
> (: flonum-matrix-solve
>    (All (A) (case->
>              ((Matrix Flonum) (Matrix Flonum)        -> (Matrix Flonum))
>              ((Matrix Flonum) (Matrix Flonum) (-> A) -> (U A (Matrix
> Flonum))))))
> (define flonum-matrix-solve
>   (case-lambda
>     [(M B)  (flonum-matrix-solve
>              M B (λ () (raise-argument-error 'flonum-matrix-solve
> "matrix-invertible?" 0 M B)))]
>     [(M B fail)
>      (define m (square-matrix-size M))
>      (define-values (s t) (matrix-shape B))
>      (cond [(= m s)
>             (define-values (IX wps)
>               (parameterize ([array-strictness #f])
>                 (flonum-matrix-gauss-elim (matrix-augment (list M B)) #t #t)))
>             (cond [(and (not (empty? wps)) (= (first wps) m))
>                    (submatrix IX (::) (:: m #f))]
>                   [else  (fail)])]
>            [else
>             (error 'flonum-matrix-solve
>                    "matrices must have the same number of rows; given ~e and ~e"
>                    M B)])]))
>
> (: mx Index)
> (define mx 600)
>
> (: r (Index Index -> Flonum))
> (define (r i j) (random))
>
> (: A : (Matrix Flonum))
> (define A (build-matrix mx mx r))
>
> (: sum : Integer Integer -> Flonum)
> (define (sum i n)
>   (let loop ((j 0) (acc 0.0))
>     (if (>= j mx) acc
>         (loop (+ j 1) (+ acc (matrix-ref A i j))) )))
>
> (: b : (Matrix Flonum))
> (define b (build-matrix mx 1 sum))
>
> (time
>  (let [(m (flonum-matrix-solve A b))]
>   (matrix-ref m 0 0)))
> (time
> (let [(m (matrix-solve A b))]
>   (matrix-ref m 0 0)))
>
> (time
>  (let [(m (flonum-matrix-solve A b))]
>   (matrix-ref m 0 0)))
> (time
> (let [(m (matrix-solve A b))]
>   (matrix-ref m 0 0)))
>
> (time
>  (let [(m (flonum-matrix-solve A b))]
>   (matrix-ref m 0 0)))
> (time
> (let [(m (matrix-solve A b))]
>   (matrix-ref m 0 0)))
>
> 2014-05-11 21:48 GMT+02:00 Neil Toronto <neil.toronto at gmail.com>:
>> The garbage collection time is probably from cleaning up boxed flonums, and
>> possibly intermediate vectors. If so, a separate implementation of Gaussian
>> elimination for the FlArray type would cut the GC time to nearly zero.
>>
>> Neil ⊥
>>
>>
>> On 05/11/2014 01:36 PM, Jens Axel Søgaard wrote:
>>>
>>> Or ... you could take a look at
>>>
>>>
>>> https://github.com/plt/racket/blob/master/pkgs/math-pkgs/math-lib/math/private/matrix/matrix-gauss-elim.rkt
>>>
>>> at see if something can be improved.
>>>
>>> /Jens Axel
>>>
>>>
>>> 2014-05-11 21:30 GMT+02:00 Jens Axel Søgaard <jensaxel at soegaard.net>:
>>>>
>>>> Hi Eduardo,
>>>>
>>>> The math/matrix library uses the arrays from math/array to represent
>>>> matrices.
>>>>
>>>> If you want to try the same representation as Bigloo, you could try Will
>>>> Farr's
>>>> matrix library:
>>>>
>>>>
>>>> http://planet.racket-lang.org/package-source/wmfarr/simple-matrix.plt/1/1/planet-docs/simple-matrix/index.html
>>>>
>>>> I am interested in hearing the results.
>>>>
>>>> /Jens Axel
>>>>
>>>>
>>>>
>>>> 2014-05-11 21:18 GMT+02:00 Eduardo Costa <edu500ac at gmail.com>:
>>>>>
>>>>> What is bothering me is the time Racket is spending in garbage
>>>>> collection.
>>>>>
>>>>> ~/wrk/scm/rkt/matrix$ racket matrix.rkt
>>>>> 0.9999999999967226
>>>>> cpu time: 61416 real time: 61214 gc time: 32164
>>>>>
>>>>> If I am reading the output correctly, Racket is spending 32 seconds out
>>>>> of
>>>>> 61 seconds in garbage collection.
>>>>>
>>>>>   I am following Junia Magellan's computer language comparison and I
>>>>> cannot
>>>>> understand why Racket needs the garbage collector for doing Gaussian
>>>>> elimination. In a slow Compaq/HP machine, solving a system of 800 linear
>>>>> equations takes 17.3 seconds in Bigloo, but requires 58 seconds in
>>>>> Racket,
>>>>> even after removing the building of the linear system from
>>>>> consideration.
>>>>> Common Lisp is also much faster than Racket in processing arrays. I
>>>>> would
>>>>> like to point out that Racket is very fast in general. The only occasion
>>>>> that it lags badly behind Common Lisp and Bigloo is when one needs to
>>>>> deal
>>>>> with arrays.
>>>>>
>>>>> Basically, Junia is using Rasch method to measure certain latent traits
>>>>> of
>>>>> computer languages, like productivity and coaching time. In any case,
>>>>> she
>>>>> needs to do a lot of matrix calculations to invert the Rasch model.
>>>>> Since
>>>>> Bigloo works with homogeneous vectors, she wrote a few macros to access
>>>>> the
>>>>> elements of a matrix:
>>>>>
>>>>> (define (mkv n) (make-f64vector n))
>>>>> (define $ f64vector-ref)
>>>>> (define $! f64vector-set!)
>>>>> (define len f64vector-length)
>>>>>
>>>>> (define-syntax $$
>>>>>     (syntax-rules ()
>>>>>        (($$ m i j) (f64vector-ref (vector-ref m i) j))))
>>>>>
>>>>> (define-syntax $$!
>>>>>     (syntax-rules ()
>>>>>        (($$! matrix row column value)
>>>>>         ($! (vector-ref matrix row) column value))))
>>>>>
>>>>> I wonder whether homogeneous vectors would speed up Racket. In the same
>>>>> computer that Racket takes 80 seconds to build and invert a system of
>>>>> equations, Bigloo takes 17.3 seconds, as I told before. Common Lisp is
>>>>> even
>>>>> faster. However, if one subtracts the gc time from Racket's total time,
>>>>> the
>>>>> result comes quite close to Common Lisp or Bigloo.
>>>>>
>>>>> ~/wrk/bgl$ bigloo -Obench bigmat.scm -o big
>>>>> ~/wrk/bgl$ time ./big
>>>>> 0.9999999999965746 1.000000000000774 0.9999999999993039
>>>>> 0.9999999999982576
>>>>> 1.000000000007648 0.999999999996588
>>>>>
>>>>> real    0m17.423s
>>>>> user    0m17.384s
>>>>> sys    0m0.032s
>>>>> ~/wrk/bgl$
>>>>>
>>>>> Well, bigloo may perform global optimizations, but Common Lisp doesn't.
>>>>> When
>>>>> one is not dealing with matrices, Racket is faster than Common Lisp. I
>>>>> hope
>>>>> you can tell me how to rewrite the program in order to avoid garbage
>>>>> collection.
>>>>>
>>>>> By the way, you may want to know why not use Bigloo or Common Lisp to
>>>>> invert
>>>>> the Rasch model. The problem is that Junia and her co-workers are using
>>>>> hosting services that do not give access to the server or to the
>>>>> jailshell.
>>>>> Since Bigloo requires gcc based compilation, Junia discarded it right
>>>>> away.
>>>>> Not long ago, the hosting service stopped responding to the sbcl Common
>>>>> Lisp
>>>>> compiler for reasons that I cannot fathom.  Although Racket 6.0 stopped
>>>>> working too, Racket 6.0.1 is working fine. This left Junia, her
>>>>> co-workers
>>>>> and students with Racket as their sole option. As for myself, I am just
>>>>> curious.
>>>>>
>>>>>
>>>>> 2014-05-11 6:23 GMT-03:00 Jens Axel Søgaard <jensaxel at soegaard.net>:
>>>>>
>>>>>> 2014-05-11 6:09 GMT+02:00 Eduardo Costa <edu500ac at gmail.com>:
>>>>>>>
>>>>>>> The documentation says that one should expect typed/racket to be
>>>>>>> faster
>>>>>>> than
>>>>>>> racket. I tested the math/matrix library and it seems to be almost as
>>>>>>> slow
>>>>>>> in typed/racket as in racket.
>>>>>>
>>>>>>
>>>>>> What was (is?) slow was a call in an untyped module A to a function
>>>>>> exported
>>>>>> from a typed module B. The functions in B must check at runtime that
>>>>>> the values coming from A are of the correct type. If the A was written
>>>>>> in Typed Racket, the types would be known at compile time.
>>>>>>
>>>>>> Here math/matrix is written in Typed Racket, so if you are writing an
>>>>>> untyped module, you will in general want to minimize the use of,say,
>>>>>> maxtrix-ref. Instead operations that works on entire matrices or
>>>>>> row/columns are preferred.
>>>>>>
>>>>>>> (: sum : Integer Integer -> Flonum)
>>>>>>> (define (sum i n)
>>>>>>>    (let loop ((j 0) (acc 0.0))
>>>>>>>      (if (>= j mx) acc
>>>>>>>          (loop (+ j 1) (+ acc (matrix-ref A i j))) )))
>>>>>>>
>>>>>>> (: b : (Matrix Flonum))
>>>>>>> (define b (build-matrix mx 1 sum))
>>>>>>
>>>>>>
>>>>>> The matrix b contains the sums of each row in the matrix.
>>>>>> Since matrices are a subset of arrays, you can use array-axis-sum,
>>>>>> which computes sum along a given axis (i.e. a row or a column when
>>>>>> speaking of matrices).
>>>>>>
>>>>>> (define A (matrix [[0. 1. 2.]
>>>>>>                     [3. 4. 5.]
>>>>>>                     [6. 7. 8.]]))
>>>>>>
>>>>>>> (array-axis-sum A 1)
>>>>>>
>>>>>> - : (Array Flonum)
>>>>>> (array #[3.0 12.0 21.0])
>>>>>>
>>>>>> However as Eric points out, matrix-solve is an O(n^3) algorithm,
>>>>>> so the majority of the time is spent in matrix-solve.
>>>>>>
>>>>>> Apart from finding a way to exploit the relationship between your
>>>>>> matrix A and the column vector b, I see no obvious way of
>>>>>> speeding up the code.
>>>>>>
>>>>>> Note that when you benchmark with
>>>>>>
>>>>>>     time racket matrix.rkt
>>>>>>
>>>>>> you will include startup and compilation time.
>>>>>> Therefore if you want to time the matrix code,
>>>>>> insert a literal (time ...) call.
>>>>>>
>>>>>> --
>>>>>> Jens Axel Søgaard
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> Jens Axel Søgaard
>>>
>>>
>>>
>>>
>>
>> ____________________
>>  Racket Users list:
>>  http://lists.racket-lang.org/users
>
>
>
> --
> --
> Jens Axel Søgaard
>
> ____________________
>   Racket Users list:
>   http://lists.racket-lang.org/users


Posted on the users mailing list.