[racket] (define best (compose car sort))

From: Gustavo Massaccesi (gustavo at oma.org.ar)
Date: Sat Jan 18 17:50:05 EST 2014

I would have sworn that adding all the elements to the heap was an O(n
ln(n)) operation. I searched and you are right, it’s a O(n) operation.
I don’t know if the Racket implementation use the O(n) algorithm. I
looked at the source, but it’s long. I guess that the answer is yes.

I ran a few tests to compare the speed of different methods. In spite
of the asymptotic properties, for numbers like n=1000000 the heap
method is 10 times slower than the sort method. I run this a few time
and the results are consistent, but a little noisy. (See code below.)

*****Racket, version 5.3.6
(heap 1000000 0)cpu time: 14617 real time: 14734 gc time: 1466
(sort 1000000 0)cpu time: 1560 real time: 1554 gc time: 0
(for 1000000 0)cpu time: 125 real time: 130 gc time: 0
(loop2 1000000 0)cpu time: 141 real time: 140 gc time: 0
(loop 1000000 0)cpu time: 31 real time: 20 gc time: 0

(heap 10000000 0)cpu time: 225781 real time: 230849 gc time: 86071
(sort 10000000 0)cpu time: 21590 real time: 23751 gc time: 3259
(for 10000000 0)cpu time: 1311 real time: 1334 gc time: 0
(loop2 10000000 0)cpu time: 1326 real time: 1364 gc time: 0
(loop 10000000 0)cpu time: 202 real time: 200 gc time: 0


*****Racket, version 6.0.0.1, a few days old
(heap 1000000 0)cpu time: 22027 real time: 22446 gc time: 5537
(sort 1000000 0)cpu time: 1701 real time: 1802 gc time: 234
(for 1000000 0)cpu time: 109 real time: 110 gc time: 0
(loop2 1000000 0)cpu time: 125 real time: 120 gc time: 0
(loop 1000000 0)cpu time: 16 real time: 20 gc time: 0

(heap 10000000 0)cpu time: 275295 real time: 278819 gc time: 115127
(sort 10000000 0)cpu time: 20234 real time: 20562 gc time: 3292
(for 10000000 0)cpu time: 1185 real time: 1262 gc time: 0
(loop2 10000000 0)cpu time: 1248 real time: 1250 gc time: 0
(loop 10000000 0)cpu time: 219 real time: 212 gc time: 0

Gustavo

;------------------------------
#lang racket
(require data/heap)
(define (best/sort l <)
  (car (sort l <)))
(define (best/heap l <)
  (define h (make-heap <))
  (heap-add-all! h l)
  (heap-min h))
(define (best/loop l <)
  (let loop ([min (car l)] [l (cdr l)])
    (if (null? l)
        min
        (let ([next (car l)])
          (if (< next min)
              (loop next (cdr l))
              (loop min (cdr l)))))))
(define (best/loop2 l <)
  (let loop ([min (car l)] [l (cdr l)])
    (if (null? l)
        min
        (let ([next (car l)])
          (loop (if (< next min) next min) (cdr l))))))
(define (best/for l <)
  (for/fold ([min (car l)]) ([next (in-list (cdr l))])
    (if (< next min) next min)))

(define (best-compare n)
  (let ([l (shuffle (build-list n values))])
    (time (display (list 'heap n (best/heap l <))))
    (time (display (list 'sort n (best/sort l <))))
    (time (display (list 'for n (best/for l <))))
    (time (display (list 'loop2 n (best/loop2 l <))))
    (time (display (list 'loop n (best/loop l <))))
    (newline)
    ))
(best-compare  100)
(best-compare  1000)
(best-compare  10000)
(best-compare  100000)
(best-compare  1000000)
(best-compare  10000000)
;------------------------------


On Sat, Jan 18, 2014 at 4:43 PM, Jens Axel Søgaard
<jensaxel at soegaard.net> wrote:
> Hi Gustavo,
>
> An efficient solution for large data sets is to use heaps:
>
> #lang racket
> (require data/heap)
>
> (define data '((3 "three") (4 "four") (2 "two") (1 "one")))
>
> (define h (make-heap (λ(x y) (<= (first x) (first y)))))
> (heap-add-all! h data)
> (heap-min h)
>
> /Jens Axel
>
>
> 2014/1/18 Gustavo Massaccesi <gustavo at oma.org.ar>:
>> I extended one of the Rosetta code task: Levenshtein
>> distance/Alignment:
>> http://rosettacode.org/wiki/Levenshtein_distance/Alignment to show the
>> alignment.
>>
>> I needed a function that finds the “best” item in a list. It’s like
>> sort but it only finds the first one:
>>
>> #lang racket
>> (define best (compose car sort))
>>
>> (define data '((3 "three") (4 "four") (2 "two") (1 "one"))
>>
>> (best data < #:key car) ;==> (1 "one")
>> (best data > #:key car) ;==> (4 "four")
>>
>> (best data string<? #:key cadr) ;==> (4 "four")
>> (best data string>? #:key cadr) ;==> (2 "two")
>>
>> It’s not very efficient, but I needed it only for n=3.
>>
>> I could have written an efficient version by hand, but I it was only
>> an auxiliary function and I wanted to keep the task as short as
>> possible.
>>
>> Did anyone use a similar function in another program? Can the
>> efficient version be added to the standard library? (Is it a good idea
>> to add yet another function to the standard library?)
>>
>> Gustavo
>>
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users
>
>
>
> --
> --
> Jens Axel Søgaard


Posted on the users mailing list.