[racket] ranking items by their appearance in sets

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Wed Sep 24 10:44:45 EDT 2014

On Sep 24, 2014, at 12:29 AM, David T. Pierson <dtp at mindstory.com> wrote:

> I'm looking for criticism in any aspect,

Style: Here is my rewrite with style suggestions. I liked to have a single point that tests all versions of a function when I conduct such design experiments. I also want to have all my test code in the test submodule so that it doesn't get deployed. That's why I am using a syntactic abstraction that unfolds into a submodule rather than a function that calls things. 

Performance: I have not checked their performance but I cannot imagine the first version to be performant. See my annotations. 

My choice: Your second version is what I would have written. 

#lang racket

(module+ test
  (require rackunit))

;; ---------------------------------------------------------------------------------------------------
;; [Listof [Setof Symbol]] -> [Listof [Cons Symbol Natural]]
;; rank the elements of the given sets based on how many sets they appear in

;; version 1
(define (rank-set-elements/1 list-of-sets)               ;; n is number of symbols in sets 
  (define list-of-lists    (map set->list list-of-sets)) ;; O(n)
  (define flat-list        (flatten list-of-lists))      ;; allocates a lot  
  ;; I would have fused the next two lines and possibly all four 
  (define sorted-flat-list (sort flat-list symbol<?))    ;; O(n^2)? 
  (define list-of-item-frequency-pairs                   ;; critically relies on (sorted v)
    (for/fold ((accum '())) ((v sorted-flat-list))
      (match accum
        ;; this matches until the next, distinct symbol shows up 
        [(list-rest (cons sym cnt) rst)
         (if (eq? sym v)
             (cons (cons sym (add1 cnt)) rst)
             (cons (cons v 1) accum))]
        [else (list (cons v 1))])))
  (sort list-of-item-frequency-pairs > #:key cdr))

;; version 2 [[ that's the version I would have written ]]
(define (rank-set-elements/2 list-of-sets)
  (define hash-of-item-frequencies
    (for*/fold ([frequencies (hash)]) ([a-set list-of-sets] [sym a-set])
      (hash-update frequencies sym add1 0)))
  ;; I would fuse the next two lines 
  (define list-of-item-frequency-pairs (hash->list hash-of-item-frequencies))
  (sort list-of-item-frequency-pairs > #:key cdr))

;; unit tests <
(define-syntax-rule 
  (test rank)
  ;; => 
  (module+ test 
    (define input
      (shuffle
       (map list->set
            '((a)
              (a b)
              (a b c)
              (a b c d)
              (a b c d e)))))
    
    (define expected
      '((a . 5)
        (b . 4)
        (c . 3)
        (d . 2)
        (e . 1)))
    
    (check-equal? (rank '()) '())
    (check-equal? (rank input) expected)))

(test rank-set-elements/1)
(test rank-set-elements/2)



Posted on the users mailing list.