[racket-dev] A Const type constructor

From: Neil Toronto (neil.toronto at gmail.com)
Date: Wed Jul 25 17:37:39 EDT 2012

On 07/25/2012 10:31 AM, Neil Toronto wrote:
> On 07/25/2012 10:26 AM, Sam Tobin-Hochstadt wrote:
>> On Wed, Jul 25, 2012 at 9:29 AM, Neil Toronto <neil.toronto at gmail.com>
>> wrote:
>>> After thinking about it, I don't want an Immutable-Vector type, for
>>> which v
>>> : Immutable-Vector proves (immutable? v) is #t. That would be seriously
>>> annoying to users of a vector library.
>>>
>>> What if TR had a notion of const-ness, like in C? Suppose (Vectorof
>>> A) is a
>>> subtype of (Const-Vectorof B) when A is a subtype of B, and
>>> (Const-Vectorof
>>> A) is never a subtype of (Vectorof B).
>>
>> How exactly are these different?  An immutable vector is a vector, but
>> could be covariant, which seems like what you want. However, a mutable
>> vector can't be treated as an immutable vector.
>
> I don't want to tell Typed Racket "This vector's values never change." I
> want to tell it "Code in this scope never changes this vector's values."
> That's why I called it "const" instead of "immutable".

To elaborate: an Immutable-Vector type would have these problems:

  * It would put a conversion burden on users.
  * It would be slow.

Most vector-producing Racket functions produce mutable vectors. Users 
would have to convert them to immutable to get any decent use out of a 
math/vector library. I would want to convert them to immutable before 
returning them from functions, to decrease the burden on users. 
Conversion makes a copy:

   > (define x (vector 1 2 3))
   > (eq? x (vector->immutable-vector x))
   #f

I can roll my own Immutable-Vector-like type right now, but it has 
exactly the same problems as immutable vectors; i.e. it puts a 
conversion burden on users and it's slow.

I won't write a math/vector library until I can do better.

The homebrew Immutable-Vector-like thing is a good way to illustrate 
what I'm asking for in a Const type constructor, though, so here it is.

Suppose I want to write `vector=', which lifts `=' to operate pointwise 
on vectors using something similar to `vector-andmap'. Here's the type:

   (: vector= ((Vectorof Number) (Vectorof Number) -> Boolean))

Of course, this won't work out:

   (define xs (vector 1.0 2.0))
   (define ys (vector 1 2))
   ;(vector= xs ys)  ; Won't type

So I make a read-only "view" of vectors, with conversion functions:

   (struct: (A) const-vector ([length : Index] [fun : (Index -> A)]))

   (: vector->const-vector (All (A) ((Vectorof A) -> (const-vector A))))
   (: const-vector->vector (All (A) ((const-vector A) -> (Vectorof A))))

Now I can write a wrapper for `vector=':

   (: const-vector= ((const-vector Number) (const-vector Number)
                                           -> Boolean))
   (define (const-vector= v1 v2)
     (vector= (const-vector->vector v1) (const-vector->vector v2)))

I could use it like this:

   (define x (vector->const-vector xs))
   (define y (vector->const-vector ys))
   (const-vector= x y)  ; Types, and returns #t

THIS NEXT PART IS CRITICAL. *I don't care* whether someone alters the 
vector that backs a const-vector:

   (vector-set! xs 0 2.0)
   (const-vector= x y)  ; Returns #f now, but I don't care

The only thing I care about is having a fast, flat, covariant collection 
type. But this one is very slow. Timings for 100,000,000 refs on my 
system (run in command-line Racket):

   vector-ref               270ms
   unsafe-vector-ref        130ms
   const-vector-ref         1950ms
   unsafe-const-vector-ref  1350ms

About 10x slower overall. This is unacceptable for a math library.

Let's look at some other, worse options. First, huge case-> types:

   (: vector=
      (case-> ((Vectorof Integer) (Vectorof Integer) -> Boolean)
              ((Vectorof Float) (Vectorof Float) -> Boolean)
              ((Vectorof Float-Complex) (Vectorof Float-Complex)
                                        -> Boolean)
              ((Vectorof Exact-Number) (Vectorof Exact-Number)
                                       -> Boolean)
              ((Vectorof Exact-Rational) (Vectorof Exact-Rational)
                                         -> Boolean)
              ((Vectorof Real) (Vectorof Real) -> Boolean)
              ((Vectorof Number) (Vectorof Number) -> Boolean)))

The function body takes 7 times longer to typecheck than it would with a 
single function type. It would actually be better to do a cartesian 
product of type parameters (so I can test a (Vectorof Integer) against a 
(Vectorof Float)), which takes 49 times longer. Ouch.

A terrible option, which would be like not having a type system:

   (: vector= ((Vectorof Any) (Vectorof Any) -> Boolean))

Internally, `vector=' would use occurrence typing to make sure it 
compares only numbers.

A Const type like (Const (Vectorof A)) would have the same properties as 
a (const-vector A). Its inhabitants would be covariant, flat 
collections. However, it would be *fast* because the const-ness would be 
a property checked statically by the type system instead of hacked in 
using the covariance of function return types.

I've attached the `const-vector' definitions in case you want to 
double-check that I've made every effort to make const-vector-ref fast.

Neil ⊥

-------------- next part --------------
#lang typed/racket

(require racket/unsafe/ops)

;; Use instead of vector-ref to disable partial bounds checking elimination pessimization
(define :vector-ref vector-ref)

(: vector= ((Vectorof Number) (Vectorof Number) -> Boolean))
(define (vector= vs1 vs2)
  (define n (vector-length vs1))
  (cond [(= n (vector-length vs2))
         (let loop ([i 0])
           (cond [(i . < . n)  (and (= (vector-ref vs1 i) (vector-ref vs2 i))
                                    (loop (+ i 1)))]
                 [else  #t]))]
        [else  #f]))

(define xs (vector 1.0 2.0))
(define ys (vector 1 2))
;(vector= xs ys)  ; Won't type

;; So make a ref-only "view" struct for vectors; a vector is not mutable
;; through one of these (but is from anything that has a pointer to an original)
(struct: (A) const-vector ([length : Index] [fun : (Index -> A)]))

(: vector->const-vector (All (A) ((Vectorof A) -> (const-vector A))))
;; Convert from vector `vs' to a const-vector backed by `vs'
(define (vector->const-vector vs)
  (const-vector (vector-length vs) (λ: ([i : Index])
                                     ;; Will be bounds-checked by const-vector-ref
                                     (unsafe-vector-ref vs i))))

(: const-vector->vector (All (A) ((const-vector A) -> (Vectorof A))))
;; Convert from a const vector to a new vector
(define (const-vector->vector v)
  ;; Dunno why build-vector doesn't work here; it should according to its type
  (list->vector (build-list (const-vector-length v) (const-vector-fun v))))

(: const-vector= ((const-vector Number) (const-vector Number) -> Boolean))
;; Wrapper for vector=
(define (const-vector= v1 v2)
  (vector= (const-vector->vector v1) (const-vector->vector v2)))

(define x (vector->const-vector xs))
(define y (vector->const-vector ys))
(const-vector= x y)  ; Types, and returns #t

(vector-set! xs 0 2.0)
(const-vector= x y)  ; Returns #f now, but I don't care

(: unsafe-const-vector-ref (All (A) ((const-vector A) Index -> A)))
;; Unsafely returns element at i
(define (unsafe-const-vector-ref v i)
  ((const-vector-fun v) i))

(: const-vector-ref (All (A) ((const-vector A) Integer -> A)))
;; Returns element at i with bounds checking
(define (const-vector-ref v i)
  (cond [(and (0 . <= . i) (i . < . (const-vector-length v)))
         (unsafe-const-vector-ref v i)]
        [else  (raise-type-error 'const-vector-ref
                                 (format "Index < ~a" (const-vector-length v))
                                 1 v i)]))

;; Timing tests for 100,000,000 refs

(printf "100000000 vector-ref:~n")
;; Averages 270ms
(for ([_  (in-range 5)])
  (time (for ([_  (in-range 100000000)])
          (:vector-ref xs 0))))

(printf "100000000 unsafe-vector-ref:~n")
;; Averages 130ms
(for ([_  (in-range 5)])
  (time (for ([_  (in-range 100000000)])
          (unsafe-vector-ref xs 0))))

(printf "100000000 const-vector-ref:~n")
;; Averages 1950ms
(for ([_  (in-range 5)])
  (time (for ([_  (in-range 100000000)])
          (const-vector-ref x 0))))

(printf "100000000 unsafe-vector-ref:~n")
;; Averages 1350ms
(for ([_  (in-range 5)])
  (time (for ([_  (in-range 100000000)])
          (unsafe-const-vector-ref x 0))))

Posted on the dev mailing list.