[racket] TR define-new-type, Value Types

From: Ray Racine (ray.racine at gmail.com)
Date: Fri Dec 28 20:35:58 EST 2012

Based on a google of a paper on refinement types in SML, no.  Useful in
their own right, and thanks for pointing out their
(experimental) existence.  I'm thinking more along the line of a quick win
of a weak sort of Dependent Type for value types such as Number and String
by leveraging existent Racket machinery.

See
https://docs.google.com/document/d/10TQKgMiJTbVtkdRG53wsLYwWM2MkhtmdV25-NZvLLMA/edit
for
a Scala endeavor along similar lines.

The goal is to support efficient generative, constrained, sub-types of
primitive value types, specifically String and Number with minimal surgery
to Racket.

Consider:

(define-value-type Age : Integer [1 .. 120])
(define-value-type SSN : String [1 .. 9 | 11] ssn-validation-predicate)

Goals:

1) Avoid boxing/unboxing (struct cell / cell-refs).
2) Create generative sub-types of certain base types, String, Number to
satisfy TR.  Note they are not opaque types but, i.e.  T <: String

Item 2) means

;; works as Ages are Integers
(: add-ages (Age Age -> Age))
(define (sum-ages a1 a2)
  (Age (+ a1 a2)) ;; + defined on Integers

;; not the same as (define-type Age Integer) because ...
(sum-ages 12 16) ;; fails as Integers are not Ages

;; lifting Integer to Age involves a runtime contract check, but no
allocation.
(sum-ages (Age 12) (Age 16)) ;; fine, no allocation

A hand waved way of getting there, which got ugly quick and as I typed.   I
was sort of brainstorming if I could get Value Types without any Racket
internal surgery and with no more than a bit of macrology.

So waving both hands wildly...

0) Modify the TR `cast' operator to recognize Value Type structures.

 a) The generated contract from the `cast' operator of a value type to an
appropriate Value Type structure succeeds at runtime for an instance of the
value type.
 b) The generated contract from the `cast' operator applies the
gen:validator on the Value Type structure as part of the contract.

1) Extend the struct: macro to allow a struct: parent to be not only
another struct: but a  [struct: | String | Number]

  a) IF the parent is a struct: nothing new to do here.

  b) If parent is a value type, String or a Number (value type)
    - This is a Value Type structure.
    - A value type structure has only one mandatory value which is of the
same type as the parent.
    - A Value Type structure is -sealed- and may not be used as the parent
in another struct: definition.
    - A Value Type structure's constructor is a (A -> A) pass-thru of the
value.  i.e., the struct: is never allocated to wrap the value.
    - A Value Type structure _may_ have a gen:validate generic method
associated with it.

4) To avoid creating a true Value Type structure instance via low level
apis, they would need to be modified to prohibit creating any instance of a
Value Type structure.

What we are trying to achieve is all of the type checking from TR using
struct: to generate a new type at compile time, yet at runtime the instance
values of the Value Type are the primitive values and are NOT manifested as
the struct: instances.

Example:
Create an SSN Value Type.

;; An SSN is String value, of length 9 or 11, which is validated by a
regular expression.

(: ssn-validation-predicate (String -> Boolean : SSN))
(define (ssn-validation-predicate str-ssn)
  (regexp-match? ssn #rx(....)))

(define-value-type SSN String [9 | 11] ssn-validation-predicate) ;;

The above roughly expands to:

(struct: SSN String ([value : String])
   #:methods gen:validate ssn-validation-predicate)

(define SSN-validator-contract (and/contract ....))  ;;; combines
string-length 9|11 check with ssn-validation-predicate into a contract

The struct: macro notes that this is a Value Type structure definition
because its parent is a value type, String, and not another struct:.  So
the generated SSN constructor function avoids creating an actual structure
at runtime and allows a string value as successfully cast to a SSN after
applying any associated validation contract.

(: SSN (String -> SSN))
(define (SSN ssn)
  (cast ssn SSN))

In the above ...
 - `cast' knows we are casting to a Value Type, SSN, so generated runtime
contract allows a String value (and _not_ a SSN struct type instance) to be
"passed-thru" but lifted to type SSN for TR purposes.
 - Therefore, the cast fails an actual instance of an SSN structure, if one
somehow managed to construct an instance.
 - As part of the `cast' generated contract the SSN-validator-contract and
length checks are combined and applied interstitial in the pass-thru of
(String -> String).

And finally since SSN at runtime is a string value, at compile time it's a
subtype of String so...

(substring (SSN "123-45-6789") 0 3) ;; works at TR compile time checking
and at runtime running
(substring (SSN "123x456-5689") 0 3) ;; fails validation at runtime, though
a sufficiently smart compiler would apply the contract validation check at
compile time for values known at compile time.

Given:

(: parse-ssn (SSN -> (Listof String)))
(define (parse-ssn ssn)
   (regexp-split ssn #rx"-"))

(parse-ssn "123-456-6789") ;; nope as strings are not SSNs
(parse-ssn (SSN "123-456-6789")) ;; works but runtime representation
remained as a string value, no struct: box/unbox.






On Fri, Dec 28, 2012 at 5:58 PM, Eric Dobson <eric.n.dobson at gmail.com>wrote:

> Do refinement types work for what you want?
>
>
> http://docs.racket-lang.org/ts-reference/Experimental_Features.html?q=refinement#(form._((lib._typed-racket/base-env/prims..rkt)._declare-refinement))
>
> #lang typed/racket
> (declare-refinement even?)
> (: two (Refinement even?))
> (define two
>   (let ((x 2))
>     (if (even? x) x (error 'bad))))
>
> There are a couple of issues with them, mostly that they are not sound
> when dealing with mutable objects or non pure functions, which allows you
> to break the soundness of TR.
> http://bugs.racket-lang.org/query/?cmd=view+audit-trail&pr=13059
>
>
> On Fri, Dec 28, 2012 at 2:17 PM, Ray Racine <ray.racine at gmail.com> wrote:
>
>> Any plans something along the lines of Scala's proposed Value Types.
>>
>> A path:
>>
>> Allow for "special" struct: decls (vstruct: ?) where the parent is a
>> limited subset of non structure parent types (base value types such as
>> String, Number).
>>
>> These special structures MUST contain one and only one value of the
>> parent "special" type or it is a compile error.
>> The structure constructor does not construct the wrapping structure but
>> instead passes through the wrapped value, but *always* invokes the
>> validator during pass-thru.
>> TR treats the type as a subtype of the base value type.
>>
>> e.g.
>>
>> (struct: Identifier String ([value : String])
>>   #:methods gen:validator (lambda (thing) ...) ;; is of type (Any ->
>> Boolean : Identifier))
>>
>> (define id (Identifier "myidentifier")) ;; validator invoked, no
>> structure was allocated, `id' is just a String value, is a subtype of
>> String.
>>
>> (define uc-id (Identifer (string-upcase id))) ;; validator invoked, as id
>> is a just a string no unboxing in (string-upcase id), in fact no
>> allocations here at all.
>>
>> Under the covers the Identifier constructor never creates the structure,
>> it acts as a pass through id : (String -> String) function.  i.e. the
>> runtime representation of `id' is always as a String so any struct <->
>> value boxing / unboxing is avoided.   I think there is enough machinery in
>> place to get pretty close to this.
>>
>> What is gained?
>>
>> What is done internally in TR defining Natural, Index,
>> Exact-Positive-Integer can now be done without special internal defining,
>> just another constrained base type.  One can start to define things like
>> Age [1 .. 120].
>> Another is IMHO a HUGE source of program error is just not enough time to
>> do proper validation at IO boundries where entering data is of the form
>> Strings and Bytes and it needs to be lifted.
>>
>> Consider the following typical use case from Amazon's AWS API, a Tasklist
>> parameter.
>>
>> Parameter - Tasklist : String[1-256]
>>
>> Specifies the task list to poll for activity tasks.
>>
>> The specified string must not start or end with whitespace. It must not
>> contain a : (colon), / (slash), | (vertical bar), or any control characters
>> (\u0000-\u001f | \u007f - \u009f). Also, it must not contain the literal
>> string "arn".
>>
>> Most likely, I'll punt.
>>
>> (: call-it (String ... -> ...))
>> (define (call-it task-list ...)
>>
>> If I'm ambitious today.
>>
>> ;; would prefer (define-new-type Tasklist String)
>> (define-type Tasklist String) ;; tighten things later down the road,
>> <sigh> none type generative
>>
>> (: call-it (Tasklist ... -> ...))
>> (define (call-it task-list ...)
>>
>> What I'd like to do.
>>
>> (define-value-type Tasklist String [1 .. 256] (lambda (this) ....)) ;;
>> mad use of regexp in validator fn (Any -> Boolean : Tasklist)
>>
>> (call-it (Tasklist "mytasklist") ...)
>>
>> (call-it (Tasklist "arn:bad/tasklist") ...)
>>
>> (define-value-type Age Integer [1 .. 120]) ;; no special validation
>> beyond bounds check.
>>
>>
>>
>>
>>
>>
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20121228/7d045c24/attachment-0001.html>

Posted on the users mailing list.