[racket] Puzzled about type inference

From: Alexander D. Knauth (alexander at knauth.org)
Date: Tue Aug 5 09:11:52 EDT 2014

First of all, thank you all of you, I have had similar problems with not knowing what to annotate and what not to.  

On Aug 5, 2014, at 5:36 AM, Norman Gray <norman at astro.gla.ac.uk> wrote:

> 
>> If I were to move this code to Typed Racket, I'd add a type declaration to each identifier just like in any other (explicitly, statically) typed programming language. That's what I did, and there was no problem checking and running your code. 
> 
> So the pattern I see there, which both you and Sam have illustrated, is:
> 
>  * annotate as far 'down' the tree as possible, so that type information can propagate upwards in the parse tree
> 
>  * If one liberally uses named bindings (recommended on other style grounds, in any case), then each of these is a natural location for a type annotation
> 
>  * when a polymorphic-typed function takes polymorphic arguments, annotate the arguments, not the function (that is, I see that, in (map car '(foo ...))  both you and Sam annotated the car, not the map); this is in harmony with 'annotate as far down as possible', for some value of 'down'.
> 
> Sam said:
> 
>> The most signficant differences are (1) I'm not using `cast`, which
>> should be avoided when possible. `cast` is a runtime operation, and
>> might be hiding arbitrary bugs. (2) I've annotated the `let`-bound
>> variables, which makes life easier on the rest of the type checker and
>> helps improve error messages.
> 
> Ah, now this is something I've meant to ask about: what's the difference between ann and cast?

The difference is that cast checks the type with a contract (even if it already has that type or some subtype), while ann checks at type check time that it has the type given (or some subtype) and gives a compile-time type-check error if it doesn’t.  

> 
> The manual says:
> 
>> (ann e t)
>> Ensure that e has type t, or some subtype. The entire expression has type t. This is legal only in expression contexts.
> 
>> (cast e t)
>> The entire expression has the type t, while e may have any type. The value of the entire expression is the value returned by e, protected by a contract ensuring that it has type t. This is legal only in expression contexts.
> 
> I've never been sure whether the difference between "Ensure that e has type t" (which sounds rather runtime-y) and "The entire expression has the type t" is significant.  The only difference I could parse from this is that `cast` includes a contract, which would appear to make it _safer_ than `ann`, in the sense that if this were accidentally given a wrong-typed value than it would cause an immediate error.

Ann doesn’t do anything at runtime.
And cast is not safer than ann; ann gives you a type-check error at compile-time, while cast gives you a contract violation at runtime.  
And actually I think cast could be considered slightly unsafe because there are ways of using it that can fool the type-checker into thinking values have types that they don’t have:
http://bugs.racket-lang.org/query/?cmd=view%20audit-trail&database=default&pr=13626&return_url=http%3A%2F%2Fbugs.racket-lang.org%2Fquery%2F%3Fdatabase%3Ddefault%3Bdebug%3D%3BState%3Dany%3Bignoreclosed%3DIgnore%2520Closed%3BSynopsis%3Dcast%2520is%2520unsound%3Bmultitext%3D%3Bcolumns%3DState%3Bcolumns%3DSynopsis%3Bcolumns%3DCategory%3Bcolumns%3DLast-Modified%3Bcolumns%3DRelease%3Bcmd%3Dsubmit%2520query%3Bsortby%3DNumber

> 
> I presume that the difference is that `ann` is about static type analysis, whereas `cast` is for the situation where that static analysis can't be done.  Would I be correct that in the above case I can and should write
> 
>     (ann '(#(1 2) #(3 4) #(5 6)) (Listof VNI)
> 
> since this is saying that this expression has a more specific type than the TC would work out by itself, and it can verify that the restriction is possible.  In the real program, these numbers appear from a SQL query, and so this list is actually of type (Listof (Vectorof (U Integer String))), so that I should indeed be writing
> 
>    (cast (query-rows ...) (Listof VNI))
> 
> because these numbers are coming in, dynamically, from _outside_ the reach of the static TC, and so the annotation needs to be dynamically verified at this point.  That is: `cast` is for the 'boundary' of the type system, whereas `ann` is for adding detail within it?

Well ann can’t add any detail, it can only make types less specific.  
Although it can be useful to solidify the types of mutable things such as vectors and boxes, and I’ve found that sometimes it seems to be necessary for annotating the types of sequences in for loops.  
And for mutable things like vectors and boxes, sometimes you’ll want a less specific type.  
And it can be useful for debugging so that it gives you a type-check error if something doesn’t have the type you think it does.  
Though there might be uses of it that I don’t know about yet.  

> 
> Ah -- I think I've just had a little epiphany.  In which case, it might be useful to transplant a little of that into the ann/cast section of the manual.  The examples shown there don't really bring out the distinction to the naive reader.

I agree.  I had trouble with this too when I started trying to do stuff in typed racket.  

> (Just by the way, `assert` seems to be doing rather similar work to these two, and while I can appreciate a distinction, from the text, I'd struggle to express it compactly -- a compare-and-contrast explanation in the manual would be very illuminating)
> 
> (Does this imply that when TR code is called from untyped code, the arguments are implicitly `cast` at call-time?)
> 
>> There are a few things that make this program require more annotations
>> than I'd like.
>> 
>> 1. TR doesn't handle automatically inferring type arguments when
>> polymorphic functions (like `map`) are given polymorphic arguments
>> like `vector->list`. While fixing this would require work, it just
>> needs to be done -- there's no reason we shouldn't be able to make
>> this work.
> 
> Righto -- that's what I wondered, whether it was 'just a matter of code' or whether there was something fundamentally more complicated about this, which I wasn't understanding.
> 
> Thanks, both, for your careful explanations.
> 
> Best wishes,
> 
> Norman
> 
> 
> -- 
> Norman Gray  :  http://nxg.me.uk
> SUPA School of Physics and Astronomy, University of Glasgow, UK
> 
> 
> ____________________
>  Racket Users list:
>  http://lists.racket-lang.org/users



Posted on the users mailing list.