[racket-dev] Semantics of struct-out with except-out

From: J. Ian Johnson (ianj at ccs.neu.edu)
Date: Tue Jul 15 09:23:01 EDT 2014

I'm working on enhancing struct-info to carry field names as symbols to do nice hygienic things:

http://lists.racket-lang.org/users/archive/2014-July/063271.html

I now see that struct-out always provides all field accessors in the static struct-info associated with the struct identifier.
This means the following produces (list 0 1) instead of an error saying that Foo-x is undefined, or something along those lines:

#lang racket/load
(module A racket
  (struct Foo (x y))
  (provide (except-out (struct-out Foo)
                       Foo-x)))
(module B racket
  (require 'A)
  (match (Foo 0 1)
    [(Foo x y) (list x y)]))
(require 'B)
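
One way to see this directly is to read the static struct-info at expansion time. The following is a minimal sketch, not code from the feature branch: `accessor-names` and module `C` are hypothetical names introduced here, and the sketch assumes the documented layout of `extract-struct-info`, where the fourth element is the accessor list.

```racket
#lang racket/load
(module A racket
  (struct Foo (x y))
  (provide (except-out (struct-out Foo)
                       Foo-x)))
(module C racket
  (require 'A (for-syntax racket/struct-info))
  ;; accessor-names is a hypothetical helper macro, not part of Racket.
  ;; It looks up the static struct-info bound to a struct identifier and
  ;; expands to a quoted list of the accessor names recorded there.
  (define-syntax (accessor-names stx)
    (syntax-case stx ()
      [(_ id)
       (let* ([info (extract-struct-info (syntax-local-value #'id))]
              ;; element 3 is the accessor list; an entry may be #f
              [accs (list-ref info 3)])
         (with-syntax ([(name ...)
                        (map (lambda (a) (and a (syntax-e a))) accs)])
           #''(name ...)))]))
  (define names (accessor-names Foo))
  (provide names))
(require 'C)
;; Foo-x should still be listed, even though A did not provide it.
names
```

If Foo-x shows up in that list, match in module B has everything it needs to destructure both fields, which is consistent with the (list 0 1) result above.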

Making struct-out less greedy about what it provides would require a backwards-incompatible change. The question, then, is: should we (I) do it?
Part of me says yes, for "intuitive semantics". Part of me says no, because it implies that struct-info values would have to be meticulously checked and rebound to mangled identifiers carrying the new information whenever they pass through a provide-spec that can affect struct identifiers.
Should that burden be pushed to future provide-spec implementors? Should it already have been?
The alternative is to provide special syntax in struct-out to do all the "common" provide-spec stuff and still not play nice with other provide-specs.
The upside to this is no name mangling, but the downside is yet more special syntax for what provide-specs should already do, IMHO.

I'm planning to extend struct-out to allow renaming the fields associated with a struct so the following (contrived example) is possible:

#lang racket/load
(module A racket
  (struct Foo (x y))
  (provide (struct-out Foo #:rename-fields ([x y] [y x]))))
(module B racket
  (require 'A)
  (match (Foo 0 1)
    [(Foo [#:x 0] [#:y 1]) #f]
    [(Foo [#:x 1] [#:y 0]) #t]))
(require 'B)
;=> #t

The #:rename-fields is pseudo-code. Ideally we want to provide any or all struct identifiers with or without renaming.
Fields are special to the struct-info itself, so field renaming will need to be tied to struct-out.
If a struct's field accessor is not provided, that means it is not in the struct-info. That means match looks weird and asymmetric, but so be it:

#lang racket/load
(module A racket
  (struct Foo (x y))
  (provide (except-out (struct-out Foo)
                       Foo-x)))
(module B racket
  (require 'A)
  (match (Foo 0 1)
    [(Foo y) y]))
(require 'B)
;=> 1

If we don't want match to be asymmetric, but instead require that the hidden field's position be left uninspected (so that the unsafe-struct-ref logic in match still works), that would require a representation change for struct-info. Instead of accessors being given as (list id ... #f) (optional #f for missing super type info), they would be given as (list id-or-#f ... 'unknown) (optional 'unknown for missing super type info).
The 'unknown instead of #f is probably the only thing that would break existing code.
If anyone is using except-out with struct-out and depending on a full struct-info, they have a bug, IMHO.
I'm willing to update the current codebase to the new representation.
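
To make the proposed representation concrete, here is a small sketch of how a consumer might read it. Both function names are hypothetical, and plain symbols stand in for the identifiers a real struct-info would carry; this only illustrates the (list id-or-#f ... 'unknown) shape described above.

```racket
#lang racket
;; Hypothetical helpers over the PROPOSED accessor list.
;; Symbols stand in for accessor identifiers; #f marks a withheld accessor.

;; Split off an optional trailing 'unknown (missing super type info).
;; Returns the per-field slots and whether the field list is complete.
(define (split-proposed-accessors accs)
  (define complete?
    (not (and (pair? accs) (eq? (last accs) 'unknown))))
  (define slots (if complete? accs (drop-right accs 1)))
  (values slots complete?))

;; Indices of the slots whose accessor was not provided.
(define (hidden-positions slots)
  (for/list ([s (in-list slots)]
             [i (in-naturals)]
             #:unless s)
    i))
```

Under this reading, #f can no longer double as the "unknown super type" marker, which is why existing code that checks for a trailing #f is the part that would need updating.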

I have a feature branch with everything in it except any changes to struct-out. I did have to make changes to providing contracted structs, due to name mangling needing to carry forward my extended-struct-info. Foreshadowing.

https://github.com/ianj/racket/tree/match-named

Comments welcome.
Thanks,
-Ian

Posted on the dev mailing list.