[racket] Struct fields in struct-info + match enhancements

From: Sam Tobin-Hochstadt (samth at cs.indiana.edu)
Date: Wed Jul 9 14:32:41 EDT 2014

Ian,

I don't understand where you're going with this at all. We shouldn't
build anything new on top of unhygienically pasting bits of structs
together. Instead, we should extend the static struct info to contain
the names of fields, like I suggested. As you point out, these aren't
bound names, they're symbols, and should be represented that way (and
treated that way by `struct-copy`/`match`). If we then separately want
to add a means of renaming these fields on export, then it's just a
change to the `struct-out` form which provides a different bit of
static struct info, with different names.
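
To make the shape of that suggestion concrete, here is a minimal
sketch, not a worked-out design. The names `extended-info`, `Foo*`,
and `field-ref` are hypothetical and are not part of Racket; only
`prop:struct-info`, `extract-struct-info`, and `syntax-local-value`
are existing API. The sketch assumes a struct with no supertype and
with all fields known.

```racket
#lang racket
(require (for-syntax racket/base racket/struct-info))

(struct Foo (x y) #:transparent)

(begin-for-syntax
  ;; Hypothetical wrapper (not part of Racket): the usual static struct
  ;; info for a struct, plus its field names as plain symbols.
  (struct extended-info (base-id fields)
    #:property prop:struct-info
    ;; Existing consumers such as `match` still see the ordinary
    ;; six-element list.
    (lambda (self)
      (extract-struct-info
       (syntax-local-value (extended-info-base-id self))))))

;; Hypothetical binding: Foo's static info plus the symbols x and y.
(define-syntax Foo* (extended-info #'Foo '(x y)))

;; Hypothetical form: (field-ref struct-id field-name expr) selects the
;; accessor by comparing symbols, never by building an identifier such
;; as Foo-y out of pieces.
(define-syntax (field-ref stx)
  (syntax-case stx ()
    [(_ struct-id field-name e)
     (let ()
       (define info (syntax-local-value #'struct-id (lambda () #f)))
       (unless (extended-info? info)
         (raise-syntax-error #f "no field names recorded" stx #'struct-id))
       ;; The accessor list in struct info is stored in reverse order.
       (define accessors
         (reverse (list-ref (extract-struct-info info) 3)))
       (define sym (syntax-e #'field-name))
       (define acc
         (for/first ([f (in-list (extended-info-fields info))]
                     [a (in-list accessors)]
                     #:when (eq? f sym))
           a))
       (unless acc
         (raise-syntax-error #f "unknown field" stx #'field-name))
       #`(#,acc e))]))

;; Select field y of a Foo instance by name.
(field-ref Foo* y (Foo 1 2))
```

Because the field names are symbols attached to the static info, a
renaming export would only need to supply a different list of symbols;
the accessor identifiers themselves stay untouched.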

Sam

On Tue, Jul 8, 2014 at 11:56 PM, J. Ian Johnson <ianj at ccs.neu.edu> wrote:
> I finally started working on this, and I've run into a snag.
> tl;dr hygiene is hard, let's go shopping.
>
> The idea is that a field identifier will match positionally with its accessor identifier in the extended-struct-info. A simple linear scan with free-identifier=? should work.
> However, field identifiers are not defined by the struct form, so we cannot meaningfully refer to them across modules. They are an aspect of a name, but not names themselves.
> Defining them is also a non-starter. Consider (struct Foo (common-name)). If a module using Foo defines common-name, there is then a collision.
>
> This problem appears unsolvable at first. Struct field accessors feel like a "compound" identifier of both the struct name and a field name, but that isn't the case in reality.
>
> struct-copy has problematic hygiene for a related reason that I originally thought a simple error. But now the truth of the problem is emerging.
>
> Let's play with some litmus tests, shall we? The match enhancement I proposed will need to find struct accessor names in the same way that struct-copy does, so let's look at the thing that already exists and toy around with possible alternatives.
>
> Problem 1:
> #lang racket/load
> (module A racket
>   (struct Foo (x) #:transparent)
>   (provide Foo (rename-out [Foo-x Foo-y])))
>
> (module B racket
>   (require 'A)
>   (equal? (struct-copy Foo (Foo 0) [y 1])
>           (Foo 1)))
>
> (require 'B)
> ;=> #t
>
> Foo doesn't really have a field "y," but the hygiene violation of struct-copy allows it. Should it? I'm not even sure. I would want to say that Foo's x gets renamed to y, not Foo-x is now Foo-y. The accessor functions are a nicety on top of make-struct-type however. There is no intrinsic connection between the struct and all these defined functions.
>
> Problem 2:
> What I would care to see and would expect when pushing the boundaries of hygiene, is the acceptance of the following program:
>
> (module C racket
>   (struct Bar (y) #:transparent)
>   (provide Bar-y (rename-out [Bar Foo])))
>
> (module D racket
>   (require 'C)
>   (struct-copy Foo (Foo 0) [y 1])) ;; syntax error
>
> We might expect that since Foo is really just Bar in Foo's clothing, the y accessor should be findable, right?
> I mean, struct-copy is at least sane enough to check that the generated accessor identifier is associated with the struct-info so the following is still invalid:
>
> #lang racket/load
> (module α racket
>   (struct Bar (y))
>   (define (Foo-y x) (add1 (Bar-y x)))
>   (provide Bar-y Foo-y (rename-out [Bar Foo])))
>
> (module β racket
>   (require 'α)
>   (struct-copy Foo (Foo 0) [y 1])) ;; syntax error
>
> The identifier shouldn't be generated though. This additional check shouldn't be necessary. It's a band-aid on the wart.
>
> If the struct accessors are not provided, then struct-copy may not do its job. The error message in this program is not quite right (the identifier is associated, just not available):
> #lang racket/load
> (module A racket
>   (struct Foo (x) #:transparent)
>   (provide Foo))
>
> (module B racket
>   (require 'A)
>   (equal? (struct-copy Foo (Foo 0) [x 1]) (Foo 1))) ;; wonky syntax error
>
> How about if the identifier is available, but not textually what we're expecting (unhygienically)?
> That is, Foo's extended-struct-info would have field x associated with #'Foo-x, which is free-identifier=? to #'Qux.
>
> #lang racket/load
> (module A racket
>   (struct Foo (x) #:transparent)
>   (provide Foo (rename-out [Foo-x Qux])))
>
> (module B racket
>   (require 'A)
>   (equal? (struct-copy Foo (Foo 0) [x 1]) (Foo 1))) ;; again syntax error
>
>
> The best solution I can imagine is for a mechanism for require and provide to coordinate information about compound identifiers. For structs, renaming Foo-x to Foo-y is not an admission that field x is to be known as field y for Foo. The field is still x, as that is how it is provided, as long as the provision doesn't rename the field as well.
> The fields aren't defined identifiers in the traditional sense. They only have meaning in relation to their owners, the structs. They will behave differently if say x is a field of Foo and x is also a provided identifier.
>
> Instead of simply (struct-out struct-name), perhaps
> (struct-out struct-form fields-op)
> struct-form ::= struct-name-form
>               | #:no-constructor struct-name-form
> struct-name-form ::= struct-name
>                    | (old-struct-name new-struct-name)
> fields-op ::=
>             | #:only-fields [fld-form ...]
> fld-form ::= field-name-form
>            | [#:no-mutator field-name-form]
>            | [#:no-accessor field-name-form]
> field-name-form ::= field-name
>                   | (old-field-name new-field-name)
>
> I'm not entirely sure how to get this all to work. I'd need some mflatt guidance.
> -Ian
> ----- Original Message -----
> From: "Sam Tobin-Hochstadt" <samth at cs.indiana.edu>
> To: "J. Ian Johnson" <ianj at ccs.neu.edu>
> Cc: "users" <users at racket-lang.org>
> Sent: Wednesday, May 14, 2014 1:18:58 PM GMT -05:00 US/Canada Eastern
> Subject: Re: [racket] Struct fields in struct-info + match enhancements
>
> Here's my suggestion:
>
> - We add an `extract-extended-struct-info` procedure which always
> produces a struct that implements `prop:struct-info` (it's easy to use
> a simple struct to wrap the list when needed)
> - We change `struct` & `define-struct` to use a structure implementing
> `prop:struct-info`, but with an additional field holding the list you
> want (or perhaps this should be a new structure property)
> - When `match` encounters a struct reference, it uses
> `extract-extended-struct-info`, checks if it's an instance of this
> extended struct, and if it is, uses that info.
> - When `match` encounters a struct reference that doesn't support the
> new behavior, the sort of keyword syntax you describe is a syntax
> error.
>
> I think this (a) preserves backwards compatibility (b) supports future
> extension (c) accomplishes what you need.
>
> Sam
>
> On Wed, May 14, 2014 at 1:08 PM, J. Ian Johnson <ianj at ccs.neu.edu> wrote:
>> I'm talking about the transformer binding. Specifically the result of extract-struct-info that both struct-copy and match use in their expansion, which produces a six-element list that satisfies struct-info?.
>> This six-element list has
>> optional identifier bound to type descriptor
>> optional identifier bound to type constructor
>> optional identifier bound to type predicate
>> list of field accessor identifiers (optional last value of #f)
>> list of optional field mutator identifiers
>> optional super type identifier
>>
>> I want a 7th element that is the list of field identifiers themselves as given to the struct form or the define-signature form (which I believe produces the expected struct-info transformer binding, so I don't have to actually change define-signature). The expectation is that the field names and the field accessor names correspond 1-to-1.
>>
>> 7 is not 6, thus backwards-incompatible.
>> I know many programs use positional accessors rather than match on the whole structure of the list, so I imagine adding a 7th element is not very intrusive.
>> -Ian
>> ----- Original Message -----
>> From: "Sam Tobin-Hochstadt" <samth at cs.indiana.edu>
>> To: "J. Ian Johnson" <ianj at ccs.neu.edu>
>> Cc: "users" <users at racket-lang.org>
>> Sent: Wednesday, May 14, 2014 12:42:56 PM GMT -05:00 US/Canada Eastern
>> Subject: Re: [racket] Struct fields in struct-info + match enhancements
>>
>> On Wed, May 14, 2014 at 12:35 PM, J. Ian Johnson <ianj at ccs.neu.edu> wrote:
>>> tl;dr if you use struct-info in your programs, I might break them. Please continue reading.
>>>
>>> I had a PR a while ago suggesting a change to struct-copy due to its unhygienic nature with fields. It did not go through since there wasn't enough information in the struct-info to separate the struct-name and the field-name. Because struct-info does not have a procedural-only interface, changing it to instead or also hold the individual field identifiers would be backwards incompatible. However, I also expect that struct-info manipulation outside of core Racket is rare.
>>>
>>> Is there anyone out there that would be affected by this change that would be unwilling to make slight modifications to support the new struct-info?
>>> I ask not because of struct-copy itself, but for an additional enhancement to racket/match: named field selection from structs instead of positional only.
>>> I'm getting bitten by pervasive refactoring woes whenever I add fields to structs. All of my match patterns must change to have an extra _ somewhere.
>>
>> I don't understand why this would require a backwards-incompatible
>> change to struct-info.
>>
>> Also, this discussion is confusing because it's not clear whether you
>> mean the dynamic value produced by the `struct-info` procedure, or the
>> structure type transformer binding. I think you mean the latter, in
>> which case I expect you could do what you want by implementing
>> `prop:struct-info` appropriately with an extended structure, and
>> handling existing values (such as six-element lists) appropriately
>> with defaults.
>>
>> Sam


Posted on the users mailing list.