[racket] Struct fields in struct-info + match enhancements

From: J. Ian Johnson (ianj at ccs.neu.edu)
Date: Tue Jul 8 23:56:42 EDT 2014

I finally started working on this, and I've run into a snag.
tl;dr hygiene is hard, let's go shopping.

The idea is that a field identifier will match positionally with its accessor identifier in the extended-struct-info. A simple linear scan with free-identifier=? should work.
However, field identifiers are not defined by the struct form, so we cannot meaningfully refer to them across modules. They are an aspect of a name, but not names themselves.
Defining them is also a non-starter. Consider (struct Foo (common-name)). If a module using Foo defines common-name, there is then a collision.
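
The scan itself is the easy part. A minimal sketch, assuming the field and accessor identifiers arrive as two parallel lists (field->accessor is a hypothetical helper, not something Racket provides):

```racket
#lang racket/base
;; Hypothetical helper: `fields` and `accessors` are parallel lists of
;; identifiers. Return the accessor at the position of the first field
;; that is free-identifier=? to `fld`, or #f if no field matches.
(define (field->accessor fld fields accessors)
  (for/first ([f (in-list fields)]
              [acc (in-list accessors)]
              #:when (free-identifier=? fld f))
    acc))
```

Since field names are unbound, free-identifier=? can only compare them as unbound identifiers here, which is exactly the cross-module weakness described above.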

This problem appears unsolvable at first. Struct field accessors feel like a "compound" identifier of both the struct name and a field name, but that isn't the case in reality.

struct-copy has problematic hygiene for a related reason that I originally thought a simple error. But now the truth of the problem is emerging.

Let's play with some litmus tests, shall we? The match enhancement I proposed will need to find struct accessor names in the same way that struct-copy does, so let's look at the thing that already exists and toy around with possible alternatives.

Problem 1:
#lang racket/load
(module A racket
  (struct Foo (x) #:transparent)
  (provide Foo (rename-out [Foo-x Foo-y])))

(module B racket
  (require 'A)
  (equal? (struct-copy Foo (Foo 0) [y 1])
          (Foo 1)))

(require 'B)
;=> #t

Foo doesn't really have a field "y," but the hygiene violation of struct-copy allows it. Should it? I'm not even sure. I would want to say that Foo's x gets renamed to y, not Foo-x is now Foo-y. The accessor functions are a nicety on top of make-struct-type however. There is no intrinsic connection between the struct and all these defined functions.
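
To illustrate the last point, here is a small sketch that builds a one-field structure type directly with make-struct-type and then exposes the same slot under two accessor names. The names Foo-x and Foo-y are chosen only to mirror Problem 1:

```racket
#lang racket/base
;; make-struct-type returns five values; none of them mention field names.
(define-values (struct:Foo make-Foo Foo? Foo-ref Foo-set!)
  (make-struct-type 'Foo #f 1 0))

;; The same slot 0 can be wrapped by any number of accessors. The symbol
;; argument only affects the accessor's printed name.
(define Foo-x (make-struct-field-accessor Foo-ref 0 'x))
(define Foo-y (make-struct-field-accessor Foo-ref 0 'y))

(Foo-x (make-Foo 42))
(Foo-y (make-Foo 42))
```

Both calls read the same slot, so nothing at this level says whether the field is "really" x or y.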

Problem 2:
What I would like to see, and would expect when pushing the boundaries of hygiene, is the acceptance of the following program:

(module C racket
  (struct Bar (y) #:transparent)
  (provide Bar-y (rename-out [Bar Foo])))

(module D racket
  (require 'C)
  (struct-copy Foo (Foo 0) [y 1])) ;; syntax error

We might expect that since Foo is really just Bar in Foo's clothing, the y accessor should be findable, right?
I mean, struct-copy is at least sane enough to check that the generated accessor identifier is associated with the struct-info so the following is still invalid:

#lang racket/load
(module α racket
  (struct Bar (y))
  (define (Foo-y x) (add1 (Bar-y x)))
  (provide Bar-y Foo-y (rename-out [Bar Foo])))

(module β racket
  (require 'α)
  (struct-copy Foo (Foo 0) [y 1])) ;; syntax error

The identifier shouldn't be generated though. This additional check shouldn't be necessary. It's a band-aid on the wart.
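
For concreteness, the generate-then-check behavior can be sketched as a macro. This is an illustration in the spirit of what is described above, not struct-copy's actual implementation; guess-accessor is a hypothetical name.

```racket
#lang racket/base
(require (for-syntax racket/base racket/struct-info racket/syntax))

;; Hypothetical macro: glue the struct name and field name together
;; textually, then accept the result only if it is one of the accessors
;; recorded in the struct-info.
(define-syntax (guess-accessor stx)
  (syntax-case stx ()
    [(_ struct-id field-id)
     (let* ([info (extract-struct-info (syntax-local-value #'struct-id))]
            ;; Fourth element: accessor identifiers, possibly ending in #f.
            [accessors (filter values (list-ref info 3))]
            ;; The unhygienic step: a name built from two other names.
            [guess (format-id #'field-id "~a-~a" #'struct-id #'field-id)])
       (if (for/or ([a (in-list accessors)])
             (free-identifier=? a guess))
           guess
           (raise-syntax-error #f
                               "accessor name not associated with the structure type"
                               stx #'field-id)))]))

(struct Foo (x))
((guess-accessor Foo x) (Foo 5))
```

The membership test is the band-aid: it rejects a wrong guess, but it cannot recover the right accessor when the guess fails.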

If the struct accessors are not provided, then struct-copy may not do its job. The error message in this program is not quite right (the identifier is associated, just not available):
#lang racket/load
(module A racket
  (struct Foo (x) #:transparent)
  (provide Foo))

(module B racket
  (require 'A)
  (equal? (struct-copy Foo (Foo 0) [x 1]) (Foo 1))) ;; wonky syntax error

How about if the identifier is available, but not textually what we're expecting (unhygienically)?
That is, Foo's extended-struct-info would have field x associated with #'Foo-x, which is free-identifier=? to #'Qux.

#lang racket/load
(module A racket
  (struct Foo (x) #:transparent)
  (provide Foo (rename-out [Foo-x Qux])))

(module B racket
  (require 'A)
  (equal? (struct-copy Foo (Foo 0) [x 1]) (Foo 1))) ;; again syntax error


The best solution I can imagine is a mechanism that lets require and provide coordinate information about compound identifiers. For structs, renaming Foo-x to Foo-y is not an admission that field x is to be known as field y for Foo. The field is still x, as that is how it is provided, as long as the provision doesn't rename the field as well.
The fields aren't defined identifiers in the traditional sense. They only have meaning in relation to their owners, the structs. They will behave differently if say x is a field of Foo and x is also a provided identifier.

Instead of simply (struct-out struct-name), perhaps
(struct-out struct-form fields-op)
struct-form ::= struct-name-form
              | #:no-constructor struct-name-form
struct-name-form ::= struct-name
                   | (old-struct-name new-struct-name)
fields-op ::=
            | #:only-fields [fld-form ...]
fld-form ::= field-name-form
           | [#:no-mutator field-name-form]
           | [#:no-accessor field-name-form]
field-name-form ::= field-name
                  | (old-field-name new-field-name)

I'm not entirely sure how to get this all to work. I'd need some mflatt guidance.
-Ian
----- Original Message -----
From: "Sam Tobin-Hochstadt" <samth at cs.indiana.edu>
To: "J. Ian Johnson" <ianj at ccs.neu.edu>
Cc: "users" <users at racket-lang.org>
Sent: Wednesday, May 14, 2014 1:18:58 PM GMT -05:00 US/Canada Eastern
Subject: Re: [racket] Struct fields in struct-info + match enhancements

Here's my suggestion:

- We add an `extract-extended-struct-info` procedure which always
produces a struct that implements `prop:struct-info` (it's easy to use
a simple struct to wrap the list when needed)
- We change `struct` & `define-struct` to use a structure implementing
`prop:struct-info`, but with an additional field holding the list you
want (or perhaps this should be a new structure property)
- When `match` encounters a struct reference, it uses
`extract-extended-struct-info`, checks if it's an instance of this
extended struct, and if it is, uses that info.
- When `match` encounters a struct reference that doesn't support the
new behavior, the sort of keyword syntax you describe is a syntax
error.

I think this (a) preserves backwards compatibility (b) supports future
extension (c) accomplishes what you need.
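
A rough sketch of the wrapper idea, under stated assumptions: extended-struct-info, Foo/fields, and field-names are hypothetical names, the six-element list is written out by hand, and the field list is stored in the same reverse order as the accessor list.

```racket
#lang racket/base
(require (for-syntax racket/base racket/struct-info))

(struct Foo (x y))

(begin-for-syntax
  ;; Hypothetical wrapper: `base` is the usual six-element list and
  ;; `fields` is the extra list of field identifiers.
  (struct extended-struct-info (base fields)
    #:property prop:struct-info
    (lambda (self) (extended-struct-info-base self))))

;; A second transformer binding for Foo that also records field names.
(define-syntax Foo/fields
  (extended-struct-info
   (list #'struct:Foo #'Foo #'Foo? (list #'Foo-y #'Foo-x) (list #f #f) #t)
   (list #'y #'x)))

;; A client macro: use the extra information when present, and signal a
;; syntax error otherwise.
(define-syntax (field-names stx)
  (syntax-case stx ()
    [(_ id)
     (let ([v (syntax-local-value #'id (lambda () #f))])
       (cond
         [(extended-struct-info? v)
          #`'#,(map syntax-e (extended-struct-info-fields v))]
         [(struct-info? v)
          (raise-syntax-error #f "structure does not record field names" stx #'id)]
         [else
          (raise-syntax-error #f "not a structure type name" stx #'id)]))]))

(field-names Foo/fields)
```

Existing clients that call extract-struct-info on Foo/fields still receive the six-element list, which is where the backwards compatibility comes from.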

Sam

On Wed, May 14, 2014 at 1:08 PM, J. Ian Johnson <ianj at ccs.neu.edu> wrote:
> I'm talking about the transformer binding. Specifically the result of extract-struct-info that both struct-copy and match use in their expansion, which produces a six-element list that satisfies struct-info?.
> This six-element list has
> optional identifier bound to type descriptor
> optional identifier bound to type constructor
> optional identifier bound to type predicate
> list of field accessor identifiers (optional last value of #f)
> list of optional field mutator identifiers
> optional super type identifier
>
> I want a 7th element that is the list of field identifiers themselves as given to the struct form or the define-signature form (which I believe produces the expected struct-info transformer binding, so I don't have to actually change define-signature). The expectation is that the field names and the field accessor names correspond 1-to-1.
>
> 7 is not 6, thus backwards-incompatible.
> I know many programs use positional accessors rather than match on the whole structure of the list, so I imagine adding a 7th element is not very intrusive.
> -Ian
> ----- Original Message -----
> From: "Sam Tobin-Hochstadt" <samth at cs.indiana.edu>
> To: "J. Ian Johnson" <ianj at ccs.neu.edu>
> Cc: "users" <users at racket-lang.org>
> Sent: Wednesday, May 14, 2014 12:42:56 PM GMT -05:00 US/Canada Eastern
> Subject: Re: [racket] Struct fields in struct-info + match enhancements
>
> On Wed, May 14, 2014 at 12:35 PM, J. Ian Johnson <ianj at ccs.neu.edu> wrote:
>> tl;dr if you use struct-info in your programs, I might break them. Please continue reading.
>>
>> I had a PR a while ago suggesting a change to struct-copy due to its unhygienic nature with fields. It did not go through since there wasn't enough information in the struct-info to separate the struct-name and the field-name. Because struct-info does not have a procedural-only interface, changing it to instead or also hold the individual field identifiers would be backwards incompatible. However, I also expect that struct-info manipulation outside of core Racket is rare.
>>
> >> Is there anyone out there that would be affected by this change that would be unwilling to make slight modifications to support the new struct-info?
>> I ask not because of struct-copy itself, but for an additional enhancement to racket/match: named field selection from structs instead of positional only.
>> I'm getting bitten by pervasive refactoring woes whenever I add fields to structs. All of my match patterns must change to have an extra _ somewhere.
>
> I don't understand why this would require a backwards-incompatible
> change to struct-info.
>
> Also, this discussion is confusing because it's not clear whether you
> mean the dynamic value produced by the `struct-info` procedure, or the
> structure type transformer binding. I think you mean the latter, in
> which case I expect you could do what you want by implementing
> `prop:struct-info` appropriately with an extended structure, and
> handling existing values (such as six-element lists) appropriately
> with defaults.
>
> Sam


Posted on the users mailing list.