[racket-dev] submodules

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Thu Mar 8 17:23:19 EST 2012

Nice -- just what I wished for last spring. -- Matthias



On Mar 8, 2012, at 3:39 PM, Jay McCarthy wrote:

> One more thing: I anticipate that the 'main' module in my "test.rkt"
> will become "raco test", and I would extend it so that you can give it
> a directory, in which case it will require all the "test" modules that
> are present. You could also have a "Test" button in DrRacket.
> 
> Jay
> 
> On Thu, Mar 8, 2012 at 1:29 PM, Jay McCarthy <jay.mccarthy at gmail.com> wrote:
>> I've made a test collecting macro.
>> 
>> https://gist.github.com/2003201
>> 
>> "test.rkt" gives you 'define-test'
>> 
>> (define-test id e ...)
>> 
>> will create a module named 'test' that can see your local bindings
>> (like module* #f) at the end of the module that contains all the code
>> in "e ...". In addition, you get the (id e ...) form that adds the
>> given expressions to the test module.
>> 
>> I expect most uses will look like:
>> 
>> (require racket/test)
>> (define-test test (require rackunit))
>> 
>> ....
>> 
>> (define f ...)
>> (test ... f tests ...)
>> 
>> ....
>> 
>> (define g ...)
>> (test ... g tests ...)
>> 
>> Jay
>> 
>> On Wed, Mar 7, 2012 at 12:07 PM, Jay McCarthy <jay.mccarthy at gmail.com> wrote:
>>> I love it---especially for the test collecting macro.
>>> 
>>> I will try to write it and report back.
>>> 
>>> Jay
>>> 
>>> On Wed, Mar 7, 2012 at 10:14 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
>>>> I've added "submodules" to a version of Racket labeled v5.2.900.1
>>>> that's here:
>>>> 
>>>>  https://github.com/mflatt/submodules
>>>> 
>>>> After we've sorted out any controversial parts of the design and after
>>>> the documentation is complete, then I'll be ready to merge to the main
>>>> Racket repo.
>>>> 
>>>> 
>>>> Why Submodules?
>>>> ---------------
>>>> 
>>>> Using submodules, you can abstract (via macros) over a set of modules
>>>> that have distinct dynamic extents and/or bytecode load times. You can
>>>> also get a private communication channel (via binding) from a module
>>>> to its submodules.
>>>> 
>>>> Some uses:
>>>> 
>>>>  * When you run a module via `racket', if it has a `main' submodule,
>>>>   then the `main' submodule is instantiated --- but not the `main'
>>>>   submodules of any other modules used by the starting module.  This
>>>>   protocol is implemented for `racket', but not yet for DrRacket.
>>>> 
>>>>  * Languages with separate read-time, configure-time, and run-time
>>>>   code can be defined in a single module, with the configure-time and
>>>>   read-time code in submodules.
>>>> 
>>>>  * A testing macro could collect test cases and put them into a
>>>>   separate `test' submodule, so that testing code is not run or even
>>>>   loaded when the module is used normally.
>>>> 
>>>>  * An improved `scribble/srcdoc' can expose documentation through a
>>>>   submodule instead of through re-expansion hacks.
>>>> 
>>>>  * If you want to export certain of a module's bindings only when
>>>>   explicitly requested (i.e., not when the module is `require'd
>>>>   normally), you can export the bindings from a submodule instead.
>>>> 
>>>> When I first started talking about these problems last summer, I
>>>> called the solution sketch "facets" or "modulets", but the design
>>>> has evolved into "submodules".
>>>> 
>>>> 
>>>> Nesting `module'
>>>> ----------------
>>>> 
>>>> Given the term "submodule", the first thing that you're likely to try
>>>> will work as expected:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (module zoo racket/base
>>>>    (provide tiger)
>>>>    (define tiger "Tony"))
>>>> 
>>>>  (require 'zoo)
>>>> 
>>>>  tiger
>>>> 
>>>> Within `module', a module path of the form `(quote id)' refers to the
>>>> submodule `id', if any. If there's no such submodule, then `(quote
>>>> id)' refers to an interactively declared module, as before.
>>>> 
>>>> Submodules can be nested. To access a submodule from outside the
>>>> enclosing module, use the `submod' module path form:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (module zoo racket/base
>>>>    (module monkey-house racket/base
>>>>      (provide monkey)
>>>>      (define monkey "Curious George"))
>>>>    (displayln "Ticket, please"))
>>>> 
>>>>  (require (submod 'zoo monkey-house))
>>>> 
>>>>  monkey
>>>> 
>>>> The 'zoo module path above is really a shorthand for `(submod "."
>>>> zoo)', where "." means the enclosing module and `zoo' is its
>>>> submodule. You could write `(submod "." zoo monkey-house)' in
>>>> place of `(submod 'zoo monkey-house)'.
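>>>> 
>>>> As a minimal sketch of that equivalence, assuming the path syntax
>>>> described in this message (a released Racket may differ in details),
>>>> the earlier example can spell the path out in full:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (module zoo racket/base
>>>>    (module monkey-house racket/base
>>>>      (provide monkey)
>>>>      (define monkey "Curious George")))
>>>> 
>>>>  ;; Long form of (require (submod 'zoo monkey-house)):
>>>>  ;; "." is the enclosing module, then the nested submodule names.
>>>>  (require (submod "." zoo monkey-house))
>>>> 
>>>>  monkey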
>>>> 
>>>> Note that `zoo' and `monkey-house' are not bound as identifiers in the
>>>> module above --- just like `module' doesn't add any top-level
>>>> bindings. The namespace of modules remains separate from the namespace
>>>> of variables and syntax. Along those lines, submodules are not
>>>> explicitly exported, because they are implicitly public.
>>>> 
>>>> When you run the above program, "Ticket, please" is *not* displayed.
>>>> Unless a module `require's a submodule, instantiating the module does
>>>> not instantiate the submodule. Similarly, instantiating a submodule
>>>> does not imply instantiating its enclosing module.
>>>> 
>>>> Furthermore, if you compile the above example to bytecode and run it,
>>>> the bytecode for `zoo' is not loaded. Only the bytecode for the
>>>> top-level module and `monkey-house' is loaded.
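>>>> 
>>>> A small sketch of the instantiation rule, using two hypothetical
>>>> submodules named `quiet' and `loud' (the names are only for
>>>> illustration). Under the rule above, only the submodule that is
>>>> `require'd should run its body:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (module quiet racket/base
>>>>    ;; Declared but never required, so this body is not run.
>>>>    (displayln "quiet instantiated"))
>>>> 
>>>>  (module loud racket/base
>>>>    ;; Required below, so this body runs when the program starts.
>>>>    (displayln "loud instantiated"))
>>>> 
>>>>  (require 'loud)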
>>>> 
>>>> 
>>>> Nesting `module*'
>>>> -----------------
>>>> 
>>>> Submodules declared with `module' are declared locally while expanding
>>>> a module body, which means that the submodules can be `require'd
>>>> afterward by the enclosing module. This ordering means, however, that
>>>> the submodule cannot `require' the enclosing module. The submodule
>>>> also sees no bindings of the enclosing module; it starts with an empty
>>>> lexical context.
>>>> 
>>>> The `module*' form is like `module', but it can be used only for
>>>> submodules, and it defers the submodule's expansion until after the
>>>> enclosing module is otherwise expanded. As a result, a submodule using
>>>> `module*' can `require' its enclosing module, while the enclosing
>>>> module cannot require the submodule.
>>>> 
>>>> A ".." in a `submod' form goes up the submodule hierarchy, so that
>>>> `(submod "." "..")' is a reference to the enclosing module:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (module aquarium racket/base
>>>>    (provide fish)
>>>>    (define fish '(1 2))
>>>> 
>>>>    (module* book racket/base
>>>>      (require (submod "." ".."))
>>>>      (append fish '(red blue))))
>>>> 
>>>>  (require (submod 'aquarium book))
>>>> 
>>>> Instead of `require'ing its enclosing module, a `module*' form can use
>>>> `#f' as its language, in which case its lexical context starts with
>>>> all of the bindings of the enclosing module (implicitly imported)
>>>> instead of with an empty lexical context. As a result, the submodule
>>>> can access bindings of the enclosing module that are not exported:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (module aquarium racket/base
>>>>    (define fish '(1 2))
>>>> 
>>>>    (module* book #f
>>>>      (append fish '(red blue))))
>>>> 
>>>>  (require (submod 'aquarium book))
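>>>> 
>>>> The same mechanism seems to cover the "export only when explicitly
>>>> requested" use listed earlier. A sketch, assuming a `module*' with
>>>> `#f' may `provide' a binding of its enclosing module; `secret-fish'
>>>> and `backstage' are hypothetical names:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (module aquarium racket/base
>>>>    (provide fish)
>>>>    (define fish '(1 2))
>>>>    (define secret-fish '(3 4))   ; not exported by `aquarium' itself
>>>> 
>>>>    (module* backstage #f
>>>>      ;; Re-export a private binding, visible here because of `#f'.
>>>>      (provide secret-fish)))
>>>> 
>>>>  (require 'aquarium                       ; gets only `fish'
>>>>           (submod 'aquarium backstage))   ; gets `secret-fish'
>>>> 
>>>>  (append fish secret-fish)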
>>>> 
>>>> A common use of `module*' is likely to be with `main', since `racket'
>>>> will load a `main' submodule (after `require'ing its enclosing module)
>>>> for a module named on its command line. For example, if you run this
>>>> program via `racket':
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (provide fish)
>>>>  (define fish '(2 1))
>>>> 
>>>>  (module* main #f
>>>>    (unless (apply < fish)
>>>>      (error "fish are not sorted")))
>>>> 
>>>> then you get a "fish are not sorted" error, but if you `require' the
>>>> file into another program, you get a `fish' binding with no error.
>>>> 
>>>> 
>>>> The new `#lang'
>>>> ---------------
>>>> 
>>>> The `#lang' reader form was previously defined as a shorthand for
>>>> `#reader' where the name after the `#lang' is mangled by adding
>>>> "/lang/reader".  With submodules, `#lang' first tries using the name
>>>> as-is and checking for a `reader' submodule; if it is found, then the
>>>> submodule is used instead of mangling the name with "/lang/reader",
>>>> otherwise it falls back to the old behavior.
>>>> 
>>>> So, if you want to define an `ocean' language that is `racket/base'
>>>> plus `fish', it's enough to install the following module as "main.rkt"
>>>> in an "ocean" collection:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  (provide (all-from-out racket/base)
>>>>           fish)
>>>>  (define fish '(1 2 3))
>>>> 
>>>>  (module reader syntax/module-reader
>>>>    #:language 'ocean)
>>>> 
>>>> 
>>>> Backwards Incompatibility
>>>> -------------------------
>>>> 
>>>> The biggest incompatibility is that `resolved-module-path-name' can
>>>> return a list when the module path refers to a submodule, in addition
>>>> to the old path and symbol results. Most code that calls
>>>> `resolved-module-path-name' will have to be updated.
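>>>> 
>>>> A sketch of what such an update might look like. It assumes the list
>>>> result is a base name (path or symbol) followed by one or more
>>>> submodule names, and that `make-resolved-module-path' accepts the
>>>> same shape; this message does not spell either out. The helper
>>>> `resolved-name-kind' is hypothetical:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  ;; Classify the result of `resolved-module-path-name'.
>>>>  (define (resolved-name-kind rmp)
>>>>    (define name (resolved-module-path-name rmp))
>>>>    (cond
>>>>      [(path? name)   'file]        ; old result: a file path
>>>>      [(symbol? name) 'symbol]      ; old result: a symbolic name
>>>>      [(pair? name)   'submodule]   ; new result: a list
>>>>      [else (error 'resolved-name-kind "unexpected name: ~e" name)]))
>>>> 
>>>>  (resolved-name-kind (make-resolved-module-path 'zoo))
>>>>  (resolved-name-kind (make-resolved-module-path '(zoo monkey-house)))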
>>>> 
>>>> The `submod' form is a new primitive module-path form, so module name
>>>> resolvers also must be updated.  Finally, a load/use-compiled handler
>>>> must accept a list as the expected-module name, which usually
>>>> indicates that a submodule is being loaded; the list can start with
>>>> `#f' to indicate that the module should only be loaded if it can be
>>>> loaded independently from bytecode (i.e., without triggering the
>>>> declaration of any other submodule, which means not loading from
>>>> source). Furthermore, when a submodule is requested, no error should
>>>> be raised if the enclosing module is unavailable, which allows
>>>> speculative checking for submodule declarations.
>>>> 
>>>> The bytecode format has changed, and the `mod' structure type from
>>>> `compiler/zo-parse' has two new fields: one for "pre" submodules
>>>> (i.e., those declared with `module') and one for "post" submodules
>>>> (i.e., those declared with `module*'). Any code that uses
>>>> `compiler/zo-parse' will have to change.
>>>> 
>>>> If you compile a `module' form and it has submodules, then when you
>>>> write the bytecode, all of the modules are written together. If the
>>>> `module' is not inside a larger top-level sequence, then the printed
>>>> form starts with a table that can be used to find any individual
>>>> submodule, which is how independent loading of submodules works. If
>>>> you just `read' the table in, though, it returns a compiled-module
>>>> value that contains submodules, and `eval'ing the compiled module
>>>> declares all the submodules, too. This protocol makes lots of
>>>> `compile' and `eval' code work without modification. The
>>>> `get-module-code' function from `syntax/modcode', meanwhile, gives you
>>>> more control, along with functions like `module-compiled-submodules' to
>>>> get or adjust the submodule list in a compiled-module value.
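>>>> 
>>>> For illustration, a sketch that compiles a module and counts its
>>>> "pre" and "post" submodules. It assumes `module-compiled-submodules'
>>>> takes the compiled module plus a boolean that selects `module'
>>>> (true) or `module*' (false) declarations; this message does not give
>>>> the signature, so treat that as an assumption. The submodule name
>>>> `gift-shop' is hypothetical:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  ;; Compile, but do not declare, a module with two submodules.
>>>>  (define compiled
>>>>    (parameterize ([current-namespace (make-base-namespace)])
>>>>      (compile '(module zoo racket/base
>>>>                  (module monkey-house racket/base)
>>>>                  (module* gift-shop #f)))))
>>>> 
>>>>  ;; "pre" submodules: declared with `module'
>>>>  (length (module-compiled-submodules compiled #t))
>>>>  ;; "post" submodules: declared with `module*'
>>>>  (length (module-compiled-submodules compiled #f))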
>>>> 
>>>> 
>>>> Design Issues
>>>> -------------
>>>> 
>>>> The `submod' syntax --- especially "." and ".." --- is arbitrary. The
>>>> `submod' name isn't great, but I like it the best among the options
>>>> that I tried.  I'm not sure whether the association of "."  and ".."
>>>> to filesystem paths is helpfully mnemonic or unhelpfully
>>>> confusing. The handling of `quote' paths within a module is also
>>>> arbitrary, but it's intended to smooth the connection between the top
>>>> level and a module body.
>>>> 
>>>> Overloading `module' for submodules is questionable; again, though, I
>>>> like how it roughly matches interactive evaluation. For the
>>>> post-submodule form, then, `module*' seems like the obvious
>>>> choice.
>>>> 
>>>> As things stand, the ugly pattern `(module* main #f ...)'  would be
>>>> common. Probably we should have a macro that expands to `(module* main
>>>> #f ...)'. Should the macro be called `main'?
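>>>> 
>>>> One possible shape for that macro, as a sketch only: it assumes a
>>>> plain pattern macro may expand to `module*' at module level, and it
>>>> allows just one use per module, since each use declares a `main'
>>>> submodule:
>>>> 
>>>>  #lang racket/base
>>>> 
>>>>  ;; Hypothetical `main' form: wraps its body in (module* main #f ...).
>>>>  (define-syntax-rule (main body ...)
>>>>    (module* main #f body ...))
>>>> 
>>>>  (provide fish)
>>>>  (define fish '(1 2))
>>>> 
>>>>  ;; Checked only when this file is run directly via `racket'.
>>>>  (main
>>>>   (unless (apply < fish)
>>>>     (error "fish are not sorted")))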
>>>> 
>>>> I haven't tried to build a test-collecting macro or a
>>>> `scribble/srcdoc' replacement. I think they will work with this
>>>> submodule design, but I can't be sure until we try it.
>>>> 
>>>> _________________________
>>>>  Racket Developers list:
>>>>  http://lists.racket-lang.org/dev
>>> 
>>> 
>>> 
>>> --
>>> Jay McCarthy <jay at cs.byu.edu>
>>> Assistant Professor / Brigham Young University
>>> http://faculty.cs.byu.edu/~jay
>>> 
>>> "The glory of God is Intelligence" - D&C 93



Posted on the dev mailing list.