[racket-dev] submodules
I love it---especially for the test collecting macro.
I will try to write it and report back.
Jay
On Wed, Mar 7, 2012 at 10:14 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> I've added "submodules" to a version of Racket labeled v5.2.900.1
> that's here:
>
> https://github.com/mflatt/submodules
>
> After we've sorted out any controversial parts of the design and after
> the documentation is complete, then I'll be ready to merge to the main
> Racket repo.
>
>
> Why Submodules?
> ---------------
>
> Using submodules, you can abstract (via macros) over a set of modules
> that have distinct dynamic extents and/or bytecode load times. You can
> also get a private communication channel (via binding) from a module
> to its submodules.
>
> Some uses:
>
> * When you run a module via `racket', if it has a `main' submodule,
>   then the `main' submodule is instantiated --- but not the `main'
>   submodules of any other modules used by the starting module. This
>   protocol is implemented for `racket', but not yet for DrRacket.
>
> * Languages with separate read-time, configure-time, and run-time
>   code can be defined in a single module, with the configure-time and
>   read-time code in submodules.
>
> * A testing macro could collect test cases and put them into a
>   separate `test' submodule, so that testing code is not run or even
>   loaded when the module is used normally.
>
> * An improved `scribble/srcdoc' can expose documentation through a
>   submodule instead of through re-expansion hacks.
>
> * If you want to export some of a module's bindings only when
>   explicitly requested (i.e., not when the module is `require'd
>   normally), you can export the bindings from a submodule instead.
>
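The last use in the list above (bindings exported only on explicit request) might look roughly like the sketch below. It relies on the `module*' form with `#f' as its language, which is described later in this message. The names `aquarium', `backstage', `hidden-fish', and `all-fish' are made up for illustration, and the sketch assumes that `provide' inside such a submodule can export a binding of the enclosing module.

```racket
#lang racket/base

(module aquarium racket/base
  (provide fish)
  (define fish '(1 2))
  ;; Not exported by `aquarium' itself
  (define hidden-fish '(3 4))

  ;; Hypothetical submodule that exports the private binding
  (module* backstage #f
    (provide hidden-fish)))

;; A normal `require' brings in only `fish'
(require 'aquarium)
;; Explicitly requesting the submodule also brings in `hidden-fish'
(require (submod 'aquarium backstage))

(define all-fish (append fish hidden-fish))
all-fish
```

A client that only does `(require 'aquarium)' never sees `hidden-fish'; it has to name the submodule to get it.
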
> When I first started talking about these problems last summer, I
> called the solution sketch "facets" or "modulets", but the design
> has evolved into "submodules".
>
>
> Nesting `module'
> ----------------
>
> Given the term "submodule", the first thing that you're likely to try
> will work as expected:
>
>   #lang racket/base
>
>   (module zoo racket/base
>     (provide tiger)
>     (define tiger "Tony"))
>
>   (require 'zoo)
>
>   tiger
>
> Within `module', a module path of the form `(quote id)' refers to the
> submodule `id', if any. If there's no such submodule, then `(quote
> id)' refers to an interactively declared module, as before.
>
> Submodules can be nested. To access a submodule from outside the
> enclosing module, use the `submod' module path form:
>
>   #lang racket/base
>
>   (module zoo racket/base
>     (module monkey-house racket/base
>       (provide monkey)
>       (define monkey "Curious George"))
>     (displayln "Ticket, please"))
>
>   (require (submod 'zoo monkey-house))
>
>   monkey
>
> The 'zoo module path above is really a shorthand for `(submod "."
> zoo)', where "." means the enclosing module and `zoo' is its
> submodule. You could write `(submod "." zoo monkey-house)' in
> place of `(submod 'zoo monkey-house)'.
>
> Note that `zoo' and `monkey-house' are not bound as identifiers in the
> module above --- just like `module' doesn't add any top-level
> bindings. The namespace of modules remains separate from the namespace
> of variables and syntax. Along those lines, submodules are not
> explicitly exported, because they are implicitly public.
>
> When you run the above program, "Ticket, please" is *not* displayed.
> Unless a module `require's a submodule, instantiating the module does
> not instantiate the submodule. Similarly, instantiating a submodule
> does not imply instantiating its enclosing module.
>
> Furthermore, if you compile the above example to bytecode and run it,
> the bytecode for `zoo' is not loaded. Only the bytecode for the
> top-level module and `monkey-house' is loaded.
>
>
> Nesting `module*'
> -----------------
>
> Submodules declared with `module' are declared locally while expanding
> a module body, which means that the submodules can be `require'd
> afterward by the enclosing module. This ordering means, however, that
> the submodule cannot `require' the enclosing module. The submodule
> also sees no bindings of the enclosing module; it starts with an empty
> lexical context.
>
> The `module*' form is like `module', but it can be used only for
> submodules, and it defers the submodule's expansion until after the
> enclosing module is otherwise expanded. As a result, a submodule using
> `module*' can `require' its enclosing module, while the enclosing
> module cannot require the submodule.
>
> A ".." in a `submod' form goes up the submodule hierarchy, so that
> `(submod "." "..")' is a reference to the enclosing module:
>
>   #lang racket/base
>
>   (module aquarium racket/base
>     (provide fish)
>     (define fish '(1 2))
>
>     (module* book racket/base
>       (require (submod "." ".."))
>       (append fish '(red blue))))
>
>   (require (submod 'aquarium book))
>
> Instead of `require'ing its enclosing module, a `module*' form can use
> `#f' as its language, in which case its lexical context starts with
> all of the bindings of the enclosing module (implicitly imported)
> instead of with an empty lexical context. As a result, the submodule
> can access bindings of the enclosing module that are not exported:
>
>   #lang racket/base
>
>   (module aquarium racket/base
>     (define fish '(1 2))
>
>     (module* book #f
>       (append fish '(red blue))))
>
>   (require (submod 'aquarium book))
>
> A common use of `module*' is likely to be with `main', since `racket'
> will load a `main' submodule (after `require'ing its enclosing module)
> for a module named on its command line. For example, if you run this
> program via `racket':
>
>   #lang racket/base
>
>   (provide fish)
>   (define fish '(2 1))
>
>   (module* main #f
>     (unless (apply < fish)
>       (error "fish are not sorted")))
>
> then you get a "fish are not sorted" error, but if you `require' the
> file into another program, you get a `fish' binding with no error.
>
>
> The new `#lang'
> ---------------
>
> The `#lang' reader form was previously defined as a shorthand for
> `#reader' where the name after the `#lang' is mangled by adding
> "/lang/reader". With submodules, `#lang' first tries using the name
> as-is and checking for a `reader' submodule. If one is found, then the
> submodule is used instead of mangling the name with "/lang/reader";
> otherwise, `#lang' falls back to the old behavior.
>
> So, if you want to define an `ocean' language that is `racket/base'
> plus `fish', it's enough to install the following module as "main.rkt"
> in an "ocean" collection:
>
>   #lang racket/base
>
>   (provide (all-from-out racket/base)
>            fish)
>   (define fish '(1 2 3))
>
>   (module reader syntax/module-reader
>     #:language 'ocean)
>
>
> Backwards Incompatibility
> -------------------------
>
> The biggest incompatibility is that `resolved-module-path-name' can
> return a list when the module path refers to a submodule, in addition
> to the old path and symbol results. Most code that calls
> `resolved-module-path-name' will have to be updated.
>
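Code that calls `resolved-module-path-name' would need a third case for the list result. Here is a minimal sketch of such a dispatch. It assumes the list has the shape (base-name submodule-name ...), with the base being a path or symbol, and that `make-resolved-module-path' accepts such a list. The helper `describe-module-name' and its result tags are hypothetical.

```racket
#lang racket/base

;; Hypothetical helper: classify the name of a resolved module path.
;; Assumes the list form is (cons base submodule-names).
(define (describe-module-name rmp)
  (define name (resolved-module-path-name rmp))
  (cond
    [(path? name) (list 'file name)]
    [(symbol? name) (list 'declared name)]
    ;; New case: the name refers to a submodule
    [(pair? name) (list 'submodule (car name) (cdr name))]))

(define top-result
  (describe-module-name (make-resolved-module-path 'zoo)))
(define sub-result
  (describe-module-name (make-resolved-module-path '(zoo monkey-house))))

top-result
sub-result
```

Existing code that checks only `path?' and `symbol?' would reach neither branch for a submodule, which is why most callers need an update.
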
> The `submod' form is a new primitive module-path form, so module name
> resolvers also must be updated. Finally, a load/use-compiled handler
> must accept a list as the expected-module name, which usually
> indicates that a submodule is being loaded; the list can start with
> `#f' to indicate that the module should only be loaded if it can be
> loaded independently from bytecode (i.e., without triggering the
> declaration of any other submodule, which means not loading from
> source). Furthermore, when a submodule is requested, no error should
> be raised if the enclosing module is unavailable, which allows
> speculative checking for submodule declarations.
>
> The bytecode format has changed, and the `mod' structure type from
> `compiler/zo-parse' has two new fields: one for "pre" submodules
> (i.e., those declared with `module') and one for "post" submodules
> (i.e., those declared with `module*'). Any code that uses
> `compiler/zo-parse' will have to change.
>
> If you compile a `module' form and it has submodules, then when you
> write the bytecode, all of the modules are written together. If the
> `module' is not inside a larger top-level sequence, then the printed
> form starts with a table that can be used to find any individual
> submodule, which is how independent loading of submodules works. If
> you just `read' the table in, though, it returns a compiled-module
> value that contains submodules, and `eval'ing the compiled module
> declares all the submodules, too. This protocol makes lots of
> `compile' and `eval' code work without modification. The
> `get-module-code' function from `syntax/modcode', meanwhile, gives you
> more control, along with functions like `module-compiled-submodules' to
> get or adjust the submodule list in a compiled-module value.
>
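As a rough illustration of the "pre" and "post" split in a compiled-module value, the sketch below compiles a module with one submodule of each kind and lists their names. It assumes `module-compiled-submodules' takes the compiled module plus a boolean selecting the `module' (pre) or `module*' (post) submodules; the signature in this prototype may differ. The names `zoo', `monkey-house', and `gift-shop' are only for illustration.

```racket
#lang racket/base

;; Compile a module that has one "pre" and one "post" submodule.
;; A fresh base namespace supplies the `module' binding for `compile'.
(define compiled
  (parameterize ([current-namespace (make-base-namespace)])
    (compile '(module zoo racket/base
                (module monkey-house racket/base)
                (module* gift-shop #f)))))

;; Assumed signature: (module-compiled-submodules compiled-module non-star?)
(define pre-names
  (map module-compiled-name (module-compiled-submodules compiled #t)))
(define post-names
  (map module-compiled-name (module-compiled-submodules compiled #f)))

pre-names
post-names
```
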
>
> Design Issues
> -------------
>
> The `submod' syntax --- especially "." and ".." --- is arbitrary. The
> `submod' name isn't great, but I like it the best among the options
> that I tried. I'm not sure whether the association of "." and ".."
> to filesystem paths is helpfully mnemonic or unhelpfully
> confusing. The handling of `quote' paths within a module is also
> arbitrary, but it's intended to smooth the connection between the top
> level and a module body.
>
> Overloading `module' for submodules is questionable; again, though, I
> like how it roughly matches interactive evaluation. For the
> post-submodule form, then, `module*' seems like the obvious
> choice.
>
> As things stand, the ugly pattern `(module* main #f ...)' would be
> common. Probably we should have a macro that expands to `(module* main
> #f ...)'. Should the macro be called `main'?
>
> I haven't tried to build a test-collecting macro or a
> `scribble/srcdoc' replacement. I think they will work with this
> submodule design, but I can't be sure until we try it.
>
--
Jay McCarthy <jay at cs.byu.edu>
Assistant Professor / Brigham Young University
http://faculty.cs.byu.edu/~jay
"The glory of God is Intelligence" - D&C 93