[racket] how to improve performance by streamlining imports?

From: Matthew Butterick (mb at mbtype.com)
Date: Fri Feb 21 14:38:11 EST 2014

> Rewriting to avoid a dependency does work, but usually only if you can
> avoid some large subsystem (such as the contract system) completely.


Contracts are a good example. Here's the vexing pattern that I've run into frequently:

first.rkt
(require racket/contract)
(provide (contract-out [first-function (foo? . -> . bar?)]))
(define first-function ...)

second.rkt
(require racket/contract)
(provide (contract-out [second-function (bar? . -> . zam?)]))
(define second-function ...)

main.rkt
(require "first.rkt" "second.rkt")
(define hoping-for-a-foo (get-unreliable-input))
(second-function (first-function hoping-for-a-foo))

The output contract on first-function guarantees a bar-thing. But second-function still runs the bar? contract on its input, which I think is needless.

Moreover, even if the bar? contract is cheap, I still have to incur the overhead of loading racket/contract to be able to use second-function at all, even though I don't need it in this context.
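
The repeated check described above can be observed with a small sketch. Everything concrete here is an assumption made for illustration: the three files are folded into submodules of a single file, foo? and zam? are stood in for by string?, bar? by symbol?, and bar-checks is a hypothetical counter that records how often bar? runs.

```racket
#lang racket/base

;; Hypothetical predicates; bar? counts its own invocations.
(module preds racket/base
  (provide foo? bar? zam? bar-checks)
  (define count 0)
  (define (bar-checks) count)
  (define (foo? v) (string? v))
  (define (bar? v)
    (set! count (add1 count))
    (symbol? v))
  (define (zam? v) (string? v)))

;; Stands in for first.rkt
(module first-mod racket/base
  (require racket/contract (submod ".." preds))
  (provide (contract-out [first-function (foo? . -> . bar?)]))
  (define (first-function s) (string->symbol s)))

;; Stands in for second.rkt
(module second-mod racket/base
  (require racket/contract (submod ".." preds))
  (provide (contract-out [second-function (bar? . -> . zam?)]))
  (define (second-function b) (symbol->string b)))

;; Stands in for main.rkt
(require (submod "." preds)
         (submod "." first-mod)
         (submod "." second-mod))

(define result (second-function (first-function "hello")))
(printf "result: ~s, bar? ran ~a time(s)\n" result (bar-checks))
```

The counter shows bar? running once for the range of first-function and again for the domain of second-function, since each contract boundary checks independently.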

So I wonder if it makes sense to revise the public interface as follows, to allow circumvention of contracts in situations where you have confidence in the data (because, e.g., it's being validated elsewhere):

second-fast.rkt
(provide second-function)
(define second-function ...)

second.rkt
(require racket/contract "second-fast.rkt")
(provide (contract-out [second-function (bar? . -> . zam?)])
         (rename-out [second-function second-function-without-contract]))
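
A minimal runnable sketch of this split, under stated assumptions: the two files are written as submodules of one file so the example is self-contained, bar? is stood in for by symbol? and zam? by string?, and the fast: and checked: prefixes are hypothetical names used only to keep the two imports apart.

```racket
#lang racket/base

;; Stands in for second-fast.rkt: no dependency on racket/contract.
(module second-fast racket/base
  (provide second-function)
  (define (second-function b) (symbol->string b)))

;; Stands in for second.rkt: re-exports the same function, once with a
;; contract and once without.
(module second racket/base
  (require racket/contract (submod ".." second-fast))
  (provide (contract-out [second-function (symbol? . -> . string?)])
           (rename-out [second-function second-function-without-contract])))

;; A caller that trusts its data would require only the fast module.
(require (prefix-in fast: (submod "." second-fast)))
;; A caller that wants checking goes through the contract boundary.
(require (prefix-in checked: (submod "." second)))

;; Hypothetical helper: does calling thunk raise a contract error?
(define (contract-error? thunk)
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (thunk)
    #f))

(fast:second-function 'zam)
(checked:second-function 'zam)
(contract-error? (lambda () (checked:second-function "not a symbol")))
```

Because this demonstration loads both submodules, it does not itself avoid racket/contract. With separate files, only a client that requires second-fast.rkt directly would skip that dependency; requiring second.rkt for second-function-without-contract still loads the contract system.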





> If it happens that you're creating a fresh namespace for every
> compilation, then you can probably decrease compilation time by using a
> single namespace or by attaching modules to the compilation namespace.
> My guess is that you're not using separate namespaces, though.

Yes, I am using separate namespaces to render the templates (because I am using eval).
Yes, I am using namespace-attach-module to speed that up, per an earlier suggestion from Robby. 
That was the biggest performance improvement so far. Like so:

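(require [list of modules])
(define original-ns (current-namespace))

(parameterize ([current-namespace (make-base-empty-namespace)])
  ;; namespace-attach-module takes one module name per call,
  ;; so this line is repeated for each module in the list
  (namespace-attach-module original-ns [each module in the list])
  (eval [the file]))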
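
A runnable sketch of the same idea, with several assumptions: racket/list stands in for the list of modules, a quoted expression stands in for the file, eval-in-fresh-namespace is a hypothetical helper name, and a namespace anchor replaces (current-namespace) so the source namespace is the enclosing module's own regardless of how the file is run.

```racket
#lang racket/base

;; racket/list is an assumed stand-in for "[list of modules]".
(require racket/list)

;; The anchor gives access to the registry in which this module and
;; its imports are already instantiated.
(define-namespace-anchor anchor)
(define original-ns (namespace-anchor->empty-namespace anchor))

;; Evaluate expr in a fresh namespace that reuses the existing
;; instance of racket/list rather than loading it again.
(define (eval-in-fresh-namespace expr)
  (parameterize ([current-namespace (make-base-empty-namespace)])
    (namespace-attach-module original-ns 'racket/list)
    ;; Attaching only shares the instance; the bindings still have to
    ;; be required into the new namespace before eval can see them.
    (namespace-require 'racket/base)
    (namespace-require 'racket/list)
    (eval expr)))

(define result (eval-in-fresh-namespace '(first (list 1 2 3))))
```

Each call still creates a new namespace, so top-level definitions from one evaluation do not leak into the next, but the attached module is instantiated only once.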





> Can you provide a small example that illustrates the problem? Seeing a
> concrete example might trigger ideas that wouldn't come to mind
> otherwise.

I'm thinking that getting my #lang to run faster should be the main target, because every file being rendered inside eval uses the #lang.

I'll see if I can extract a characteristic example.







On Feb 21, 2014, at 9:06 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:

> I don't think that an importer-in-chief module is likely to help, and I
> doubt that smaller versus larger modules will matter much.
> 
> Rewriting to avoid a dependency does work, but usually only if you can
> avoid some large subsystem (such as the contract system) completely.
> 
> I doubt that `include` will help.
> 
> If it happens that you're creating a fresh namespace for every
> compilation, then you can probably decrease compilation time by using a
> single namespace or by attaching modules to the compilation namespace.
> My guess is that you're not using separate namespaces, though.
> 
> Can you provide a small example that illustrates the problem? Seeing a
> concrete example might trigger ideas that wouldn't come to mind
> otherwise.
> 
> At Thu, 20 Feb 2014 19:40:50 -0800, Matthew Butterick wrote:
>> Are there any preferred practices in terms of optimizing one's importing 
>> technique for the preferences of the Racket compiler? My project involves a 
>> lot of real-time recompiling of code (= dynamic evaluation of web templates), 
>> so the costs of the initial imports on each recompile have become the gating 
>> factor to better performance. Once everything is queued up, the data 
>> processing itself is pretty quick.
>> 
>> Sample questions, though maybe they are the wrong ones to ask:
>> 
>> Rather than having a many-to-many importing relationship among modules, is it 
>> better to designate one module as the importer-in-chief and have it transmit 
>> the imports to the other modules, using (provide (all-from-out ...))?
>> 
>> Should code be organized in smaller modules (more compiling events, but less 
>> code to compile) or larger modules? My impression based on using it is that 
>> the greatest expense is firing up the compiler, so more code per module is 
>> more efficient.
>> 
>> How terrible is it to rewrite code to avoid a dependency? It's always better 
>> for performance, but it also defeats the purpose of having modules with common 
>> definitions. This surfaces also when breaking a module into submodules. The 
>> submodule divisions may make semantic sense, but if the submodules rely on 
>> each other, then importing any one submodule drags them all along anyhow. 
>> Perhaps one is better off just leaving all the code in one module, unless you 
>> can really create independent submodules.
>> 
>> Is 'include' a legitimate way to get around this — by writing the code once, 
>> but allowing it to be compiled within multiple modules?
>> ____________________
>>  Racket Users list:
>>  http://lists.racket-lang.org/users



Posted on the users mailing list.