[plt-dev] Exponential expansion

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Wed Feb 17 09:13:06 EST 2010

I've changed some quadratic-time parts of internal-definition expansion
to linear-time. It's very difficult to eliminate all of the quadratic
parts of environment and syntax-object manipulation, though; I've tried
before.

The changes at least reduce the constant on the quadratic for
internal-definition expansion, so the linear part dominates for a while
longer in your `expression-module' example:

 Old:
  125   cpu time: 41 real time: 42 gc time: 0
  250   cpu time: 112 real time: 116 gc time: 25
  500   cpu time: 258 real time: 263 gc time: 19
  1000  cpu time: 891 real time: 905 gc time: 44
  2000  cpu time: 3032 real time: 3073 gc time: 64
  4000  cpu time: 12922 real time: 13476 gc time: 113

 New:
  125   cpu time: 74 real time: 74 gc time: 36
  250   cpu time: 46 real time: 46 gc time: 0
  500   cpu time: 101 real time: 105 gc time: 17
  1000  cpu time: 223 real time: 236 gc time: 49
  2000  cpu time: 456 real time: 474 gc time: 69
  4000  cpu time: 1089 real time: 1122 gc time: 123
  8000  cpu time: 2939 real time: 3015 gc time: 333
  16000 cpu time: 9006 real time: 9235 gc time: 709
  32000 cpu time: 31223 real time: 32117 gc time: 1292

At Tue, 16 Feb 2010 19:25:30 -0700, Matthew Flatt wrote:
> Yes, it's roughly quadratic, since each discovered internal `define'
> adds an item to an environment represented as a linked list (and
> recognizing the next `define' means walking through the environment to
> check for shadowing bindings). I'll try to improve it.
> 
> At Sun, 14 Feb 2010 19:41:21 -0500, Carl Eastlund wrote:
> > Internal definition contexts (e.g. the body of let or lambda) appear
> > to expand in exponential time based on the number of definitions.  The
> > exponential is very slow -- the exponent is probably not linear, but
> > based on some root of the number of definitions -- but given a large
> > enough program it does explode.
> > 
> > Here is code I used to benchmark definitions in a module versus
> > definitions in a let expression (also inside a module, in case it
> > makes a difference):
> > 
> > --------------------------------------------------
> > 
> > #lang scheme
> > 
> > (define (definition-module n)
> >   #`(module M scheme #,(definitions n)))
> > 
> > (define (expression-module n)
> >   #`(module M scheme #,(expressions n)))
> > 
> > (define (expressions n)
> >   #`(let () #,(definitions n) (void)))
> > 
> > (define (definitions n)
> >   #`(begin #,@(build-list n (lambda (i) #`(define #,(gensym) (void))))))
> > 
> > (define (show term)
> >   (for ([i (in-range 3)])
> >     (time (expand-syntax term))))
> > 
> > --------------------------------------------------
> > 
> > And here are the timing results of expanding modules and expressions
> > with 1,000 through 10,000 definitions, with three trials at each
> > increment of 1,000:
> > 
> > > (show (definition-module 1000))
> > cpu time: 409 real time: 416 gc time: 119
> > cpu time: 279 real time: 283 gc time: 0
> > cpu time: 366 real time: 374 gc time: 71
> > > (show (definition-module 2000))
> > cpu time: 690 real time: 702 gc time: 121
> > cpu time: 723 real time: 734 gc time: 147
> > cpu time: 723 real time: 738 gc time: 152
> > > (show (definition-module 3000))
> > cpu time: 1148 real time: 1171 gc time: 297
> > cpu time: 986 real time: 1001 gc time: 154
> > cpu time: 1138 real time: 1167 gc time: 278
> > > (show (definition-module 4000))
> > cpu time: 1430 real time: 1453 gc time: 319
> > cpu time: 1441 real time: 1467 gc time: 310
> > cpu time: 1682 real time: 1710 gc time: 531
> > > (show (definition-module 5000))
> > cpu time: 3089 real time: 3128 gc time: 1693
> > cpu time: 2920 real time: 2960 gc time: 1498
> > cpu time: 1899 real time: 1932 gc time: 498
> > > (show (definition-module 6000))
> > cpu time: 2220 real time: 2261 gc time: 514
> > cpu time: 2216 real time: 2256 gc time: 489
> > cpu time: 2182 real time: 2235 gc time: 461
> > > (show (definition-module 7000))
> > cpu time: 2702 real time: 2748 gc time: 670
> > cpu time: 5213 real time: 5301 gc time: 3207
> > cpu time: 2500 real time: 2558 gc time: 480
> > > (show (definition-module 8000))
> > cpu time: 3128 real time: 3182 gc time: 791
> > cpu time: 3042 real time: 3105 gc time: 721
> > cpu time: 3073 real time: 3128 gc time: 728
> > > (show (definition-module 9000))
> > cpu time: 5864 real time: 5937 gc time: 3329
> > cpu time: 3454 real time: 3518 gc time: 872
> > cpu time: 3267 real time: 3324 gc time: 690
> > > (show (definition-module 10000))
> > cpu time: 3884 real time: 3952 gc time: 1023
> > cpu time: 6234 real time: 6322 gc time: 3429
> > cpu time: 3685 real time: 3766 gc time: 800
> > 
> > > (show (expression-module 1000))
> > cpu time: 1152 real time: 1160 gc time: 0
> > cpu time: 1211 real time: 1223 gc time: 82
> > cpu time: 1166 real time: 1177 gc time: 0
> > > (show (expression-module 2000))
> > cpu time: 3895 real time: 3919 gc time: 108
> > cpu time: 4185 real time: 4215 gc time: 152
> > cpu time: 4430 real time: 4455 gc time: 157
> > > (show (expression-module 3000))
> > cpu time: 9354 real time: 9416 gc time: 185
> > cpu time: 8699 real time: 8750 gc time: 160
> > cpu time: 8860 real time: 8913 gc time: 149
> > > (show (expression-module 4000))
> > cpu time: 19506 real time: 19758 gc time: 303
> > cpu time: 18091 real time: 18254 gc time: 301
> > cpu time: 17669 real time: 17842 gc time: 175
> > > (show (expression-module 5000))
> > cpu time: 30103 real time: 30369 gc time: 1756
> > cpu time: 31301 real time: 31784 gc time: 1376
> > cpu time: 28922 real time: 29213 gc time: 279
> > > (show (expression-module 6000))
> > cpu time: 50572 real time: 50931 gc time: 401
> > cpu time: 42670 real time: 43028 gc time: 275
> > cpu time: 42491 real time: 42994 gc time: 320
> > > (show (expression-module 7000))
> > cpu time: 65161 real time: 65775 gc time: 427
> > cpu time: 72382 real time: 72834 gc time: 407
> > cpu time: 60682 real time: 61430 gc time: 463
> > > (show (expression-module 8000))
> > cpu time: 98473 real time: 99281 gc time: 3186
> > cpu time: 99311 real time: 100036 gc time: 372
> > cpu time: 106628 real time: 107104 gc time: 401
> > > (show (expression-module 9000))
> > cpu time: 120579 real time: 121512 gc time: 572
> > cpu time: 127369 real time: 128387 gc time: 418
> > cpu time: 125163 real time: 126157 gc time: 575
> > > (show (expression-module 10000))
> > cpu time: 157635 real time: 158839 gc time: 619
> > cpu time: 173710 real time: 174952 gc time: 3593
> > cpu time: 178207 real time: 179627 gc time: 553
> > 
> > The module-level definitions expand in linear time, taking 3-400
> > milliseconds per definition and completing a 10,000 definition module
> > in around 3 seconds.  The internal definitions start exploding
> > quickly, doubling every 1,000 definitions or so; this slows down
> > somewhat, but the final trials of 10,000 definitions take nearly 3
> > minutes instead of 3 seconds.
> > 
> > Practical examples of internal definitions, such as units, classes,
> > and (most important to me) Modular ACL2 components, explode even
> > faster than this.  This may be the internal definition explosion
> > multiplied by greater constant factors, or it may be that certain uses
> > of local-expand add another source of exponential explosion.  One way
> > or another, so long as internal definition contexts can take
> > exponential time, language features based on them can't do any better
> > in the general case.
> > 
> > Is this an inherent drawback of internal definition expansion?  Or is
> > this a bug we can eliminate?
> > 
> > Carl Eastlund
> > _________________________________________________
> >   For list-related administrative tasks:
> >   http://list.cs.brown.edu/mailman/listinfo/plt-dev
> _________________________________________________
>   For list-related administrative tasks:
>   http://list.cs.brown.edu/mailman/listinfo/plt-dev


Posted on the dev mailing list.