[racket] DSLs and complexity

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Fri Jun 21 13:35:21 EDT 2013

On Jun 21, 2013, at 8:26 AM, John Gateley wrote:

> Subject for discussion:
> http://firstround.com/article/The-one-cost-engineers-and-product-managers-dont-consider#
> Interesting sentence in the middle:
> Consider DSLs, abstractions and the attraction to being the one to build a framework that gets leveraged for years. 
> I think Racket is a different target: education vs. engineering (is this true?). As a software
> engineer, I really agree with the article. Complexity is almost always a terrible thing,
> whether it is a DSL, a complex implementation of a simple interface, or just the
> one additional thing requested by product management that didn't fit. 
> For Racket: are DSLs a source of complexity? Or would you argue that they reduce the
> complexity normally introduced with DSLs?


This article's claim concerning abstractions and DSLs is vacuously true, so it is also devoid of any information. I grant the maintenance-construction cost ratio; I teach to this slogan -- starting with How to Design Programs through HtD Components and HtD Systems. It is the guiding principle.

If we wanted to turn this person's essay into a well-founded statement that helps engineers, we would first have to clarify what complexity is or what simplicity is. No, "I know it when I see it" won't work here. To clarify, I am sure that many people will say that C is a simpler language than Racket. If we follow the article's recommendation, then we should use C. But as you know, C lacks safety and memory safety, and these gaps seriously impede software development and maintenance. The lack of safety means that you never know whether the output of a C program is serious or whether it's some random bits from some place in memory interpreted as, say, an int. The lack of memory safety in particular destroys modularity. Every piece of dynamically allocated memory handed over from one component to another must be tracked and accounted for.

Not every language that is more complex than C will reduce the cost of software construction and maintenance. To wit, C++ started out as a more complex variant of C and to this day it is 'sold' that way even though it moved away from its roots over the past 10 years. Its complexities introduce seriously deeper safety problems, which measurably impact software construction and maintenance. IBM believes that this cost is a factor of 3x to 5x when compared to Java, another language that is definitely more complex than C. Kathy Bohrer, a Rice grad from around your time, ran the San Francisco project in C++, switched to Java, and convinced a lot of people at IBM to measure this cost. A few years later the company switched all software to Java for these reasons. This is not to say that Java is good; but it does say that they actually measured cost, compared, and went from simple and complex languages to other complex languages. If I were a senior software dev manager at a company, I would pay attention to someone who measures and compares instead of someone who writes content-free polemics that actually sound correct.

The toggles example from the article is apt here. The lack of safety in C means that there is no isolation and every line in a system may potentially affect the behavior of every other line -- just like the toggles/switches mentioned. But now imagine that the guy had first built a box around the first two switches so that only their relevant behavior is visible -- say two states -- and then added a third one. In that case, the interactions would be fine.
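
A minimal sketch of the boxing idea, in Python. The class names and the choice of "both switches on" as the box's relevant behavior are hypothetical illustrations, not taken from the article. Three bare switches expose 2^3 = 8 joint states to every client; boxing the first two behind a two-state interface leaves clients with 2 x 2 = 4 observable states.

```python
from itertools import product


class Switch:
    """A bare on/off toggle whose state every client can read and flip."""

    def __init__(self, on=False):
        self.on = on


class Box:
    """Hypothetical box around two switches; only a two-state summary is visible."""

    def __init__(self):
        # Internal switches. The leading underscore is a Python convention,
        # not an enforced barrier: clients are expected to use `active` only.
        self._a = Switch()
        self._b = Switch()

    def set(self, a_on, b_on):
        self._a.on = a_on
        self._b.on = b_on

    @property
    def active(self):
        # Assumed "relevant behavior": the box is active only when
        # both internal switches are on.
        return self._a.on and self._b.on


def visible_states_unboxed():
    """All joint states a client must consider for three bare switches."""
    return set(product([False, True], repeat=3))


def visible_states_boxed():
    """All joint states a client can observe: (box.active, third.on)."""
    seen = set()
    for a, b, c in product([False, True], repeat=3):
        box = Box()
        box.set(a, b)
        third = Switch(c)
        seen.add((box.active, third.on))
    return seen
```

Here `len(visible_states_unboxed())` is 8 and `len(visible_states_boxed())` is 4: the internal configurations still exist, but the third switch can only interact with the box's two visible states.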

LESSON 1: simplicity by itself is not an advantage in software development. 

LESSON 2: complexity comes in many flavors, some good, some bad. 

Now let's move on to DSLs. A DSL, like any abstraction, helps you reduce cost if it is well designed and meets your needs. If you don't have a need in existing code for an abstraction, don't build it. If you do have a need, 
 -- understand the abstraction mechanisms of your language 
 -- study the concrete cases of repetition and of extensive verbiage 
 -- use the abstraction mechanisms of your language to create an abstraction that removes your repetitions and extensive verbiage. 
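
The three steps can be sketched with a deliberately small Python example. Everything in it is hypothetical: the record fields, the `check_range` helper, and the `RULES` table are invented for illustration. The first function shows the repetitive concrete cases; the rest uses ordinary language mechanisms (a function plus a data table) to remove the repetition.

```python
def validate_verbose(order):
    """Concrete cases first: the same range check is spelled out three times."""
    errors = []
    if not (1 <= order["quantity"] <= 100):
        errors.append("quantity out of range")
    if not (0 <= order["discount"] <= 50):
        errors.append("discount out of range")
    if not (1 <= order["priority"] <= 5):
        errors.append("priority out of range")
    return errors


# The abstraction: one table of (field, low, high) rules ...
RULES = [
    ("quantity", 1, 100),
    ("discount", 0, 50),
    ("priority", 1, 5),
]


def check_range(record, field, low, high):
    """Return an error message if record[field] lies outside [low, high]."""
    if low <= record[field] <= high:
        return None
    return f"{field} out of range"


def validate(order, rules=RULES):
    """... and one loop that interprets the table."""
    results = (check_range(order, f, lo, hi) for f, lo, hi in rules)
    return [msg for msg in results if msg is not None]
```

Both functions report the same errors for the same order. The abstraction pays off only because the repetition already existed; with a single range check there would be nothing to remove.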

If you work in Java, you don't have good abstraction mechanisms to eliminate domain-specific verbiage from your systems. My recent reading experience with three books on industrial DSL building tools firmly convinced me that current practice can easily flip into counter-productive architecture acrobatics. If you build these DSLs, you may make the system more costly. If you work in Racket, you have great tools for building internal DSLs and you can smoothly integrate modules written in different DSLs. As I said, these thoughts apply to any abstraction mechanism; internal DSLs are just the most powerful form of abstraction.

LESSON 3: if you have bad tools, building an abstraction may increase the cost of building a system. 

LESSON 4: with good tools, you're likely to reduce the cost, but ill-trained programmers can and do mess up. 

Let me finally address internal complexity vs external complexity. When you construct a language like Java, you are actually building an extremely complex system. But the Java community succeeded beyond imagination, and possibly beyond justification, in making their language appear well-layered, well-structured and simple. Guy Steele has stated this as "you have to make things sufficiently complex to make them appear simple" and Matthew has stated similar experiences with the design of Racket again and again. The key is that very few Java programmers experience the complexity that the Java box hides; most only have to cope with the complexities of the language.

Here is an example that is more straightforward. AMPL is an external DSL for writing down mathematical programs (linear, integer, binary, graphs, networks). If you are an applied math person, the only difference between AMPL programs and paper and pencil is that the former uses ASCII. (Perhaps they use Unicode these days.) It is definitely at the right level and reduces the huge cost of maintaining mathematical programs in, say, Fortran, a vastly simpler language than AMPL -- as far as the internals are concerned. AMPL is after all a complex PL, with nifty parsers and abstraction capabilities for plugging in all kinds of solvers. But it neatly separates the solver from the problem statement, for example, which, while internally complex, makes it externally simple.
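
The separation of problem statement from solver can be sketched in Python. This is not AMPL syntax and makes no claim about how AMPL is implemented; the `Problem` class, the `brute_force` solver, and the example numbers are all hypothetical. The point is only that the problem is stated as data and any solver with the same interface can be plugged in.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Problem:
    """A tiny integer program: maximize c.x subject to A.x <= b, 0 <= x_i <= upper."""
    objective: tuple    # coefficients c
    constraints: tuple  # rows of (coefficients, bound), meaning coefficients.x <= bound
    upper: int          # shared upper bound on every variable


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def feasible(problem, x):
    return all(dot(row, x) <= bound for row, bound in problem.constraints)


def brute_force(problem):
    """One possible solver: enumerate every candidate point.

    Assumes at least one feasible point exists; otherwise max() raises ValueError.
    """
    n = len(problem.objective)
    candidates = product(range(problem.upper + 1), repeat=n)
    best = max(
        (x for x in candidates if feasible(problem, x)),
        key=lambda x: dot(problem.objective, x),
    )
    return best, dot(problem.objective, best)


def solve(problem, solver=brute_force):
    """The problem statement never mentions how it is solved."""
    return solver(problem)


# maximize 3x + 2y  subject to  x + y <= 4,  x + 3y <= 6,  0 <= x, y <= 4
example = Problem(
    objective=(3, 2),
    constraints=(((1, 1), 4), ((1, 3), 6)),
    upper=4,
)
```

`solve(example)` returns `((4, 0), 12)`. Swapping in a smarter solver changes nothing in the definition of `example`, which is the external simplicity the paragraph describes.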

LESSON 5: Do not confuse external and internal complexity. A proper separation may vastly decrease the cost of creating programs. 

Very last thought on 'complexity.' All of the above assumes that the time of programmers is more costly than what it takes to run the computation. I know of cases where this is not true. In that case, you might be better off sacrificing programmers on the altar of machine time. The cases I know are extremely rare and involve expensive programmers who understand a lot of continuous mathematics and who work with computers that are extremely expensive to use. 

;; --- 

As for Racket: yes, it is good at DSLs because we had to pay attention to building languages for our original educational goals. But for the last 10 years, the focus has widened to include a lot of industrial and quasi-industrial uses. It is no longer the TI Scheme of yore. 

Your mileage may vary. -- Matthias

Posted on the users mailing list.