[plt-scheme] to define, or to let

From: Bradd W. Szonye (bradd+plt at szonye.com)
Date: Sun Mar 21 18:33:52 EST 2004

Bradd wrote:
>> In any function or expression, some operations require a specific
>> sequence, and some don't. For example, in (* (+ a b) (+ c d)), the
>> additions must finish before the multiplication, but it doesn't
>> matter which addition you perform first.
>> 
>> In my opinion, it's very useful to encode that information explicitly
>> in the code. It helps code reviewers and maintainers to understand
>> control flow in the function, which ultimately helps them evaluate
>> and modify the code. One of the things I like about Scheme is that it
>> supports this

Paul Schlie wrote:
> Understood, however unambiguous evaluation semantics don't preclude an
> implementation's ability to parallelize the evaluation of code which
> has been formally determined not to have interdependencies.

But that just trades one ambiguity for another. It makes the program
deterministic, but it also makes it impossible to write "order doesn't
matter here" in code, which makes life harder for maintainers and code
reviewers. Putting the same information in comments is inferior, because
comments get out of sync with code, and because the compiler can't use
them to optimize code generation.
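
A minimal sketch of the distinction, assuming R5RS semantics (where the
order in which LET evaluates its init expressions is unspecified, while
LET* evaluates them left to right). The procedure names are hypothetical
and chosen only for illustration:

```scheme
;; Hypothetical helpers, for illustration only.

;; LET: the two squares are independent of each other, and the
;; construct says so. Under R5RS the order of the inits is unspecified.
(define (hypotenuse-squared a b)
  (let ((a2 (* a a))
        (b2 (* b b)))
    (+ a2 b2)))

;; LET*: each binding depends on the one before it, so the
;; left-to-right sequence is part of the meaning.
(define (scaled-sum a b k)
  (let* ((sum (+ a b))
         (scaled (* k sum)))
    scaled))
```

A reader who sees LET knows at a glance that the bindings can be read in
any order; LET* signals that the sequence matters.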

> Relying on the code's author to give an implementation the liberty to
> evaluate code in any order deemed to be convenient regardless of
> potential ambiguities which may result, seems more like a license to
> produce a shoddy implementation without the necessity of analysis that
> would otherwise be required to guarantee unambiguous results, which
> every programming language should be specified to require, otherwise
> its broad usefulness is likely questionable.

First, it's not just about compiler optimizations. The ability to encode
control-flow information is useful to humans too.

Second, if the programmer encodes a program's sequencing requirements
incorrectly, it's a programming error, not a problem with the language
definition. It's no different from coding the arguments to APPEND in the
wrong order: both errors are difficult to detect with static analysis,
and both are programming errors, not language defects.
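
As a small illustration of the APPEND comparison (the procedure names
here are hypothetical), both definitions below are valid Scheme, and no
static check can tell which one the programmer intended:

```scheme
;; Hypothetical helpers, for illustration only.

;; Intended behavior: put the suffix after the items.
(define (add-suffix items suffix)
  (append items suffix))

;; Same call with the arguments swapped. It is still valid code,
;; but it puts the suffix first.
(define (add-suffix-swapped items suffix)
  (append suffix items))
```

Here (add-suffix '(1 2) '(3)) yields (1 2 3), while
(add-suffix-swapped '(1 2) '(3)) yields (3 1 2).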

Third, the analysis necessary to optimize while still guaranteeing
deterministic results is very expensive. It's an undecidable problem at
best (if you have access to the whole program), and it's impossible at
worst (if you use separate compilation). Furthermore, a competent
programmer really should know in advance whether sequence matters. Why
throw that information away, and then try to recover it with undecidable
static analysis? That doesn't make sense.

Fourth, static analysis can't always determine whether order matters.
For example, consider (+ (read file1) (read file2)). A deterministic
optimizer can't reorganize that, because the subexpressions have side
effects. But a human can easily determine that sequence doesn't really
matter here. An optimizer just doesn't have enough information to make
the decision, even though a human could make it easily.
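
The two ways of writing that sum can be sketched as follows, assuming
R5RS semantics for operand evaluation order and two independent input
ports. The procedure names are hypothetical:

```scheme
;; Hypothetical helpers, for illustration only.

;; Operand order is left to the implementation: the code itself says
;; that it does not matter which port is read first.
(define (sum-two-ports port1 port2)
  (+ (read port1) (read port2)))

;; Order fixed by the programmer: port1 is always read before port2.
(define (sum-in-order port1 port2)
  (let* ((x (read port1))
         (y (read port2)))
    (+ x y)))
```

With string ports (OPEN-INPUT-STRING from SRFI 6, assumed to be
available), (sum-two-ports (open-input-string "1") (open-input-string "2"))
returns 3 regardless of which operand is evaluated first, because the two
ports are independent. If both arguments were the same port, only the
second form would have a defined reading order.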

Fifth, you can eat your cake and have it too. If you do actually have a
static analyzer that can detect "unsafe" side effects, you can use it to
spot potentially unsafe side effects in non-sequential constructs. That
way, you can encode control-flow information directly in the program
and have confidence that there are no defects related to side-effect
ordering.

> With respect to the below code being "correct" in the presence of its
> presently allowable ambiguous behavior is the "problem" with scheme's
> present definition; as it truly only enables the specification of
> ambiguous code ....

So what? Scheme also "allows" you to code infinite loops, off-by-one
errors, and many other bugs. If you misuse the tools that Scheme gives
you, it's a programming error, not a defect in the standard.

> with no redeeming benefits which couldn't be safely achieved through
> appropriate analysis as is now done in most modern compilers.

The ability to encode control-flow information in the program itself
is a significant benefit, especially to humans who read the code.

> All in all, it seems quite reasonable to me to require that scheme's
> language specification not enable the specification of ambiguous code,
> especially when it is possible to achieve similar benefits
> unambiguously.

I personally dislike the ambiguity that arises from having no "sequence
doesn't matter" constructs. When one reads programs in that kind of
language, it's difficult to determine where sequence is important. I
believe that this conceptual ambiguity is much riskier than the kind of
ambiguity you're arguing against.
-- 
Bradd W. Szonye
http://www.szonye.com/bradd


Posted on the users mailing list.