[plt-scheme] Re: to define, or to let

From: Anton van Straaten (anton at appsolutions.com)
Date: Sat Apr 10 03:20:03 EDT 2004

>    > why is a fixed order of evaluation safer?
>
>    It's safer than an unspecified order, in the presence of side
>    effects.  If you have side effects in an expression with an
>    unspecified order of evaluation, then the expression tends to have
>    multiple possible meanings, i.e. its meaning is ambiguous.  That's
>    the semantic ambiguity I mentioned.  It means you can't rely on the
>    program's behavior, hence it is unsafe.
>
> Anton, if that's true, then Scheme was an unsafe programmers before
> MzScheme.  Do folks believe that?

No.  R5RS explicitly allows for implementations to fix the order of
evaluation, and it's expected that implementations will do that (although
they don't have to).  Most implementations of Scheme do fix an order of
evaluation, and document that order.  Note that this doesn't mean you should
rely on that order in function application and in expressions like LET and
LETREC; doing so is not standard Scheme.

> I thought unspecified order just meant that in an expression
> (f x), that side-effects given by evaluating f and evaluating
> x shouldn't collide.

As a programmer, writing code to execute in unspecified order, what you've
described is a constraint that must be respected, if you want your code to
have repeatable behavior.  The problem is that in general, determining
whether or not a program respects this constraint is not so easy, and in
particular it's not possible to do statically.

> I think we're in agreement, Anton.  Your semantic ambiguity is real
> but bites the interpreter writers much harder than the programmers.

Not sure it can be summarized so briefly.  The alleged advantage to
programmers of a fixed order of evaluation is that even if they make a
mistake and have an order dependency that they didn't intend, the program
will still behave consistently.  If that consistent behavior seems to
conform to the program's desired behavior, then the order dependency may be
allowed to remain, unrecognized.  This is the real world consequence of the
program having an unambiguous semantics - it's good because you get
consistent behavior, but it's not all good: unfortunately, such a situation
pretty much qualifies as a latent bug, or at best, fragility in the program.

 From a semanticist's point of view, they don't care, because with fixed
evaluation order the program has a well-defined behavior, even if it
achieves that behavior partly "by accident".  Semantics is not usually
concerned with the intent of the programmer, only with what the programmer
actually ended up encoding in the program.

That's the precise reason there's some controversy around the issue of order
of evaluation in general (not just letrec): if you take a narrow view of a
program, that it only exists to communicate an algorithm to a computer, then
programs that are correct by accident are fine, and you can happily analyze
their unambiguous semantics.  However, if you're concerned with a broader
notion of correctness, as engineers often are (or an enlightened manager),
then programs that are correct by accident can be very disturbing.  In the
real world, such bugs can be very costly, in all senses of the word.

Fixing order of evaluation throughout a language introduces a classic case
of "moral hazard" (http://en.wikipedia.org/wiki/Moral_hazard): if
programmers no longer have to pay attention to whether their code depends on
evaluation order, chances are many of them won't.  If the program seems to
work, it's good enough.  Programmers may end up with incorrect mental models
of the functioning of their programs - for example, they might not realize
that in an expression like (+ (f1) (f2)), that it only works as expected
because of the interaction between side effects performed by the two
functions.

One way to guard against this, even in a language with fixed order of
evaluation, is to require that programmers use special constructs to
indicate that code is depending on order of evaluation (like let*, letrec*,
and 'begin'); the rest of the time, they use constructs which indicate that
evaluation order is not depended on.

Some semanticists are puzzled by this, because it effectively introduces
into programs, assertions that are statically uncheckable.  The above
example wouldn't necessarily be directly "caught" by such an assertion.
However, what the assertion does do is communicate and encode the
programmer's intent about order of evaluation, which in practice, serves as
an effective way to (a) remind programmers not to depend on order of
evaluation in inappropriate contexts; (b) communicate a programmer's intent
to their future selves, and to other programmers, who may notice violations
and question them;  and (c) make it possible for a tool to run tests for
inappropriate order dependencies.

A semanticist might object that a tool to check this property is impractical
in theory, because you have to test the program with every possible order of
evaluation for every argument in every function call in the program.  In
practice, though, such bugs tend to show up very quickly, so practical
testing of this issue is actually very viable.  C and C++ programmers have
decades of experience with this issue, since both languages have an
unspecified order of evaluation in function application and (for the most
part) in expressions.

It's quite obvious that the techniques I've described are still not a
perfect solution to the problem of unintended order dependencies.  However,
ceteris paribus, it cannot be argued that *not* checking for such
dependencies is safer than making an attempt to check for them.  And you
can't check for them unless you've indicated where they should be checked
for.  Logic and reason are entirely on the side of those who want to be able
to make this assertion.

It should also be kept in mind that the sequencing of side effects are
fundamentally trickier to reason about than referentially transparent code.
In functionally-oriented programs in particular, you want to be able to
identify places which depend on side effect sequences.  Saying that
throughout a program, you can depend on side effects wherever you like,
without any indication, seems like a counterproductive decision for a
functionally-oriented language.

>    However, if you're going to have a functionally-oriented language,
>    and you want to use it functionally, then the ability to
>    communicate which bits of your program are intended to be
>    functional (in the side-effect-free sense) and which aren't, is
>    useful.  That doesn't mean you can't have a fixed evaluation order,
>    but it does mean you need constructs that are explicitly intended
>    for code that doesn't rely on evaluation order.  Without that,
>    you're forced to write code that says more than you mean to say.
>
> That's subtle.

Yes, apparently it is!  :)

> Should we really ask, "What's the complete meaning of
> this program I just wrote?"  If so, the ambiguous R5RS DS is a
> problem.

The ambiguous R5RS DS is better described as incomplete, by design.
Specifically in this case, it doesn't provide definitions for 'permute' and
'unpermute'.  All you need to do is provide a suitable definition for these
functions, and you can have unambiguous semantics.  However, note that this
should not be taken as a license to use function application, LET and LETREC
as a way to sequence side effects.  That's not their purpose, and using them
for that purpose puts the user in the same category as someone who drives
less safely because of having a good auto insurance policy.

> It seems better to ask, "What's does my program mean to me?"
You have to ask both "what's the complete meaning of this program I just
wrote?" as well as "what does my program mean to me?", and ensure that the
two match up.  The text of a program encodes meaning intended for the
computer as well as for the humans who wrote it and work with it.

> Code-reusability is a huge problem, and indeed, MF's crew has
> mathematical ideas about this as well...

This is more about reliability than reusability, I think, but PLT is
interested in all those good things, as am I.

Anton



Posted on the users mailing list.