[plt-scheme] to define, or to let

From: Anton van Straaten (anton at appsolutions.com)
Date: Tue Mar 23 04:34:01 EST 2004

Paul Graunke wrote:
> I've been trying to ignore this thread, so I apologize if I'm repeating
> something already said.  Maybe I should have kept ignoring it since
> my comments are probably not going to change anything anyway.

Heh heh, suckered you in.  Welcome.

> I just want to say that I do not understand why anyone would want
> to program in a language without a well-defined meaning for each
> program.

I ended up discussing this some offline, and so I can cut and paste
something which addresses this.

First, correct programs still have well-defined meanings, so really, you're
talking about an issue of error detection.  Only incorrect programs will
produce wrong answers or random stuff.  By "correct", I mean that relying on
order dependencies in constructs which have unspecified eval order is
obviously incorrect.  Anything else is correct, re this issue.  Strangely,
regardless of the theoretical horror of such a concept, in practice it's not
a big deal - C and C++ programmers have done this for decades, and it's
really easy to avoid writing incorrect code in this respect.

OTOH, if you fix evaluation order throughout the language to something
useful, there's nothing but your own voluntary restraint to prevent you from
taking advantage of it, which makes it possible to write valid code which is
inherently fragile, and not necessarily obviously so - because side effect
ordering in any expression becomes acceptable, so you can't tell when they
belong and when they don't.

Code which performs order-dependent side effects within function
applications and other expressions that are not specifically designed for
sequencing is inherently fragile, because there are two sets of essentially
unrelated constraints that have to match up: the order of evaluation of the
expression, and the required order of side effects.  Someone making a minor
rearrangement to such an expression can easily break it.  That's fragile.  I
probably don't need to explain this to people who understand the dangers of
side effects.
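
A minimal sketch of the fragility described above, in Scheme.  The names
`counter`, `next!`, `fragile-pair` and `sequenced-pair` are hypothetical,
made up for illustration; the only facts relied on are that R5RS leaves
operand evaluation order unspecified and that LET* binds in sequence.

```scheme
;; Hypothetical stateful helper: each call returns the next integer.
(define counter 0)
(define (next!)
  (set! counter (+ counter 1))
  counter)

;; Fragile: the two operands carry order-dependent side effects, but the
;; order in which operands of an application are evaluated is unspecified
;; in R5RS, so the result depends on the implementation.
(define (fragile-pair)
  (list (next!) (next!)))

;; Explicit: LET* evaluates its bindings in sequence, so the required
;; order of effects is stated in the source rather than implied by the
;; layout of an expression.
(define (sequenced-pair)
  (let* ((a (next!))
         (b (next!)))
    (list a b)))
```

Swapping the two operands in `fragile-pair` looks like a harmless
rearrangement; swapping the two bindings in `sequenced-pair` visibly
changes a sequence that the construct exists to express.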

But here's the real point, which leads to the answer to your question: it's
not so much that I specifically want to program in the sort of language you
describe, but rather that I want to be able to express when I mean to use
eval order dependencies, and when I don't.  If you take that capability away
from me, by fixing eval order throughout the language, then I'll answer your
question by saying that I see a tradeoff, between being able to express what
I want to express and dealing with the practically trivial issue of avoiding
the resulting potential for ambiguity, vs. having useful information removed
from the source, not being able to express what I want to express, and
having certain kinds of order dependency bugs actually become harder to
identify (due to lack of expression of intent).

Before arguing with any of the above, try my thought experiment:

Imagine a language which has duplicate sets of constructs for everything,
including function application, binding, etc.  One set of constructs
guarantees left-to-right evaluation; the other also guarantees unambiguous
evaluation, but has an order of evaluation that makes it extremely difficult
to abuse the constructs to achieve desirable side effect ordering (perhaps
using a formula based on things like function name, number of arguments,
etc. - obviously not realistic, just for the sake of the point.)
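
One way to picture the two sets of constructs, as a rough sketch only.
The macro names `call-lr` and `call-odd`, and the helpers `events` and
`note!`, are assumptions made up for this example.  Plain right-to-left
order stands in for the deliberately unhelpful order; it is not the
formula suggested above, and both macros handle exactly two operands to
keep the sketch short.

```scheme
;; Construct 1: application with guaranteed left-to-right operands.
(define-syntax call-lr
  (syntax-rules ()
    ((_ f a b)
     (let* ((x a) (y b))      ; LET* fixes the order: a, then b
       (f x y)))))

;; Construct 2: application with a fixed but different order.
;; Right-to-left is only a stand-in for "hard to abuse".
(define-syntax call-odd
  (syntax-rules ()
    ((_ f a b)
     (let* ((y b) (x a))      ; b is evaluated before a
       (f x y)))))

;; Hypothetical tracing helper: records a tag, then returns the value.
(define events '())
(define (note! tag value)
  (set! events (cons tag events))
  value)

(call-lr + (note! 'a 1) (note! 'b 2))   ; records a, then b
(call-odd + (note! 'a 1) (note! 'b 2))  ; records b, then a
```

Both forms return the same sum here; only the recorded order differs,
which is the information the choice of construct would convey to a
reader.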

Questions:

1.  This language now has a well-defined meaning for every expression, even
incorrect ones (since you may still incorrectly try to use the weird-order
constructs to try to sequence effects, which is almost certainly incorrect).
Aside from possibly not seeing the need for the extra set of constructs, do
you see any other problems with such a language?

2.  Would you simply always use the L-to-R constructs in this language, even
in cases where you knew order was irrelevant, and avoid the other
constructs?

3.  Even if your answer to #2 is "Yes", can you see why someone might want
to use both?

The point of the above should be reasonably clear.  It's not that
"unspecified" eval order is important per se - as I've said, what I'm
interested in is the ability to express the information contained in both
sets of constructs.  In the absence of a language such as the above, I find
R5RS + real Scheme implementations perfectly reasonable, as long as it's
understood that function application, and LET and friends, are not supposed
to be used to sequence effects, even if the implementation happens to
support that.  In practice, I find that the natural documenting of intent by
use of the appropriate constructs is sufficient to easily avoid
incorrectness in this area, especially when looking at other people's code.
That's actually not the case in Java, where such distinctions are hidden by
the eval order.

> When crap like that happens, one can blame the programmer.  Someone
> I know told me he always writes his code in ANF to avoid this problem.
> Sure, one can claim that a programmer should write only code that
> is independent of evaluation order (barring time and space consumption).
> The question I have then is, how can I tell?  Did I write my code
> so that it is independent of evaluation order, or did I mess up?

I'm talking about using sequencing constructs when that's what you need, and
not using them when you don't need them.  If you depend on sequence in an
expression which does not guarantee a useful sequence, that's an error.

If you use side effects in such expressions, it should be obvious by simple
inspection that they don't have any complicated consequences.  This is good
practice anyway, and it's easy to do in Scheme, especially if you're writing
fairly functional code.  C/C++ programmers do it all the time, too.

> If the compiler writer cannot write a static analysis that can tell,
> how am I supposed to know?

Not every property of a program can be determined by static analysis.  Hey,
we're talking about a dynamically-typed language!

On this issue, if you don't encode the information about the intent related
to sequencing, then you end up with a situation in which *no* analysis -
static or dynamic - can tell when there's a sequencing bug.  As I argued in
an earlier post, what you've done is move this question outside the
domain that the language can say things about - but you haven't prevented
programs from containing sequence dependency errors.

OTOH, if you document the intent, you can at the very least perform a
dynamic analysis.  Besides, documented intent makes potential problems easy
to avoid.
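
As a rough illustration of what such a dynamic analysis could look like:
once an expression is marked as order-independent, a checker can try its
operands in more than one order and compare.  Everything named here
(`order-independent?`, `reset!`, `counter`, `next!`) is hypothetical, and
the check compares only the returned values, not the state left behind.

```scheme
;; Hypothetical stateful helper, used to provoke an order dependency.
(define counter 0)
(define (next!)
  (set! counter (+ counter 1))
  counter)

;; Runs two operand thunks in both orders, calling RESET! before each
;; run to restore shared state, and reports whether the values agree.
(define (order-independent? reset! thunk-a thunk-b)
  (let* ((ab (begin (reset!)
                    (let* ((a (thunk-a)) (b (thunk-b)))
                      (list a b))))
         (ba (begin (reset!)
                    (let* ((b (thunk-b)) (a (thunk-a)))
                      (list a b)))))
    (equal? ab ba)))

;; Pure operands pass the check.
(order-independent? (lambda () 'no-state)
                    (lambda () 1)
                    (lambda () 2))

;; Operands sharing a counter fail it.
(order-independent? (lambda () (set! counter 0))
                    next!
                    next!)
```

A check like this is only possible because the intent is recorded; if
every expression has a fixed, usable order, there is nothing to compare
against.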

> An alternative to analysis is methodical
> construction---i.e. write your programs in ANF or CPS so they
> pin down the order explicitly.

Pinning the order down explicitly where it matters is the issue, and being
able to tell the difference by simple inspection.

> If one argues that this is the
> only reasonable way (or the most reasonable way) to write code,
> then design a subset of scheme that accepts *only* programs in
> that form.

So you're still talking about pinning order down explicitly through the
whole program?  That would miss my point.

> *Any* time a programmer has to maintain a complex
> invariant that depends on code scattered across the source base,
> it is always helpful to have the *system* enforce, check, or
> maintain the invariant for you.  Just think about memory management.

The analogy that makes sense to me is types.  Many of the arguments you and
others have made can be applied to types - in particular, the static
analysis point.  DrScheme lets you write code that will blow up at runtime
with type errors, when it's possible to prevent that.  Yet we use it anyway,
despite the availability of ML and Haskell.

> Sigh,
>
> Paul "don't know how to change people" Graunke

You have to address some issues first; you can't simply wave a magic "issues
which I haven't addressed should go away" wand.  We're really talking about
tradeoffs and choices related to imperfect tools.  And it's not even clear
that your perfect tool would satisfy me, and possibly vice versa.

Anton



Posted on the users mailing list.