[plt-scheme] Scheme and R

From: Neil Toronto (ntoronto at cs.byu.edu)
Date: Thu Mar 26 18:17:39 EDT 2009

Eli Barzilay wrote:
> On Mar 26, Neil Toronto wrote:
>> As a language it's rather weak and inconsistent. Sungwoo Park's
>> analysis is good, so I'll defer to him:
>>
>>      http://www.postech.ac.kr/~gla/paper/R-ism-dec-8.ppt
> 
> If anything, reading this made me appreciate R more than my previous
> vague impression.  Specifically, that criticism reads very obviously
> as an ML advocate criticising (PLT) Scheme.

I got that impression as well, though R does have problems in the 
*areas* he points out. It rather reminds me of Wadler's critique of 
Scheme ("Why Calculating is Better than Scheming"), where the problems 
he pointed out were often problems for different reasons than the ones 
he gave.

> Even more specifically:
> 
> * We have a `void' value that is the common result of side-effect
>   functions[1] -- we even have (surprise) a `when' expression, and
>   one-sided `if's.  (It's not clear to me whether R implements this
>   using a void value or using something like (values) -- but the
>   choice between the two is irrelevant.)

In this case, he finally got to the real problem. It's not so much that 
"if" doesn't require an else branch or that a single-branch "if" returns 
NULL, but that the designers decided to make some uses of it "just work" 
by defining operations like this:

     > paste("Hello", NULL, "there", sep="|")
     [1] "Hello||there"

So string-appending NULL appends nothing. But adding a number to it 
returns this:

     > NULL + 1
     numeric(0)

where "numeric" creates a vector of zeros of the given length. The 
zero-length vector acts like a black hole in further math operations:

     > numeric(0) + 1
     numeric(0)
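
(For reference, numeric(3) is just a vector of three zeros, and the 
black hole swallows whole expressions, not just a single addition -- a 
quick check in a stock R session:)

     > numeric(3)
     [1] 0 0 0
     > (numeric(0) + 1) * 100 - 5
     numeric(0)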

The shorter vector is recycled when one argument is longer than the 
other, with a warning if the longer length isn't a multiple of the 
shorter:

     > c(1, 2) + c(10, 20, 30, 40)
     [1] 11 22 31 42
     > c(1, 2, 3) + c(10, 20, 30, 40)
     [1] 11 22 33 41
     Warning message:
     In c(1, 2, 3) + c(10, 20, 30, 40) :
       longer object length is not a multiple of shorter object length

(where "c" creates flat vectors) but the zero-length vector is treated 
specially with no warning or error at all:

     > numeric(0) + c(10, 20, 30, 40)
     numeric(0)
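
This means a whole computation can silently collapse to nothing. A 
contrived but representative example:

     > sum(numeric(0) * c(10, 20, 30, 40))
     [1] 0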

I can see a train of thought leading to this overly forgiving and 
inconsistent state of affairs, and I think it derailed as soon as it set 
out.

> * We have implicit boxing that allows `set!'.  It's debatable whether
>   this is better than forcing an explicit `box' in the code (as ML
>   does), but again, with modules turning these into local boxes, the
>   difference is almost cosmetic[2].
> 
> * Note that the "No Lexical Scoping" criticises the R repl in a way
>   that applies to any Scheme repl.  The only difference is the `set!'
>   implying a definition (which is not a good idea in any case).  Even
>   more importantly, it highlights the `let*' semantics of the OCaml
>   repl -- which is very problematic in itself (up to the point where I
>   just gave up on using the OCaml repl for anything more than simple
>   testing)[3].  The "Special Top Level?" slide is bogus for the same
>   reason.

I agree about the top-level/repl thing, and yes the slide is bogus.

I just realized why I missed a lot of the bogus arguments: it appears 
that Park attacked the semantics of R without understanding what they 
are. I missed that because I'm already familiar with R and have already 
seen the problem areas he points out, along with specific examples of 
the sweeping generalizations made on slide 26 (evidence of too many 
complex/special cases).

Back to lexical scope. To get around = creating new bindings, which 
makes it difficult to mutate variables in an outer scope, R has a 
special assignment operator <<- that assigns to the nearest enclosing 
scope that already has a binding for the variable, and creates one in 
the global environment if none does. It's like Python's "global", but 
for any level. Funky.
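
A tiny sketch of both behaviors (make.counter and brand.new are made-up 
names; run in a fresh R session):

     > make.counter <- function() {
     +   count <- 0
     +   function() {
     +     count <<- count + 1   # updates count in the enclosing scope
     +     count
     +   }
     + }
     > tick <- make.counter()
     > tick()
     [1] 1
     > tick()
     [1] 2
     > brand.new <<- 5   # no enclosing binding, so this lands in the global environment
     > brand.new
     [1] 5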

I agree with the rest of your analysis.

Neil

