[plt-scheme] Question about variable scope in lists

From: Joe Marshall (jrm at ccs.neu.edu)
Date: Mon Feb 28 13:58:37 EST 2005

A00453721 at itesm.mx writes:

>   For list-related administrative tasks:
>   http://list.cs.brown.edu/mailman/listinfo/plt-scheme
>
> Greetings. First of all, I would like to clarify that, while I am a CS student,
> I'm not explicitly asking for help on my assignment. I would just like an
> explanation of why is this happening:
>
> Currently, I'm investigating the effects of destructive modifications on
> variables passed to procedures using Dr. Scheme.

Scheme is a `call-by-value' language.  Variables are *never* passed to
procedures, only the *value* of the variable is passed.  (In fact,
variables themselves are not denotable objects in Scheme.)

> While reading the "Teach yourself Scheme in fixnum days" study book, I found
> this destructive code which reverses a list "in-place".
>
> (define reverse!
>   (lambda (s)
>     (let loop ((s s) (r '()))
>       (if (null? s) r
>       (let ((d (cdr s)))
>             (set-cdr! s r)
>         (loop d s))))))
>
> However, I tried it in the interactions windows like this:
>
> ( define U '( 1 2 3 4 5 6 7 8 ) )
> ( reverse! U )
>
> Not to my surprise, the output was:
>
> (8 7 6 5 4 3 2 1)  
>
> but when I asked the interpreter to reevaluate U, it just displayed:
>
> (1)

Details are important here, so first let me introduce some terms:

  An `identifier' is word in your program. `lambda', `s', `U', and
  `define' are all examples of identifiers.

  Some identifiers, like `U', `s', and `reverse!', refer to
  variables.  Other identifiers, like `define', `lambda', and `let',
  indicate special syntactic forms.

  As the program executes, variables become associated with values.
  Depending on the context, a variable may be associated with many
  different values *at the same time*.  (Consider a recursive
  program.  Each `active' or pending recursive call will have its own
  set of values that are private to that particular invokation.)

  A `binding' is an `instance of association' between a variable and a
  value.  Every time a procedure is called, a new set of bindings is
  created to hold the values of the formal parameters of the procedure.

In a `call-by-value' language, the *values* of variables are passed to
a procedure and new bindings are created.  In a `call-by-name'
language, the *bindings* of variables are passed to a procedure --- no
new bindings are created.

To give a simple example, consider this Scheme program fragment:

   (let ((i 42))
     (foo i)      ;; foo is a procedure
     (display i)  ;; will print 42
     )

The procedure FOO will be passed the *value* of `i' (42) rather than the
*binding* of i.  There is no way to write procedure FOO such that it
will modify what i is bound to.  An attempt like this:

(define (foo x)
  (set! x 69))

will fail because when foo is called, a new binding of X (to the value
42) will be created.  The SET! will modify the new binding, but the
binding that associates i with 42 is unchanged.

When we start manipulating more complicated data structures, things
can get confusing.  It is important to distinguish `modifying a
variable' from `modifying a mutable, shared, aggregate data
structure'.

In your `reverse!' example above, there is no modification of any
variable at all.  This can be easily seen because there is no SET!
anywhere (and no redefinition).  So what is being modified?  A LIST is
an aggregate data structure built out of mutable PAIRs.  Either or
both of the CAR and CDR fields of a pair can be changed.

When we 
  (define U '(1 2 3 4 5 6 7 8))

we are actually defining U to be the first PAIR in a list of 8
elements (the remaining PAIRs can be accessed via following the chain
of CDR fields).

Since Scheme is a call-by-value language, and since furthermore there
is no modification of any variable in the example, the variable U will
always and eternally be bound to that first PAIR, *NO MATTER WHAT WE
DO*.  In fact, since there is no call to SET-CAR! anywhere in sight,
we can be sure that the CAR field of U will always be 1.

We can now explain the behavior you saw:  U is defined to be PAIR.
This PAIR happens to be the first element in an eight element list.
We call REVERSE! to `destructively reverse the list'.  The REVERSE!
function works by modifying the CDR fields of the PAIRS in the list.
As part of this process, the first PAIR will become the last pair, so
its CDR is changed to be the empty list.  The reversed list is
returned.

The binding of U, however, is not --- and cannot --- be changed.  It
*still* refers to what was that first pair.  The REVERSE! procedure
changed the CDR field of the pair that U refers to, so you can no
longer get at the remaining elements of the original list, but the
association between U and the PAIR it was originally bound to is
unchanged.

> Similarly, I studied a predicate removal function from GUILE Scheme, defined
> like this:
>
> CODE
> (define (delete-if! pred l)
>   "Destructive version of `remove-if'."
>   (let delete-if ((l l))
>     (cond ((null? l) '())
>       ((pred (car l)) (delete-if (cdr l)))
>       (else
>        (set-cdr! l (delete-if (cdr l)))
>        l))))
>
> And again, when I tested it in the interactions window like this...
>
> ( define R '( #f #t #t #f #f #t ) )
> ( delete-if! not R )

We can predict at this point that R, which is bound to a PAIR, will
still be bound to that PAIR.  Furthermore, the CAR of that PAIR will
still contain #F.

> The output was:
>
> (#t #t #t)
>
> but when I reevaluated R, I got:
>
> (#f #t #t #t)

As per our prediction.

> Aren't these destructive functions supposed to have modified the contents
> of both U and R?

Nope.  The destructive functions modify the data structures that are
accessible via U and R, but they cannot modify the binding of U and R.

> Or is there anything I'm getting wrong about variable scopes? I ask this
> mainly because I'm having problems with an "in-place" sorting algorithm I
> was asked to develop. 

I'll hazard a guess that you need a way to `swap' the values of two
variables.  There is simply NO WAY to do this via a procedure call in
Scheme.  There are, however, many ways to achieve the desired effect.

  - Use a macro.

    (define-syntax swap! 
      (syntax-rules ()
        ((swap! left right) (let ((temp left))
                              (set! left right)
                              (set! right temp)))))

  - Place the values in a mutable data structure:

  (let ((index1 (make-cell x))
        (index2 (make-cell y)))
    ....
    (swap-cells! index1 index2)
    ....
    )

   (define (swap-cells! left right)
     (let ((temp (cell-ref left)))
       (cell-set! left (cell-ref right))
       (cell-set! right temp)))

    But now you have to use CELL-REF everywhere for your indexes.

   - Fake up call-by-name via closures. Assuming that index1 and
     index2 are lexically apparent variables:

   (swap! 
          ;; A pair of procedures that mimic call-by-name.
          (lambda () index1)
          (lambda (new-value) (set! index1 new-value))

          ;; Another.
          (lambda () index2)
          (lambda (new-value) (set! index2 new-value)))

    (define (swap! read-left write-left! read-right write-right!)
      (let ((temp (read-left)))
        (write-left! (read-right))
        (write-right! temp)))

    This appears ugly, but it is the most general solution. For
    instance, you could swap a variable and an array slot:

    (swap! 
           (lambda () var)
           (lambda (new-value) (set! var new-value))

           (lambda () (vector-ref vec index))
           (lambda (new-value) (vector-set! vec index new-value)))

    Some macros will hide the ugly syntax, and a compiler that can
    inline procedures will make this perform quite well.

       



Posted on the users mailing list.