[plt-scheme] self-referential data structures

From: Marco Morazan (morazanm at gmail.com)
Date: Fri Mar 6 11:42:20 EST 2009

On Thu, Mar 5, 2009 at 10:08 PM, Stephen Bloch <sbloch at adelphi.edu> wrote:
>
> However, there are more complicated self-referential data types that are
> difficult to do without mutation.

I think Stephen means self-referential data definitions (not data types).

> For example, consider
> (define biglist (list 10 9 8 7 6 5 4 3 2 1 0))
> (define partlist (rest (rest (rest (rest biglist)))))
> From "partlist", you can easily get the 5th element of the original list, or
> move forward to the the 6th element and so on... but there's no way to "move
> backward" through the original list to see the 3rd element of the original
> list, because "partlist" doesn't contain that information.

Yes, that is correct. The data definition for lists does not specify
previous elements as a part of a list. Therefore, it comes as no
surprise that we are unable to access the previous elements.

> For some
> applications it's very useful to be able to move both backward and forward
> through a large list, and the usual data structure to do that is called a
> "doubly linked list".  Since the 5th element of a list has to "know" about
> both the 4th and 6th elements, both of which have to "know" about the 5th,
> the usual way to build a doubly linked list is to create the 6th element,
> then create the 5th element referring to it, then MUTATE the 6th element so
> that it knows about the 5th element in turn, then create the 4th element
> referring to the 5th element, then MUTATE the 5th element so  it knows about
> the 4th element, and so  on.  This is difficult to do without mutation.
>

Respectfully, I disagree. This is a poor justification for using
assignment. The data definition for a list not being well-suited for
accessing previous elements does not mean we should immediately turn
to assignment to "save the day." Instead, we should create a new data
definition. Assignment should be used when functions need to
communicate values to each other without calling each other (or in
some cases for efficiency reasons). Implementing the behavior Stephen
describe's (i.e. accessing previous elements) can be done without
assignment. There is no need to introduce assignment for such a task.
Henk has provided you with an example. Here is what I wrote (very
similar to Henk's code) in HtDP Intermediate Language:

;A dlist is a structure, (make-dlist item p ls), where
;item is an X, p is a (listof X) and ls is a (listof X)

(define-struct dlist (item prev next))

; Observers

; dlist-current: (dlistof X) --> X
; Purpose: To return the current item of the given dlist.
(define (dlist-current dl) (dlist-item dl))


; Constructors

; dlist-rest: dlist --> dlist
; Purpose: To return the dlist for the rest of the elements
;               of the given dlist
(define (dlist-rest dl)
  (make-dlist (first (dlist-next dl))
              (cons (dlist-current dl) (dlist-prev dl))
              (rest (dlist-next dl))))

; dlist-previous: dlist --> dlist
; Purpose: To return the dlist for the previous elements
;               of the given dlist
(define (dlist-previous dl)
  (make-dlist (first (dlist-prev dl))
              (rest (dlist-prev dl))
              (cons (dlist-current dl) (dlist-next dl))))

; EXAMPLES

(define dl1 (make-dlist 10 '() '(9 8 7 6 5 4 3 2 1 0)))
(define dl2 (dlist-rest (dlist-rest (dlist-rest (dlist-rest dl1)))))

(check-expect (dlist-current dl2) 6)
(check-expect (dlist-current (dlist-previous dl2)) 7)
(check-expect (dlist-current (dlist-rest dl2)) 5)


So, there you have "doubly linked lists" without using assignment. Is
this really more difficult to develop than an implementation that uses
assignment? This is the type of stuff a first semester undergrad
should be able to do. The only insight they would need is that a dlist
has 3 components instead of 2 like a list (i.e. first and rest).

> However, if you basically like functional programming but need a
> doubly-linked list for one part of your program, you can wrap up the
> mutating doubly-linked list in a "front end" that looks functional, even
> though there's actually some mutation going on behind the scenes.
>

I believe we can agree that there is no need for all that.

We can now ask why almost all Data Structure books immediately jump to
use assignment when discussing doubly linked lists (and almost every
other data structure). It seems to me that most people can not (or are
not aware of how to) separate the interface/specification/definition
of data from the implementation of the
interface/specification/definition. This is an essential skill that is
poorly understood judging from the majority of published textbooks.
HtDP is a good first step in learning how to separate interfaces from
implementation.

Aditya -- don't even think about assignment until you reach the last
two parts of HtDP. You will discover that programming culture in
general is unnecessarily contaminated with abuses of assignment. Yes,
assignment is useful in some situations, but it is not needed at all
in a great number of situations. Although it is not explicitly stated,
that is a lesson, as a person with previous programming experience,
you will walk away with from going through HtDP.

-- 

Cheers,

Marco


Posted on the users mailing list.