[racket] Compound mutable list manipulation - 2 qs
On Feb 13, 2012, at 6:33 AM, Rüdiger Asche wrote:
> 1. I understand that mutable pairs are considered somewhat on the dark side of Scheme and Racket, and I can see why, but in my application I can't afford to a) permanently recur down lists to find the tail and b) permanently
(I am not sure what 'permanently' does here but I take it is a translation of 'ewig' or something like that. I'd have used 'always' or something like that.)
> rebuild lists (if for example only one small element changes in the sublist of a complex list, there is no justification for rebuilding the entire list from scratch) and let the garbage collector take up all the work of digesting the leftovers. If Racket has some "cleaner" way to manipulate large lists with a reasonable degree of efficiency, I'd be thankful for pointers. Also if there are adverse side effects to this design, I'd be interested in learning about them.
Mutable cons cells come with Racket so they are perfectly fine. BUT, it is correct that we separate immutable cons cells from mcons.
As Marjin points out, you are not using a list per se but you are implementing a queue data structure with cons cells. And he also correctly reminds readers of the list that there are functional approaches to queue implementation that (in the measured reality of microbenchmarks) come close to imperative versions. For sample code, see galore and especially pfds on http://planet.racket-lang.org/ . There are also queue implementations on planet, though read on:
Having said that, I will admit that my own inclination would be to implement the queue imperatively if it plays such a central role. But before I did so, I would conduct performance measurements that mimic the anticipated use. I would also make sure that my API is generic and that I adhere to the API in the rest of the program. The goal is to be able to replace the queue with a better implementation w/o affecting the client code.
And that brings us to your second question.
> 2. I am looking for an abstraction that makes the use of this mechanism explicit in the code (iow, by looking at the code, one should be able to determine where this mechanism is used and clearly distiguish it from usages of other data types). In C++, one would typically use classes, but I'm not sure whether Racket classes allow the same degree of control over the internal representation of the data in it. Other choices would be syntactic abstractions (which wouldn't prevent improper usage of functions defined on the data, but I think I could live with that) or structures. What is the most rackety way to go about that kind of thing?
You have several options of abstraction, where 'abstraction' presumably means 'encapsulation so that others cannot see how I implemented it' -- two notions easily and almost permanently confused in CS.
1. If your application is a small layer around the queue and if the queue has a thin interface (enq, deq) and if it all fits in a module (say a few 100 lines of code), use opaque and mutable structs. Good enough.
> (struct queue (head tail) #:mutable)
> (define q (queue '() '()))
> q
#<queue>
> (set-queue-head! q 10)
> q
#<queue>
> (queue-head q)
10
2. If you have a broad API to your queue and it still fits all in one module, I'd use a class with private fields. A class is an opaque kind of data in our world (and yes, deep down it is just an opaque struct), but it comes with methods:
> (define queue% (class object% (define head '()) (define tail '()) (define/public (enq v) (set! head v)) (define/public (front) head) (super-new)))
> (define q (new queue%))
> (send q enq 5)
> (send q front)
5
3. If the program is large, break it into several modules and implement the queue according to 1 or 2, exporting either all the functions or the class. I'd probably go with 1, if my application used a single queue and with 2 if I needed more than one.
In this case, I'd still use some contracts to enforce the API despite any performance needs.
-- Mattihas