[racket] Compound mutable list manipulation - 2 qs

From: Stephen Bloch (bloch at adelphi.edu)
Date: Mon Feb 13 14:20:51 EST 2012

On Feb 13, 2012, at 10:06 AM, Rüdiger Asche wrote:

>> To get the best suggestions on alternative data structures you should
>> say a little more about wha your program does and thus needs. However
>> if all you're interested in is adding to the back and removing from
>> the front, that is a queue, you can implement one using two lists
>> without mutating the lists. One list represents the front and you can
>> cdr down it efficiently, the other list represents the back in reverse
>> and you cons elements onto it to add elements to the back of the
>> queue. Whenever the front list becomes empty you reverse the back list
>> to form the new front and the back becomes empty. This way you get a
>> queue with O(1) amortized enqueue and dequeue operations.
> That's an interesting approach, thanks for suggesting it, Marijn. I just wonder about the penalty of the reversal part and how it figures in?

The reversal should take linear time, and be performed only every time the whole queue is traversed -- as Marijn said, O(1) amortized cost.  Whether that's significant or not depends on how much work you typically do with a queue element: if all you're doing is counting queue elements, the added work of this data structure is comparable to what you're already doing, but if you're doing a lot of computation on each element, the added work is insignificant.

> ... I guess that it could be more or less neglectable provided there are few queues in which there are typically more adds than removals. However, in the application I'm working on (it's actually an emulation of an operating system using queues to manage task schedulers, event managers, interrupts and so on) there is no prediction whatsoever about the runtime behavior of the queues - there'll be nested queues, large queues, small queues, frequently accessed vs. mostly idle queues, queues more heavily removing than adding items and vice versa and so on.

And no matter which of these you look at, it's still O(1) amortized cost.  Amortized cost isn't an appropriate measure if you're worried about realtime response, but for a discrete-event simulation with lots of events, it's perfectly appropriate.

> So in some of the queues there may be hefty swapping of queue fragments and efficiency may again become an issue.

I don't see that.  In the suggested data structure, you never copy a large fragment, only pointers; the only time you do ANYTHING to a large fragment is the occasional reversal.  And the longer the queues, the less frequent this is.

> Another thing is that again quite some heavy burden is left on the garbage collector in this architecture which I'd like to avoid.
> I'll implement both techniques (as Matthias pointed out, a good interface definition allows for easy replacement) and see if I can run some meaningful comparison tests.

Data are good.  It would be very interesting to see whether the burden on the garbage collector really IS as heavy as you think.

"Premature optimization is the root of all evil" -- somebody somewhere

Stephen Bloch
sbloch at adelphi.edu

Posted on the users mailing list.