[racket] Understanding GC when working with streams

From: Lawrence Woodman (lwoodman at vlifesystems.com)
Date: Mon Sep 9 04:19:32 EDT 2013

On 09/09/13 07:19, Stephen Chang wrote:
> Konrad's exactly right. Your filtered-nums blows up because you named
> the stream and then traversed the entire thing. In general, if you
> hang onto the head of the stream while traversing then the GC can't
> collect anything: since you have a pointer to the head, every
> element of the stream is still reachable.
>
> Compare to a common traversal pattern like:
>
>      (let loop ([s <some stream>]) ... (loop (stream-rest s)))
>
> where the head is dropped on each loop iteration.
>
> Other comments:
> - (stream-length (gen-filtered-nums)) is fine because there's no
> pointer to the head, so the GC collects as you traverse.
>
> - in-range is fine because it's constant space and not a
> cons-cell-based stream. It's more like a generator.
>
> - the for/sum is actually collecting while traversing, just more
> slowly. I'm not exactly sure why; I may look into it. On my machine,
> it got up to 500 MB or so but it finished.
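
To check my understanding of the traversal pattern above, here is a
minimal sketch. The names make-nums and count-stream are made up for
illustration and are not from my original program.

```racket
#lang racket

;; Hypothetical helper: a lazy stream of the integers in [lo, hi).
(define (make-nums lo hi)
  (if (>= lo hi)
      empty-stream
      (stream-cons lo (make-nums (add1 lo) hi))))

;; Count the elements, rebinding s to its rest on every iteration so
;; the loop itself keeps no reference to the head of the stream.
(define (count-stream s)
  (let loop ([s s] [n 0])
    (if (stream-empty? s)
        n
        (loop (stream-rest s) (add1 n)))))

;; Nothing names the head here, so cells that have already been
;; visited should be collectable as the traversal proceeds.
(count-stream (make-nums 0 100000))

;; By contrast, a module-level name keeps the head reachable, and
;; with it every cell forced during the traversal:
;; (define nums (make-nums 0 100000))
;; (count-stream nums)
```

I have not measured memory use for this sketch; it only shows the
shape of the two cases as I understand them.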

Thanks, and to Konrad too.  Your mention of in-range being more
like a generator makes me wonder whether a generator would be a
better choice for sequentially processing large data sets from
databases and CSV files.  What do you think?
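
Roughly what I have in mind, as a sketch using racket/generator.
The names make-row-generator and sum-even-rows are hypothetical, the
integers stand in for rows read from a database or CSV file, and the
'done symbol is just my own end-of-data marker.

```racket
#lang racket
(require racket/generator)

;; Hypothetical stand-in for a row source: yields 0, 1, ..., n-1.
(define (make-row-generator n)
  (generator ()
    (for ([i (in-range n)])
      (yield i))
    ;; Returned once the rows run out, and again on any later call.
    'done))

;; Pull rows one at a time and sum the even ones.  Only the current
;; row and the running total are held at any point.
(define (sum-even-rows n)
  (define next-row (make-row-generator n))
  (let loop ([total 0])
    (define row (next-row))
    (cond
      [(eq? row 'done) total]
      [(even? row) (loop (+ total row))]
      [else (loop total)])))

(sum-even-rows 10)
```

Since each row is requested only when needed and nothing earlier is
kept around, I would expect this to run in constant space, but I
have not tested it on a large data set.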


Lorry

-- 
vLife Systems Ltd
Registered Office: The Meridian, 4 Copthall House, Station Square, Coventry, CV1 2FL
Registered in England and Wales No. 06477649
http://vlifesystems.com


Posted on the users mailing list.