[racket] Cleaner way to work with gzipped data?

From: David Vanderson (david.vanderson at gmail.com)
Date: Fri Aug 9 14:05:58 EDT 2013

Is this the sort of situation that continuation barriers were made for?  
Do you have any guidance about using them?

#lang racket

(define (only-once thunk)
   (dynamic-wind
    (λ () (displayln "pre-thunk"))
    (λ () (call-with-continuation-barrier thunk))
    (λ () (displayln "post-thunk"))))

(only-once (λ () (displayln "hi")))

(let ([saved-k #f])
   (only-once (λ () (let/cc k (set! saved-k k)
                      (displayln "saving continuation"))))
   (displayln "invoking continuation...")
   (saved-k 11))

Thanks,
Dave
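
For reference, here is a minimal sketch of how the suggestions quoted below could be combined: a bounded pipe, a separate gunzip thread, and a dynamic-wind that uses only the post-thunk. The #:buffer-size keyword, its 4096 default, and the local names source and worker are assumptions made for illustration; none of them come from the thread. Treat it as one possible arrangement under those assumptions, not as the settled version of with-gunzip.

```racket
#lang racket
(require file/gunzip)

;; Sketch only: stream-decompress (current-input-port) and run thunk
;; with current-input-port rebound to the decompressed data.
;; ASSUMPTION: #:buffer-size and its default are illustrative choices.
(define (with-gunzip thunk #:buffer-size [buffer-size 4096])
  (define source (current-input-port))
  ;; A bounded pipe makes the gunzip thread block once the buffer is
  ;; full, so the whole file is never inflated at once.
  (define-values (pipe-from pipe-to) (make-pipe buffer-size))
  (define worker
    (thread
     (λ ()
       (dynamic-wind
        void
        (λ () (gunzip-through-ports source pipe-to))
        ;; Close the write end even if gunzip fails, so the reader
        ;; sees eof instead of hanging. The failure itself is only
        ;; reported by the worker thread, not re-raised to the caller.
        (λ () (close-output-port pipe-to))))))
  (dynamic-wind
   void
   (λ ()
     (parameterize ([current-input-port pipe-from])
       (thunk)))
   (λ ()
     ;; Post-thunk only: stop the worker and close both pipe ends.
     ;; Closing an already-closed port has no effect.
     (kill-thread worker)
     (close-output-port pipe-to)
     (close-input-port pipe-from))))
```

It would be called with the compressed data on the current input port, for example (with-input-from-file "test.rkt.gz" (λ () (with-gunzip (λ () (for ([line (in-lines)]) (displayln line)))))). Re-entering the body through a saved continuation is not handled here; the pipe is already closed by then.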

On 08/08/2013 12:10 PM, Robby Findler wrote:
> As to the interactions with dynamic-wind and continuations: I did, as
> you figured out, intend for you to use only the post-thunk in the
> dynamic-wind (to close the pipes). In principle, you could use the
> pre-thunk to try to restore the pipe, but there really isn't enough
> information to do this correctly in all cases.
>
> I see that you've protected things a little bit by using the 
> port-closed? predicate as a guard, but if you did that to protect 
> against possible continuation re-entry, then probably you're better 
> off adding something to the pre-thunk that explicitly raises an error 
> saying that it isn't allowed to re-enter. Something like this:
>
> #lang racket
>
> (define (only-once thunk)
>   (define already-in-once? #f)
>   (dynamic-wind
>    (λ ()
>      (when already-in-once? (error 'only-once "no no")))
>    (λ ()
>      (set! already-in-once? #t)
>      (thunk))
>    void))
>
> (only-once (λ () "hi"))
>
> (let ([saved-k #f])
>   (only-once (λ () (let/cc k (set! saved-k k))))
>   (saved-k 11))
>
>
>
>
> On Tue, Aug 6, 2013 at 11:16 AM, JP Verkamp <racket at jverkamp.com> wrote:
>
>     I've never actually used dynamic-wind, although it does look
>     interesting / like what I need. A few questions / caveats though:
>
>     - Should the pipe be created in the pre-thunk or before the
>     dynamic-wind entirely? The thunks don't seem to share scope, so
>     I'm guessing the latter, but that seems a bit odd. I'm guessing
>     the pre-thunk is for an entirely different use case, though:
>     when you are actually closing and reopening resources or
>     the like as control gets passed around.
>
>     - Doesn't dynamic-wind break if the user messes with continuations
>     during the value-thunk? So far as I understand, when control
>     passes out, post-thunk is called and then pre-thunk on the way
>     back in, but that means that when control returns the port will be
>     closed. I don't know how often this will come up, but it seems to
>     break if I nest a thread inside of the with-gzip call. Granted, my
>     version did as well because of the close-input-port call. Is this
>     just expected behavior?
>
>     (And yes, it works fine in the more likely / sensible case of
>     wrapping the entire with-gzip in a thread in both cases.)
>
>     - As for error rather than raise: raise was my original guess,
>     but that added another layer of indirection to the stack trace,
>     which I didn't notice at first (I thought I wasn't even catching
>     the error). It makes sense to have that in the long run, though.
>
>     That all being said, how does this version look?
>
>     (define (with-gunzip thunk)
>       (define-values (pipe-from pipe-to) (make-pipe))
>       (dynamic-wind
>        void
>        (λ ()
>          (gunzip-through-ports (current-input-port) pipe-to)
>          (close-output-port pipe-to)
>          (parameterize ([current-input-port pipe-from])
>            (thunk)))
>        (λ ()
>          (unless (port-closed? pipe-to) (close-output-port pipe-to))
>          (unless (port-closed? pipe-from) (close-input-port pipe-from)))))
>
>
>     On Tue, Aug 6, 2013 at 11:47 AM, Robby Findler
>     <robby at eecs.northwestern.edu> wrote:
>
>         You might consider using dynamic-wind instead of that
>         with-handlers. Or, instead of (error 'with-gunzip ...) just do
>         (raise exn). That way you won't lose the stack information in
>         the original exception (which is likely the one a user would
>         want).
>
>         Robby
>
>
>         On Tue, Aug 6, 2013 at 10:40 AM, JP Verkamp
>         <racket at jverkamp.com> wrote:
>
>             Figured it out and cleaned it up. It turns out that I was
>             using with-handlers oddly, but after reading further through
>             the documentation, it works as expected. Here's a new version
>             (generalized to any input-port):
>
>             (define (with-gunzip thunk)
>               (define-values (pipe-from pipe-to) (make-pipe))
>               (with-handlers ([exn:fail?
>                                (λ (err)
>                                  (close-output-port pipe-to)
>                                  (close-input-port pipe-from)
>                                  (error 'with-gunzip (exn-message err)))])
>                 (gunzip-through-ports (current-input-port) pipe-to)
>                 (close-output-port pipe-to)
>                 (parameterize ([current-input-port pipe-from])
>                   (thunk))
>                 (close-input-port pipe-from)))
>
>             If anyone's interested in a more in depth write up /
>             source code for this and with-gzip:
>             - writeup:
>             http://blog.jverkamp.com/2013/08/06/adventures-in-racket-gzip/
>             - source:
>             https://github.com/jpverkamp/small-projects/tree/master/blog/with-gzip.rkt
>
>
>             On Mon, Aug 5, 2013 at 5:36 PM, JP Verkamp
>             <racket at jverkamp.com> wrote:
>
>                 Thanks! make-pipe isn't something that I've had to use
>                 otherwise, so I missed the optional parameter. That
>                 does certainly seem to help.
>
>                 Here's my first take of with-input-from-gzipped-file:
>
>                 (define (with-input-from-gzipped-file filename thunk
>                                                       #:buffer-size [buffer-size #f])
>                   (call-with-input-file filename
>                     (lambda (file-from)
>                       (define-values (pipe-from pipe-to) (make-pipe buffer-size))
>                       (thread
>                        (λ ()
>                          (gunzip-through-ports file-from pipe-to)
>                          (close-output-port pipe-to)))
>                       (current-input-port pipe-from)
>                       (thunk)
>                       (close-input-port pipe-from))))
>
>                 The main thing missing is that there's no error
>                 handling (where the pipe should still be closed). At
>                 the very least, if I try to call this on a non-gzipped
>                 file, it breaks on the gunzip-through-ports line.
>                 Theoretically, some variation of with-handlers should
>                 work (error should raise an exn:fail?, yes?), but it
>                 doesn't seem to be helping.
>
>                 Any help with that?
>
>                 Alternatively, I've now found this:
>                 http://planet.racket-lang.org/display.ss?package=gzip.plt&owner=soegaard
>
>                 It seems to do exactly what I need, albeit without the
>                 call-with-* forms, but that's easy enough to wrap.
>                 With some very basic testing, it does seem to be
>                 buffering though, although it is a bit slower than the
>                 above. Not enough to cause trouble though.
>
>
>                 On Mon, Aug 5, 2013 at 4:51 PM, Ryan Culpepper
>                 <ryanc at ccs.neu.edu> wrote:
>
>                     On 08/05/2013 04:29 PM, JP Verkamp wrote:
>
>                         Is there a nice / idiomatic way to work with
>                         gzipped data in a streaming manner (to avoid
>                         loading the rather large files into memory at
>                         once)? So far as I can tell, my code isn't
>                         doing that. It hangs for a while on the call to
>                         gunzip-through-ports, long enough to uncompress
>                         the entire file, then reads are pretty quick
>                         afterwards.
>
>                         Here's what I have thus far:
>
>                         #lang racket
>
>                         (require file/gunzip)
>
>                         (define-values (pipe-from pipe-to) (make-pipe))
>                         (with-input-from-file "test.rkt.gz"
>                           (lambda ()
>                             (gunzip-through-ports (current-input-port) pipe-to)
>                             (for ([line (in-lines pipe-from)])
>                               (displayln line))))
>
>
>                     You should probably 1) limit the size of the pipe
>                     (to stop it from inflating the whole file at once)
>                     and 2) put the gunzip-through-ports call in a
>                     separate thread. The gunzip thread will block when
>                     the pipe is full; when your program reads some
>                     data out of the pipe, the gunzip thread will be
>                     able to make some more progress. Something like this:
>
>                     (define-values (pipe-from pipe-to) (make-pipe 4000))
>                     (with-input-from-file "test.rkt.gz"
>                       (lambda ()
>                         (thread
>                          (lambda ()
>                            (gunzip-through-ports (current-input-port) pipe-to)
>                            (close-output-port pipe-to)))
>                         (for ([line (in-lines pipe-from)])
>                           (displayln line))))
>
>                         As an additional problem, that code doesn't
>                         actually work. in-lines seems to be waiting for
>                         an eof-object? that gunzip-through-ports isn't
>                         sending. Am I missing something? It ends up
>                         just hanging after reading and printing the file.
>
>
>                     The docs don't say anything about closing the
>                     port, so you'll probably have to do that yourself.
>                     In the code above, I added a call to
>                     close-output-port.
>
>                     Ryan
>
>
>
>
>             ____________________
>               Racket Users list:
>             http://lists.racket-lang.org/users

Posted on the users mailing list.