[racket] mime/multipart parsing

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sat Jan 7 23:13:34 EST 2012

Looking at this again, I see that `net/mime' expects a complete message
--- header and body, but no extra prefix --- so there shouldn't be a
"--9nbsYRvJBLR..." line before the "Content-Type" header. By itself,
that line is essentially being ignored as an ill-formed header element.
Adding a blank line at the start of your input makes the header empty,
instead.
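
A minimal sketch of what a complete message could look like for this input, assuming the caller simply drops the "--9nbsYRvJBLR..." line and the leading CRLF before parsing. The names `crlf-lines' and `inner-headers+body' are made up for this illustration and are not part of `net/mime'; only `mime-analyze' and the `message'/`entity' accessors come from the library.

```racket
#lang racket
(require net/mime)

;; Hypothetical helper: join lines with CRLF and add a terminating CRLF.
(define (crlf-lines . lines)
  (string-append (string-join lines "\r\n") "\r\n"))

;; Header first, then a blank line, then the multipart body.
;; No "--9nbsYRvJBLR..." prefix line and no leading CRLF.
(define message-string
  (crlf-lines
   "Content-Type: multipart/mixed; boundary=NdzDrpIQMsJKtfv9VrXmp4YwCPh"
   ""
   "--NdzDrpIQMsJKtfv9VrXmp4YwCPh"
   "X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fvp9087NYEpkzGNlaGCpPMGXBQA="
   "Location: /buckets/invoices/keys/RAQpCw8SssXlXVhiGAGYXsVmwvk"
   "Content-Type: application/json"
   "Link: </buckets/invoices>; rel='up'"
   "Etag: 1qS8Wrr2vkTBxkITOjo33K"
   "Last-Modified: Wed, 04 Jan 2012 17:12:32 GMT"
   ""
   "{ 'date': '11/02/2011' }"
   "--NdzDrpIQMsJKtfv9VrXmp4YwCPh--"))

;; Hypothetical helper: return the header fields and the body text of the
;; first inner part. Assumes the message is multipart with at least one part.
(define (inner-headers+body str)
  (define outer (mime-analyze (open-input-string str)))
  (define inner-message (first (entity-parts (message-entity outer))))
  (define out (open-output-string))
  ;; entity-body is a procedure that writes the body to an output port
  ((entity-body (message-entity inner-message)) out)
  (values (message-fields inner-message) (get-output-string out)))
```

With input shaped like this, `(inner-headers+body message-string)' should return the inner part's header lines and its body text rather than '() and "".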

Was this input perhaps extracted as a part of an enclosing multi-part
message? (Maybe not using `net/mime' for that outer message?)

SirMail, which has been my mail client, uses `net/mime', so I'm fairly
confident that the library works on real messages --- no problems in
the last decade or so.

At Sat, 7 Jan 2012 11:41:57 -0700, Jordan Schatz wrote:
> Thank you Matthew,
> 
> The message I was using did have CRLF line endings, but it had one too
> many:
> 
> ----------------------------------------------------------------------
> #lang racket
> 
> (require net/mime)
> 
> (define message-string
>   (let ([sep "\r\n"])
>     (string-append 
>      sep ;;This message starts with a \r\n <- EXTRA CRLF
>      "--9nbsYRvJBLRyuL4VOuuejw9LcAy" sep
>      "Content-Type: multipart/mixed; boundary=NdzDrpIQMsJKtfv9VrXmp4YwCPh" sep
>      sep
>      "--NdzDrpIQMsJKtfv9VrXmp4YwCPh" sep
>      "X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fvp9087NYEpkzGNlaGCpPMGXBQA=" sep
>      "Location: /buckets/invoices/keys/RAQpCw8SssXlXVhiGAGYXsVmwvk" sep
>      "Content-Type: application/json" sep
>      "Link: </buckets/invoices>; rel='up'" sep
>      "Etag: 1qS8Wrr2vkTBxkITOjo33K" sep
>      "Last-Modified: Wed, 04 Jan 2012 17:12:32 GMT" sep
>      sep
>      "{ 'date': '11/02/2011' }" sep
>      "--NdzDrpIQMsJKtfv9VrXmp4YwCPh--" sep)))
> 
> (define ip
>   (open-input-string
>    message-string))
> 
> (let* ([analyzed (mime-analyze ip)] ;; port -> #<message>
>        [our-entity (message-entity analyzed)] ;; grab #<entity> of this message
>        [parts (entity-parts our-entity)] ;; #<entity> -> list of (inner) #<message>
>        [inner-message (first parts)] ;; I only have one, grab it
>        [inner-entity (message-entity inner-message)] ;; get its #<entity> part
>        [body-proc (entity-body inner-entity)] ;; create a proc that returns the #<entity> body
>        [tmp (open-output-string)])
>   (write (message-fields inner-message)) ;; Should be a string of headers? Actual '()
>   (body-proc tmp) ;; call proc to get message body, it needs an output port
>   (write (get-output-string tmp))) ;; Should be json data? Actual ""
> ----------------------------------------------------------------------
> 
> It looks like the message I am working with doesn't conform to the RFC. But
> it also seems like a sane thing for the net/mime library to check for and
> handle? If so, I should be able to add it and send a pull request.
> 
> Shalom,
> Jordan
> 
> On Sat, Jan 07, 2012 at 06:12:22AM +0100, Matthew Flatt wrote:
> > I think the main problem is that the input string has LF newlines, and
> > it needs to have CRLF newlines. You'll also want a terminating CRLF.
> > 
> > With those changes, then `(message-fields inner-message)' instead of
> > `(entity-fields inner-entity)' will get you the headers that include
> > "X-Riak-Vclock: ...". The result of `(entity-fields inner-entity)'
> > would correspond to a further multipart document in the inner message.
> > 
> > At Fri, 6 Jan 2012 20:17:21 -0700, Jordan Schatz wrote:
> > > I'm having difficulties parsing mime multipart messages (probably I
> > > missed something in the docs again). I have this code:
> > > 
> > > ----------------------------------------------------------------------
> > > #lang racket
> > > 
> > > (require net/mime)
> > > 
> > > (define ip
> > >   (open-input-string
> > >    "--9nbsYRvJBLRyuL4VOuuejw9LcAy
> > > Content-Type: multipart/mixed; boundary=NdzDrpIQMsJKtfv9VrXmp4YwCPh
> > > 
> > > --NdzDrpIQMsJKtfv9VrXmp4YwCPh
> > > X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fvp9087NYEpkzGNlaGCpPMGXBQA=
> > > Location: /buckets/invoices/keys/RAQpCw8SssXlXVhiGAGYXsVmwvk
> > > Content-Type: application/json
> > > Link: </buckets/invoices>; rel='up'
> > > Etag: 1qS8Wrr2vkTBxkITOjo33K
> > > Last-Modified: Wed, 04 Jan 2012 17:12:32 GMT
> > > 
> > > { 'date': '11/02/2011' }
> > > --NdzDrpIQMsJKtfv9VrXmp4YwCPh--"))
> > > 
> > > (let* ([analyzed (mime-analyze ip)] ;; port -> #<message>
> > >        [our-entity (message-entity analyzed)] ;; grab #<entity> of this message
> > >        [parts (entity-parts our-entity)] ;; #<entity> -> list of (inner) #<message>
> > >        [inner-message (first parts)] ;; I only have one, grab it
> > >        [inner-entity (message-entity inner-message)] ;; get its #<entity> part
> > >        [body-proc (entity-body inner-entity)] ;; create a proc that returns the #<entity> body
> > >        [tmp (open-output-string)])
> > >   (write (entity-fields inner-entity)) ;; Should be a list of string of headers? Actual '()
> > >   (body-proc tmp) ;; call proc to get message body, it needs an output port
> > >   (write (get-output-string tmp))) ;; Should be json data? Actual ""
> > > ----------------------------------------------------------------------
> > > 
> > > > I thought it would write the headers and the message body, but instead I get
> > > > an empty list and an empty string. I've been at it for a few hours and I
> > > > don't see what is wrong.
> > > 
> > > Thanks : )
> > > Jordan
> > > ____________________
> > >   Racket Users list:
> > >   http://lists.racket-lang.org/users

