[racket] se-path* returning multiple strings when tag contains XML entities

From: Jay McCarthy (jay.mccarthy at gmail.com)
Date: Mon Dec 9 18:45:01 EST 2013

Hi Giacomo,

I think I would do this:

  (define (xe->string n)
    (string-append* (rest (rest n))))

  (check-equal? (map xe->string (se-path*/list '(bands) xe))
                '("Derek & the Dominos" "Nick Cave & the Bad Seeds"))

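This works because you want the children of "bands" (the name
elements), and you want to turn each one into a string by dropping its
tag and attribute list and appending the strings that remain.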
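
If you want something a little more defensive, here is a minimal
sketch of the same idea. The helper name element-text is made up for
this example and is not part of the xml library. It assumes you only
care about the string children: it skips the attribute list whether or
not it is present, and it silently drops nested elements and any
symbol or number entities.

```racket
#lang racket
(require xml xml/path)

;; Hypothetical helper, not part of the xml library: join the string
;; children of one element xexpr. The attribute list is a list, not a
;; string, so filtering on string? skips it whether or not it is there.
;; Assumption: nested elements and non-string entities can be ignored.
(define (element-text elem)
  (string-append* (filter string? (rest elem))))

(define xe
  (string->xexpr
   "<bands><name>Derek &amp; the Dominos</name><name>Nick Cave &amp; the Bad Seeds</name></bands>"))

;; Keep only element children of bands (pairs), so that stray
;; whitespace strings between tags would not reach element-text.
(define names
  (map element-text (filter pair? (se-path*/list '(bands) xe))))

names
```

The point is the same as above: ask for the name elements rather than
their contents, then join each element's strings yourself.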


On Sat, Dec 7, 2013 at 6:30 PM, Giacomo Ritucci
<giacomo.ritucci at gmail.com> wrote:
> Hi Jay,
>
> thanks for your reply.
>
> Unfortunately I can't find a way in my code to detect that, in the list
> returned by se-path*/list
>
>
>     '("Derek " "&" " the Dominos" "Nick Cave " "&" " the Bad Seeds")
>
> the first three elements should actually be treated as a single string,
> and likewise the last three.
>
> Is there a common idiom in Racket to extract a list of values from an XML
> collection, in a way that works with & and other entities?
>
> Thanks in advance.
>
>
> On Mon, Dec 2, 2013 at 9:27 PM, Jay McCarthy <jay.mccarthy at gmail.com> wrote:
>>
>> Hi Giacomo,
>>
>> First, the question is not really about se-path*/list, because if you
>> look at the xexpr you're giving it, each "name" node has three string
>> children:
>>
>> '(bands ()
>>    (name () "Derek " "&" " the Dominos")
>>    (name () "Nick Cave " "&" " the Bad Seeds"))
>>
>> And se-path*/list gives you these children all appended together. If
>> you got the name nodes themselves, then you could concatenate their
>> children.
>>
>> Second, the real question is about why parsing XML works like that.
>> If you look at this:
>>
>> (define xs
>>   "<bands><name>Derek &amp; the Dominos</name><name>Nick Cave &amp; the Bad Seeds</name></bands>")
>> (define x
>>   (read-xml/document (open-input-string xs)))
>> x
>>
>> Then you'll see that the core issue is that name doesn't have a single
>> piece of PCDATA. It has three, one of which is an entity.
>>
>> I don't consider this an error in the XML parser, but a consequence of
>> XML entities that might not be obvious: they are their own nodes in
>> the list of children of the parent node.
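
To see this concretely, here is a small sketch. The names name-xe and
kids are only illustrative, and it assumes the default setting in which
string->xexpr keeps the empty attribute list.

```racket
#lang racket
(require xml)

;; Parse a single <name> element whose text contains the entity &amp;.
(define name-xe
  (string->xexpr "<name>Derek &amp; the Dominos</name>"))

;; Drop the tag and the (empty) attribute list; the rest are children.
(define kids (rest (rest name-xe)))

;; In this thread the children are reported as three strings,
;; "Derek ", "&" and " the Dominos", rather than one.
kids

;; Joining the children gives back the text as a single string.
(string-append* kids)
```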
>>
>> Jay
>>
>>
>> On Sun, Dec 1, 2013 at 8:36 AM, Giacomo Ritucci
>> <giacomo.ritucci at gmail.com> wrote:
>> > Hi Racket Users,
>> >
>> > I'm using se-path*/list to extract values from an XML collection but
>> > I found a strange behaviour when the extracted values contain entities.
>> >
>> > For example, given the following XML:
>> >
>> > <bands>
>> >     <name>Derek &amp; the Dominos</name>
>> >     <name>Nick Cave &amp; the Bad Seeds</name>
>> > </bands>
>> >
>> > when I extract a list of band names with (se-path*/list '(name) xe) I'd
>> > expect this result:
>> >
>> >     '("Derek & the Dominos" "Nick Cave & the Bad Seeds")
>> >
>> > but what I actually receive is:
>> >
>> >     '("Derek " "&" " the Dominos" "Nick Cave " "&" " the Bad Seeds")
>> >
>> > Is this the intended behaviour? How can I overcome this and make
>> > se-path*/list return one string per tag?
>> >
>> > Here's my test code. I'm running Racket v5.3.6 on Linux x86_64, and
>> > maybe I'm overlooking something because I'm new to Racket.
>> >
>> > Thank you in advance!
>> >
>> > Best regards,
>> > Giacomo
>> >
>> > #lang racket
>> >
>> > (require xml
>> >          xml/path)
>> >
>> > (define xe
>> >   (string->xexpr
>> >    "<bands><name>Derek &amp; the Dominos</name><name>Nick Cave &amp; the Bad Seeds</name></bands>"))
>> >
>> > (module+ test
>> >   (require rackunit)
>> >
>> >   ;; what I get
>> >   (check-equal? (se-path*/list '(name) xe)
>> >                 '("Derek " "&" " the Dominos"
>> >                   "Nick Cave " "&" " the Bad Seeds"))
>> >
>> >   ;; what I'd expect
>> >   (check-equal? (se-path*/list '(name) xe)
>> >                 '("Derek & the Dominos" "Nick Cave & the Bad Seeds")))
>> >
>> >
>>
>>
>>
>> --
>> Jay McCarthy <jay at cs.byu.edu>
>> Assistant Professor / Brigham Young University
>> http://faculty.cs.byu.edu/~jay
>>
>> "The glory of God is Intelligence" - D&C 93
>
>




Posted on the users mailing list.