[racket] New users, XML, and xml/path

From: Jay McCarthy (jay.mccarthy at gmail.com)
Date: Tue Jun 18 20:53:03 EDT 2013

On Tue, Jun 18, 2013 at 6:56 AM, Greg Hendershott
<greghendershott at gmail.com> wrote:
> I noticed a recent IRC exchange where someone was frustrated with
> se-path*/list, and felt it didn't help enough compared to Python's
> standard lib for this.[1] They were trying to get the <row> elements
> from an API like this[2]:
>
> <?xml version='1.0' encoding='UTF-8'?>
> <eveapi version="2">
>   <currentTime>2011-09-10 14:41:29</currentTime>
>   <result>
>     <key accessMask="59760264" type="Character" expires="2011-09-11 00:00:00">
>       <rowset name="characters" key="characterID"
> columns="characterID,characterName,corporationID,corporationName">
>         <row characterID="898901870" characterName="Desmont McCallock"
> corporationID="1000009" corporationName="Caldari Provisions" />
>       </rowset>
>     </key>
>   </result>
>   <cachedUntil>2011-09-10 14:46:29</cachedUntil>
> </eveapi>
>
> Now, in the past I've dealt with XML by hand-coding, e.g. using
> `match`. Which has worked fine for me. But I'd never really noticed
> xml/path (was it added recently??) so I sat down to try it.

It was added (with documentation) on July 24th, 2011 to celebrate Pioneer Day.

> Maybe I
> should use that more. But a few things I noticed:
>
> 1. Name: Why the * in `se-path*` and `se-path*/list` names? By
> convention this made me look for the non-* versions of these, which I
> couldn't find.

The * is meant to suggest an XPath query that starts with //, or a
regular expression with a leading .*: the path can match at any depth.
There isn't a non-* version because I didn't implement one.
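
To make the "any depth" behavior concrete, here is a minimal sketch. The
x-expression and its tag names (a, b, c) are made up for illustration; only
se-path*/list comes from xml/path.

```racket
#lang racket
(require xml/path)

;; A small x-expression; the tag names are hypothetical.
(define doc
  '(a (b (c "one"))
      (c "two")))

;; '(c) matches <c> wherever it appears, much like XPath's //c.
(se-path*/list '(c) doc)   ; '("one" "two")

;; '(b c) matches a <c> that sits directly inside a <b>,
;; wherever that <b> appears.
(se-path*/list '(b c) doc) ; '("one")
```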

> 2. Whitespace: When I try `(se-path*/list '(rowset) xe)` I get
>
> '("\n        " (row ((characterID "898901870") (characterName "Desmont
> McCallock") (corporationID "1000009") (corporationName "Caldari
> Provisions"))) "\n      ")

That query asks for the content inside a <rowset> tag. If you have
whitespace, then that's part of the content. In XML, whitespace is not
optional or superfluous, so this is the correct result.

> Even with collapse-whitespace parameter set #t, I get
>
> '(" " (row ((characterID "898901870") (characterName "Desmont
> McCallock") (corporationID "1000009") (corporationName "Caldari
> Provisions"))) " ")

collapse-whitespace reduces all whitespace to a single space. I could
imagine an eliminate-whitespace parameter that would remove it
altogether, but that doesn't exist and isn't collapse-whitespace.
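
In the meantime, one way to get rid of the formatting whitespace is to filter
it out of the result yourself. This is only a sketch: whitespace-only? is a
helper invented here, not part of xml/path, and the x-expression is an
abbreviated copy of the <rowset> above.

```racket
#lang racket
(require xml/path)

;; The <rowset> from the message as an x-expression, with its
;; formatting whitespace kept and the attributes abbreviated.
(define xe
  '(rowset ((name "characters"))
     "\n        "
     (row ((characterID "898901870")))
     "\n      "))

;; Hypothetical helper: true for strings made only of whitespace.
(define (whitespace-only? v)
  (and (string? v) (regexp-match? #px"^\\s*$" v)))

;; Keep everything in the <rowset> body except whitespace-only strings.
(define rows
  (filter (negate whitespace-only?) (se-path*/list '(rowset) xe)))

rows ; '((row ((characterID "898901870"))))
```

Note that this throws away whitespace-only strings unconditionally, which is
only appropriate for elements like <rowset> where text content is never
meaningful.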

> I understand why, and what to do about it. But I'm not sure about a
> new user. Probably I would instead try the following...
>
> 3. Can't get elements with their attributes:  If I try `(se-path*/list
> '(row) xe)` I get back '(). Huh.  I guess that's because the <row>
> elements have empty bodies -- the interesting stuff is in their
> attributes. OK, but I don't see how to get the element with attributes
> conveniently, using xml/path.

The documentation says: "The prefix of symbols specifies a path of
tags from the leaves with an implicit any sequence to the root. The
final, optional keyword specifies an attribute." meaning that if you
want a single attribute, then you could do (se-path*/list '(row
#:characterID) xe). It sounds like what you want are all the
attributes of the elements inside the rowset. I would do:

(map second (filter list? (se-path*/list '(rowset) xe)))
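
Here is a self-contained sketch of both queries against a cut-down version of
the data. The x-expression below is hand-written from the XML in the original
message (not parsed from the API response), and the names ids and attrs are
just for this example.

```racket
#lang racket
(require xml/path)

;; Hand-written x-expression for the <rowset>, whitespace included.
(define xe
  '(rowset ((name "characters") (key "characterID"))
     "\n        "
     (row ((characterID "898901870")
           (characterName "Desmont McCallock")))
     "\n      "))

;; One attribute from every <row>.
(define ids (se-path*/list '(row #:characterID) xe))

;; The whole attribute list of every element inside <rowset>.
;; filter list? drops the whitespace strings; second assumes that
;; each remaining element actually carries an attribute list.
(define attrs
  (map second (filter list? (se-path*/list '(rowset) xe))))

ids   ; '("898901870")
attrs ; '(((characterID "898901870") (characterName "Desmont McCallock")))
```

An element written without attributes, such as '(row), would make second
fail, so real code would want to check for that case.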


> Although I'm not very familiar with XPath, I think there probably is a
> role for something simpler than it, provided with Racket itself, and
> xml/path is a good idea.

That's exactly the idea. I needed something simple for testing the Web
server that didn't rely on SXML & SXPath.

> Does it make sense to enhance it just a bit
> more, though?

Yup

> p.s. If the original author of xml/path can't or doesn't want to do
> this, I'd volunteer to take a shot and create a pull request. However
> I know so little about XPath, I'm not sure whether that makes me the
> best or worst person to be doing this (I'm mostly not joking).

Please feel free. I'm comfortable making changes, although the more
specific the request (e.g. test cases), the better.

Jay

> [1]: http://docs.python.org/2/library/xml.etree.elementtree.html
> [2]: http://wiki.eve-id.net/APIv2_Account_APIKeyInfo_XML
> ____________________
>   Racket Users list:
>   http://lists.racket-lang.org/users



--
Jay McCarthy <jay at cs.byu.edu>
Assistant Professor / Brigham Young University
http://faculty.cs.byu.edu/~jay

"The glory of God is Intelligence" - D&C 93

Posted on the users mailing list.