[plt-scheme] XML ease of use

From: Jay McCarthy (jay.mccarthy at gmail.com)
Date: Tue Aug 4 12:10:16 EDT 2009

Hi Eddie,

On Tue, Aug 4, 2009 at 9:51 AM, Eddie Sullivan <eddieSull at hotmail.com> wrote:
> Hello.
> I've lately been having all kinds of frustrations making things work with
> the XML library. The biggest hurdle seems to be keeping track of all the
> types. We have "document" and "element" and "content" and "prolog" and
> "entity" and "pcdata" and so on and so forth. Every function seems to accept
> a different type.

The contracts in the XML library are very strict and specific to
ensure that only correct XML (modulo the cdata escape valve) is
generated. Each function accepts a different type because that is what
it must accept. Perhaps what you'd like is some stratum of
write/read-xml and xml->xexpr variants that deal only with the
subtypes of document and element. These are implemented in the collect
and could conceivably be exposed. Is that what you mean?

> I think it would be helpful if there were some kind of abstract super-type
> for at least some of these things. Something similar to what the "content/c"
> contract checks for would be ideal, so that all those items could be treated
> the same until we wanted to differentiate them. Except I would think that
> "pcdata" and the like would be subtypes of "element."

It sounds like you are saying that you'd like dispatching inside of
the XML collect rather than selection outside of the XML collect. This
is an imaginable supplement and I would consider including it in the
collect, but I don't think the core library should change in that
error-prone direction. With the current approach, giving the wrong
type is an error; under that regime you would instead get different
behavior, because a different dispatch would be invoked.
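
To illustrate "selection outside of the XML collect", here is a minimal
sketch in which the caller tests the struct type explicitly. It assumes a
current Racket where the library is loaded with (require xml); the names
describe-content, elem, and descriptions are hypothetical and not part of
the collect.

```racket
#lang racket/base
(require xml)

;; Hypothetical helper, NOT part of the XML collect: the caller selects
;; a behavior by testing each struct type with the library's predicates.
(define (describe-content c)
  (cond
    [(element? c) (format "element <~a>" (element-name c))]
    [(pcdata? c)  (format "text ~s" (pcdata-string c))]
    [(entity? c)  (format "entity ~a" (entity-text c))]
    [else         "other content"]))

;; Example element built from an X-expression: one string, one nested
;; element, and one symbolic entity.
(define elem (xexpr->xml '(p "Hello, " (b "world") nbsp)))

;; element-content returns a list of content items, so map over it.
(define descriptions (map describe-content (element-content elem)))
```

Because the selection happens in user code, a value that matches none of
the expected types falls through to a branch the caller wrote on purpose,
while the library functions themselves keep rejecting wrong types.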

> One other thing that bites me every time is that the "content" field of the
> "element" structure is actually specified by the contract (listof
> content/c). Either the name is wrong or the contract is.

Unfortunately 'content' can be plural or singular in English. The
element has content, and that content is made up of a list of content
items. A content item's contract is content/c and the element contains
a number of these. Perhaps if it were easy to change from content to
contents and not break too much code, I'd consider a change.

> I don't know if self-referential contracts are possible, but it would be
> nice if content/c was really something like (or/c
> (current-definition-of-content/c) (listof content/c)). That's only one
> possible solution, though.

That would be incorrect in other cases, such as write-xml/content and
xexpr->xml.

> The reason I don't like getting this (listof content/c) is that there is not
> much useful I can do with it. Since it doesn't satisfy content/c, I can't
> pass it to xml->xexpr or write-xml/content. Since it's not a document, I
> can't pass it to write-xml. I suppose I could do (apply string-append (map
> write-xml thing)), but that seems more involved than it has to be. If I
> could treat (listof content/c) the same as content/c that would go a long
> way towards making things easier.

You could write:

(define (write-xml/contents cs)
  (for-each write-xml/content cs))

and use the same with-output-to-string that you would normally.

This is the sort of thing for which a patch might be reasonable.
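
A minimal sketch of how the write-xml/contents helper above might be
combined with with-output-to-string. It assumes a current Racket, where
with-output-to-string comes from racket/port (in PLT Scheme 4.x the
corresponding module would be scheme/port); elem and out are hypothetical
names used only for illustration.

```racket
#lang racket/base
(require xml racket/port)

;; Write each content item of the list, in order, to the current
;; output port.
(define (write-xml/contents cs)
  (for-each write-xml/content cs))

;; Example element built from an X-expression.
(define elem (xexpr->xml '(p "Hello, " (b "world") "!")))

;; element-content yields a (listof content/c); capture the serialized
;; form of those children as a single string.
(define out
  (with-output-to-string
    (lambda () (write-xml/contents (element-content elem)))))
```

Here out holds the serialized children of the p element without the
enclosing p tags.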

> A form like "xml-match" would be very helpful, too, to avoid having to delve
> into the internals of the different xml structures.

What would this do? Would it be an XQuery?

> I think PLT Scheme's XML capabilities are very close to being the best thing
> out there, so I hope you take these as constructive suggestions.
> Thanks!
> -Eddie Sullivan

Thank _you_.

Jay

-- 
Jay McCarthy <jay at cs.byu.edu>
Assistant Professor / Brigham Young University
http://teammccarthy.org/jay

"The glory of God is Intelligence" - D&C 93


Posted on the users mailing list.