[plt-scheme] XML ease of use

From: Jay McCarthy (jay.mccarthy at gmail.com)
Date: Tue Aug 4 12:34:34 EDT 2009

On Tue, Aug 4, 2009 at 10:22 AM, Eddie Sullivan<eddieSull at hotmail.com> wrote:
> Thanks for the reply, Jay.
> I guess your point is that since XML is strict about types, the library must
> be as well. That does make sense.
>
> I think I'll try Robby's suggested approach of converting xml to xexpr,
> processing, then converting back to xml. Is there any potential data loss I
> need to worry about with that approach?

The xexpr syntax drops the prolog, the misc entries, and the top-level
p-is (processing instructions). Those all sit outside the root element,
though, so you can recover them fairly easily.
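
A minimal sketch of that round trip, assuming the document came from
read-xml: only the root element goes through the xexpr conversion, while
the original prolog and misc are carried over untouched.
transform-document and the sample input are hypothetical names for
illustration, not part of the xml collect.

```scheme
#lang scheme
;; (#lang racket should work the same way in current Racket.)
(require xml)

;; Hypothetical helper: apply f to the root element's xexpr, then
;; rebuild the document around the result. The prolog and misc fields
;; never pass through the xexpr form, so they are kept as they were.
(define (transform-document doc f)
  (make-document (document-prolog doc)
                 (xexpr->xml (f (xml->xexpr (document-element doc))))
                 (document-misc doc)))

;; Sample input, assumed for illustration only.
(define doc
  (read-xml (open-input-string "<?xml version=\"1.0\"?><a>hi</a>")))

;; Example transformation: replace the root element's attribute list.
(define (mark-processed x)
  (list* (car x) '((processed "yes")) (cddr x)))

(write-xml (transform-document doc mark-processed))
```

Passing values as f gives a plain round trip, which is a quick way to
check what survives the conversion for a particular document.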

Jay

>
> (BTW, I hope the tone of my original email wasn't too grumpy. It's still
> early in the morning here.)
>
> -Eddie Sullivan
>
> ----- Original Message ----- From: "Jay McCarthy" <jay.mccarthy at gmail.com>
> To: "Eddie Sullivan" <eddieSull at hotmail.com>
> Cc: <plt-scheme at list.cs.brown.edu>
> Sent: Tuesday, August 04, 2009 9:10 AM
> Subject: Re: [plt-scheme] XML ease of use
>
>
>> Hi Eddie,
>>
>> On Tue, Aug 4, 2009 at 9:51 AM, Eddie Sullivan<eddieSull at hotmail.com>
>> wrote:
>>>
>>> Hello.
>>> I've lately been having all kinds of frustrations making things work with
>>> the XML library. The biggest hurdle seems to be keeping track of all the
>>> types. We have "document" and "element" and "content" and "prolog" and
>>> "entity" and "pcdata" and so on and so forth. Every function seems to
>>> accept a different type.
>>
>> The contracts in the XML library are very strict and specific to
>> ensure that only correct XML (modulo the cdata escape valve) is
>> generated. Each function accepts a different type because that is what
>> they must accept. Perhaps what you'd like is some stratum of
>> write/read-xml and xml->xexpr variants that deal with only the
>> subtypes of document and element. These are implemented in the collect
>> and could conceivably be exposed. Is that what you mean?
>>
>>> I think it would be helpful if there were some kind of abstract super-type
>>> for at least some of these things. Something similar to what the
>>> "content/c" contract checks for would be ideal, so that all those items
>>> could be treated the same until we wanted to differentiate them. Except I
>>> would think that "pcdata" and the like would be subtypes of "element."
>>
>> It sounds like you are saying that you'd like dispatching inside of
>> the XML collect rather than selection outside of the XML collect. This
>> is an imaginable supplement and I would consider including it in the
>> collect, but I don't think the core library should change in that
>> error-prone direction. With this approach, when you give the wrong
>> type it is an error; in that regime you would get different behavior
>> because a different dispatch would be invoked.
>>
>>> One other thing that bites me every time is that the "content" field of
>>> the "element" structure is actually specified by the contract (listof
>>> content/c). Either the name is wrong or the contract is.
>>
>> Unfortunately 'content' can be plural or singular in English. The
>> element has content; that content is made up of a list of content
>> items. A content item's contract is content/c and the element contains
>> a number of these. Perhaps if it were easy to change from content to
>> contents and not break too much code, I'd consider a change.
>>
>>> I don't know if self-referential contracts are possible, but it would be
>>> nice if content/c was really something like (or/c
>>> (current-definition-of-content/c) (listof content/c)). That's only one
>>> possible solution, though.
>>
>> That would be incorrect in other cases, such as write-xml/content and
>> xexpr->xml.
>>
>>> The reason I don't like getting this (listof content/c) is that there is
>>> not much useful I can do with it. Since it doesn't satisfy content/c, I
>>> can't pass it to xml->xexpr or write-xml/content. Since it's not a
>>> document, I can't pass it to write-xml. I suppose I could do
>>> (apply string-append (map write-xml thing)), but that seems more involved
>>> than it has to be. If I could treat (listof content/c) the same as
>>> content/c that would go a long way towards making things easier.
>>
>> You could write:
>>
>> (define (write-xml/contents cs)
>>   (for-each write-xml/content cs))
>>
>> and use the same with-output-to-string that you would normally.
>>
>> This is the sort of thing for which a patch might be reasonable.
>>
>>> A form like "xml-match" would be very helpful, too, to avoid having to
>>> delve into the internals of the different xml structures.
>>
>> What would this do? Would it be an XQuery?
>>
>>> I think PLT Scheme's XML capabilities are very close to being the best
>>> thing out there, so I hope you take these as constructive suggestions.
>>> Thanks!
>>> -Eddie Sullivan
>>
>> Thank _you_.
>>
>> Jay
>>
>> --
>> Jay McCarthy <jay at cs.byu.edu>
>> Assistant Professor / Brigham Young University
>> http://teammccarthy.org/jay
>>
>> "The glory of God is Intelligence" - D&C 93
>>

-- 
Jay McCarthy <jay at cs.byu.edu>
Assistant Professor / Brigham Young University
http://teammccarthy.org/jay

"The glory of God is Intelligence" - D&C 93


Posted on the users mailing list.