[plt-scheme] Unifying xexpr, SXML, and SHTML

From: Neil Van Dyke (neil at neilvandyke.org)
Date: Tue Mar 10 04:19:40 EDT 2009

Any thoughts on this?

My Scheme libraries for HTML and XML right now use a variation on SXML.

I'm thinking of moving all my tools to a variation on "xexpr" that is 
also (or can be) a valid subset of SXML.

This would let my libraries work better with PLT libraries that use 
"xexpr" and "xml", while retaining compatibility with the various SXML 
tools by Oleg Kiselyov and Kirill Lisovsky.

It would also officially excuse me from having to support the arbitrary 
list nesting of SXML, which I've found hides programming errors by 
library users.  (The drawback to eliminating this arbitrary nesting is 
that constructing some "xexpr" values can require splicing large lists, 
which would not be necessary in SXML.)
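
To illustrate the trade-off, here is a minimal sketch in Racket.  The 
names "make-items", "node-list?", "flatten-content", and 
"flatten-element" are made up for this example and do not come from any 
existing library.  It assumes SXML-style input in which attribute lists 
are always marked with "@", so that a list not starting with a symbol 
can only be a nested node list.

```racket
#lang racket/base

;; Hypothetical helper: returns a list of nodes, not a single element.
(define (make-items names)
  (map (lambda (n) (list 'li n)) names))

;; SXML style: the helper's result is dropped in as-is, giving
;; (ul ((li "a") (li "b"))), with one extra level of nesting.
(define nested
  (list 'ul (make-items '("a" "b"))))

;; Flat style: the result has to be spliced into the content, giving
;; (ul (li "a") (li "b")).
(define spliced
  (cons 'ul (make-items '("a" "b"))))

;; Assumption: a list that is empty or does not start with a symbol
;; is a nested node list rather than an element.
(define (node-list? x)
  (and (list? x)
       (or (null? x) (not (symbol? (car x))))))

;; Flatten nested node lists found in a content list, at any depth.
(define (flatten-content items)
  (cond ((null? items) '())
        ((node-list? (car items))
         (append (flatten-content (car items))
                 (flatten-content (cdr items))))
        ((pair? (car items))
         ;; An element: keep its tag and flatten its content too.
         (cons (flatten-element (car items))
               (flatten-content (cdr items))))
        (else
         (cons (car items) (flatten-content (cdr items))))))

(define (flatten-element el)
  (cons (car el) (flatten-content (cdr el))))
```

With these definitions, (flatten-element nested) is equal? to spliced.  
Note that the same flattening rule would misread an unmarked "xexpr" 
attribute list such as ((href "x")) as a node list, which is one reason 
the two conventions are hard to support at the same time.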

For portability and human readability, this variation on "xexpr" would 
be expressed entirely in pairs, symbols, strings, and numbers, *not*

Just for the sake of discussion, let's call this variation "xexpr2".

Here are the proposed differences between "xexpr2" and "xexpr":

* An "@" symbol as the first element of an attributes list is permitted 
but not required.  This is for SXML compatibility.

* CDATA is represented as a list with the symbol "*CDATA*" as the first 
element, followed by one or more strings, which, concatenated, are the 
exact CDATA content, *not* including the XML bracketing syntax.  Some 
tools may support an "xexpr" "struct"-like "cdata" representation as an 
alternative, but must accept this list form.

* Comments are represented the same way as CDATA, except that the symbol 
is "*COMMENT*".

* XML processing instructions have a specialized form, with the first 
element being the symbol "*PI*".

* DOCTYPE has a specialized list form, with the first element being the 
symbol "*DOCTYPE*".
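
As a rough sketch of how a tool might read the first three forms above, 
the following Racket code accepts attribute lists with or without the 
leading "@" and extracts the text of the CDATA and comment forms.  The 
procedure names ("attr-pairs?", "element-attributes", "tagged?", 
"node-text") are hypothetical.  The "*PI*" and "*DOCTYPE*" forms are left 
out because their internal layout is not specified here.

```racket
#lang racket/base

;; True for a list of attributes such as ((href "x") (id "y")).
;; The empty list counts as an empty attribute list.
(define (attr-pairs? x)
  (and (list? x)
       (andmap (lambda (a) (and (pair? a) (symbol? (car a))))
               x)))

;; Attributes of an element, with or without the leading "@".
(define (element-attributes el)
  (let ((rest (cdr el)))
    (cond ((null? rest) '())
          ;; SXML style: (tag (@ (name value) ...) content ...)
          ((and (pair? (car rest)) (eq? (caar rest) '@))
           (cdar rest))
          ;; "xexpr" style: (tag ((name value) ...) content ...)
          ((attr-pairs? (car rest))
           (car rest))
          (else '()))))

;; True if x is a list whose first element is the symbol tag.
(define (tagged? x tag)
  (and (pair? x) (eq? (car x) tag)))

;; Exact text of a (*CDATA* "str" ...) or (*COMMENT* "str" ...) node:
;; the strings concatenated, without any XML bracketing syntax.
(define (node-text node)
  (apply string-append (cdr node)))
```

For example, (element-attributes '(a (@ (href "x")) "t")) and 
(element-attributes '(a ((href "x")) "t")) both return ((href "x")), and 
(node-text '(*CDATA* "a < " "b")) returns "a < b".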

There would also be a variation on "xexpr2", which would permit other 
object types to appear in content and attribute values.  The main 
purpose of this is to permit absolute URI objects to be embedded in 
"xexpr2" for last-minute reverse-resolution by HTML writers, for 
robustness in complicated apps:
http://www.neilvandyke.org/weblog/2005/01/#2005-01-31
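
A minimal sketch of what such last-minute handling could look like in a 
writer follows.  Everything here is an assumption made for illustration: 
the "abs-uri" struct stands in for whatever URI object a real library 
would use, and "relativize" is a naive prefix-stripping rule, not proper 
RFC 3986 reverse resolution.

```racket
#lang racket/base

;; Hypothetical stand-in for an absolute-URI object.
(struct abs-uri (text))

;; Naive reverse resolution: if uri-string starts with base, emit only
;; the remainder; otherwise keep the absolute form.
(define (relativize uri-string base)
  (let ((n (string-length base)))
    (if (and (> (string-length uri-string) n)
             (string=? (substring uri-string 0 n) base))
        (substring uri-string n)
        uri-string)))

;; Turn one attribute value into the string a writer would emit.
;; Strings and numbers pass through; URI objects are resolved only now,
;; when the base of the page being written is known.
(define (attr-value->string v base)
  (cond ((string? v) v)
        ((number? v) (number->string v))
        ((abs-uri? v) (relativize (abs-uri-text v) base))
        (else (error 'attr-value->string "unsupported value: ~e" v))))
```

With base "http://example.org/a/", the value 
(abs-uri "http://example.org/a/b.html") is written as "b.html", while a 
URI on another host is written unchanged.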

Source location information might be embedded too, as "struct"-like 
objects that can be dropped into the list syntax as markers and normally 
ignored by processing.  I haven't decided on this, but it's something 
that will be needed sometimes, and we might want to have it in "xexpr2" 
rather than having a separate representation like PLT's "xml".
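
To make the idea of droppable markers concrete, here is a small Racket 
sketch.  The "loc-marker" struct, its fields, and "strip-markers" are 
hypothetical names chosen for this example only, and nothing about the 
eventual design is implied.

```racket
#lang racket/base

;; Hypothetical marker object; the field names are illustrative only.
(struct loc-marker (line column))

;; Return a copy of x with every marker removed, at any depth, so that
;; tools that do not care about source locations see plain lists.
(define (strip-markers x)
  (cond ((pair? x)
         (if (loc-marker? (car x))
             (strip-markers (cdr x))
             (cons (strip-markers (car x))
                   (strip-markers (cdr x)))))
        (else x)))
```

For example, stripping (list 'p (loc-marker 3 0) "hi") yields (p "hi").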

Any thoughts on this appreciated.

Thanks,
Neil

-- 
http://www.neilvandyke.org/


Posted on the users mailing list.