<HTML><FONT FACE=arial,helvetica><FONT SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0">In a message dated 8/5/2002 6:58:12 AM Central Daylight Time, markj@cloaked.freeserve.co.uk writes:<BR>
<BR>
<BLOCKQUOTE TYPE=CITE style="BORDER-LEFT: #0000ff 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px; PADDING-LEFT: 5px">> but for XML, you would <BR>
> never want to have to write the pattern in terms of the underlying data<BR>
> structure. <BR>
<BR>
Never?</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"></BLOCKQUOTE><BR>
<BR>
</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0">Keep in mind that I am referring here to the underlying WebIt! data structures.<BR>
The structures are not opaque in WebIt!, so try a simple element like <BR>
(h4:p "a test para") in the REPL, and decide for yourself! I was perhaps <BR>
imprecise, since in SXML, writing a pattern in terms of the underlying data<BR>
structure is exactly what you do!<BR>
<BR>
I think that exposing the concrete data structure loses big when handling<BR>
XML namespaces, but it has benefits too. With SXML, I'm pretty sure one could, <BR>
off-the-shelf, use PLT's match. Oleg has said recently on his list that he uses<BR>
the match-case built into Bigloo.</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"><BR>
</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"><BR>
<BLOCKQUOTE TYPE=CITE style="BORDER-LEFT: #0000ff 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px; PADDING-LEFT: 5px">Handling namespaces is one of the problems, I hit upon.<BR>
<BR>
> I had originally thought to extend either PLT's match or the Indiana match<BR>
> to use WebIt!'s constructor's as pattern for matching XML, but in the end<BR>
> I liked the syntax-rules style of patterns better.<BR>
<BR>
Shame, I think.</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"></BLOCKQUOTE><BR>
<BR>
</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0">Certainly it has been pointed out to me that supporting the "dots" can be<BR>
"expensive", in a way that may be unexpected to the naive user. They work<BR>
well in macros, since the pattern matching "costs" are at expansion time anyway.<BR>
But for matching XML, these costs are at run time. The problem is basically<BR>
that it is hard (or not possible?) to avoid either constructing environment-like<BR>
structures or making a second pass over some of the source elements when<BR>
the "..." are present in a pattern and template. Granting that, I like the syntax-rules<BR>
style of matching quite a bit.<BR>
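<BR>
To make the run-time cost concrete, here is a rough sketch of a matcher for "..." patterns<BR>
in plain Scheme. It is an illustration only, not WebIt!'s code: the names match-pattern,<BR>
match-ellipsis and pattern-vars are invented, every symbol is treated as a pattern variable,<BR>
and "..." is only supported as the last item of a list pattern. Note how each repetition<BR>
produces its own small environment, which then has to be regrouped per variable.<BR>

```scheme
;; Hypothetical sketch: returns an association list of bindings, or #f on failure.
(define (pattern-vars pat)
  (cond ((eq? pat '...) '())
        ((symbol? pat) (list pat))
        ((pair? pat) (append (pattern-vars (car pat))
                             (pattern-vars (cdr pat))))
        (else '())))

(define (match-pattern pat datum)
  (cond ((symbol? pat) (list (cons pat datum)))      ; variable: bind it
        ((null? pat) (if (null? datum) '() #f))
        ((and (pair? pat) (pair? (cdr pat)) (eq? (cadr pat) '...))
         (match-ellipsis (car pat) datum))           ; (sub ...) at the tail
        ((pair? pat)
         (and (pair? datum)
              (let ((head (match-pattern (car pat) (car datum))))
                (and head
                     (let ((tail (match-pattern (cdr pat) (cdr datum))))
                       (and tail (append head tail)))))))
        (else (if (equal? pat datum) '() #f))))      ; literal string, number, ...

(define (match-ellipsis sub items)
  (let loop ((items items) (envs '()))
    (cond ((null? items)
           ;; Regroup: one list of values per variable of the sub-pattern.
           (let ((envs (reverse envs)))
             (map (lambda (v)
                    (cons v (map (lambda (env) (cdr (assq v env))) envs)))
                  (pattern-vars sub))))
          ((pair? items)
           (let ((env (match-pattern sub (car items))))  ; one env per item
             (and env (loop (cdr items) (cons env envs)))))
          (else #f))))

(match-pattern '(tag (name value) ...) '(attrs (a 1) (b 2)))
;; => ((tag . attrs) (name a b) (value 1 2))
```

In a macro this bookkeeping happens once, at expansion time; here it happens on every match.<BR>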
<BR>
At the same time, there will soon be an alternative available-- pattern<BR>
matching based on the regular expression pattern matching of XDuce.</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"><BR>
</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"><BR>
<BLOCKQUOTE TYPE=CITE style="BORDER-LEFT: #0000ff 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px; PADDING-LEFT: 5px">Just checking: "auxilliary nodes"? I get a bit lost with jargon.</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"></BLOCKQUOTE><BR>
<BR>
The SXML spec v2.1 supports an auxiliary list, intended (at least in part) to support<BR>
extensibility beyond the current XML infoset. The syntax is (@@ ...) and may be<BR>
included in *top* nodes, elements and attributes. An (@@ ...) node can include<BR>
a *namespaces* node-- this is where it now must be. But it can also include <BR>
"auxiliary nodes", which are not currently defined in the SXML spec. Oleg and/or<BR>
Kirill have suggested a variety of possible uses for these-- such as a hash table for<BR>
quick access to attributes.<BR>
</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"><BR>
<BLOCKQUOTE TYPE=CITE style="BORDER-LEFT: #0000ff 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px; PADDING-LEFT: 5px">> I could probably have used SXML as the underlying data type beneath WebIt!<BR>
> constructor's. But I really prefer working with structures instead.<BR>
<BR>
There we differ. I dislike them.</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"></BLOCKQUOTE><BR>
<BR>
Indeed, tastes differ. (And there are a few--though I think surprisingly few--technical<BR>
trade-offs between the two representations.)<BR>
</FONT><FONT COLOR="#000000" style="BACKGROUND-COLOR: #ffffff" SIZE=2 FAMILY="SANSSERIF" FACE="Arial" LANG="0"><BR>
<BLOCKQUOTE TYPE=CITE style="BORDER-LEFT: #0000ff 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px; PADDING-LEFT: 5px">> As a "surface API" one of the benefits of WebIt! is it's treatment of XML <BR>
> namespaces. [...]<BR>
<BR>
Yes, it does seem so, but I need to know a bit more at how you match across<BR>
namespaces during the transformations before I'm convinced.</BLOCKQUOTE><BR>
<BR>
One can define a variety of "similar" tags, where constructors are mapped to expanded<BR>
names as follows:<BR>
a:link ==> link<BR>
link ==> {urn:place}:link<BR>
b:link ==> {urn:another-place}:link<BR>
<BR>
The predicate a:link? will fail to match (link "some text") and (b:link "another element")-- <BR>
because both the predicates and the pattern matching system use the expanded names for<BR>
all tag comparisons.<BR>
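<BR>
As a rough model of that comparison (a sketch in plain Scheme, not WebIt!'s actual<BR>
representation), an expanded name can be pictured as a namespace/local-name pair. The helpers<BR>
make-xname, xname=?, element-constructor and element-predicate are invented for this example:<BR>

```scheme
;; Hypothetical model: an expanded name is (namespace-uri . local-name),
;; with #f standing for "no namespace".
(define (make-xname ns local) (cons ns local))

(define (xname=? a b)
  (and (equal? (car a) (car b))   ; same namespace URI, or both #f
       (eq? (cdr a) (cdr b))))    ; same local name (a symbol)

;; An element is modelled as a list: (expanded-name child ...)
(define (element-constructor xname)
  (lambda children (cons xname children)))

(define (element-predicate xname)
  (lambda (x)
    (and (pair? x) (pair? (car x)) (xname=? (car x) xname))))

(define a:link  (element-constructor (make-xname #f 'link)))
(define link    (element-constructor (make-xname "urn:place" 'link)))
(define b:link  (element-constructor (make-xname "urn:another-place" 'link)))
(define a:link? (element-predicate (make-xname #f 'link)))

(a:link? (a:link "local text"))       ; => #t
(a:link? (link "some text"))          ; => #f, namespace differs
(a:link? (b:link "another element"))  ; => #f, namespace differs
```

All three elements share the local name link; only the namespace part tells them apart.<BR>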
<BR>
Note that these constructor names are Scheme identifier names. The use of "a:" or "b:" <BR>
(or indeed the absence of a prefix) is unrelated to whether this element is locally named<BR>
or is part of a namespace.<BR>
<BR>
In WebIt! one can define these types using define-element:<BR>
(define-element (a:link #f))<BR>
(define-element (link urn:place))<BR>
(define-element (b:link urn:another-place))<BR>
<BR>
In constructing the unqualified name for the new element, any "prefix" on the constructor<BR>
name is stripped, and for each of these, the root name will be "link".<BR>
<BR>
Note the #f in the definition of a:link. It is needed because the simplest syntax, with no<BR>
namespace argument at all, actually creates a "generative type":<BR>
(define-element (c:link))<BR>
This actually generates a new element whose name is link, but which is part of a unique<BR>
namespace, at least until serialized. This allows the creation of tags which are used only<BR>
in intermediate values in a "stylesheet", while ensuring that such tags cannot clash with <BR>
the input or output of the "XML macros". At the same time, if such tags are intended to <BR>
be serialized, they are still printed as local names (in this case, as "link").<BR>
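<BR>
One way to picture a generative type (again a plain Scheme sketch under my own assumptions,<BR>
not a description of WebIt!'s internals) is to use a freshly allocated object as the namespace<BR>
part of the expanded name. The names make-generative-xname, same-xname? and serialized-name<BR>
are invented for the example:<BR>

```scheme
;; Hypothetical model: the namespace of a generative name is a fresh list,
;; which is eq? only to itself and so cannot equal any namespace URI string.
(define (make-generative-xname local)
  (cons (list 'generative) local))

(define (xname-namespace xn) (car xn))
(define (xname-local xn) (cdr xn))

(define (same-xname? a b)
  (and (eq? (xname-local a) (xname-local b))
       (let ((na (xname-namespace a))
             (nb (xname-namespace b)))
         (if (and (string? na) (string? nb))
             (string=? na nb)    ; ordinary namespace URIs
             (eq? na nb)))))     ; #f, or a generative marker

;; Serialization ignores the marker and prints only the local name.
(define (serialized-name xn)
  (symbol->string (xname-local xn)))

(define c:link-name (make-generative-xname 'link))
(define other-name  (make-generative-xname 'link))

(same-xname? c:link-name c:link-name)      ; => #t
(same-xname? c:link-name other-name)       ; => #f, distinct markers
(same-xname? c:link-name (cons #f 'link))  ; => #f, not the local "link"
(serialized-name c:link-name)              ; => "link"
```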
<BR>
Jim<BR>
</FONT></HTML>