[plt-scheme] SCRIPT elements & evaporating CDATA sections [was RE: html in servlets]

From: Anton van Straaten (anton at appsolutions.com)
Date: Sun May 16 22:25:19 EDT 2004

Neil W. Van Dyke wrote:

> Anton van Straaten <anton at appsolutions.com> writes at 20:43
> 16-May-2004 -0400:
> [...]
> > This still isn't a perfect solution, since e.g. read-xml can still
> > fail on ordinary HTML with script, as opposed to XHTML.  However,
> > fixing it more thoroughly would require more intelligence in xml.ss;
>
> I don't know if this would be any help, but... on my TODO list is an
> "shtml->plt-xml" (SHTML is the new variant of SXML used by HtmlPrag to
> represent HTML).  HtmlPrag tries to parse "script" elements in the way
> that popular Web browsers do, which may or may not be what you want.  If
> anyone has a near-term need for that, let me know, and I'll bump up the
> priority.

It's not a pressing need for me.  I've done some projects with PLT servlets
serving HTML with client-side JavaScript, and was surprised I hadn't noticed
this problem before.  It turns out that's because I usually use external
script files, mainly because embedding a lot of JavaScript in Scheme source
breaks up those clean s-exp lines.

Having dug into it far enough to have developed an opinion, I thought it was
worth mentioning the point about the behavior of CDATA, since that seems
like a bit of an inconsistency in the current xml.ss implementation.

> Longer-term, I want to find an easier way to use SXML with PLT servlets
> and other PLT XML-related tools, since there's a wealth of work being
> done with SXML tools.

I agree with that.

> [...]
> > treat elements like SCRIPT specially in the HTML context.  The best
> > way to do that might be DTD-directed, which is of course not so
> > trivial.
> [...]
>
> I think the core problem is that de facto HTML bears only superficial
> resemblance to XML, and they're both surprisingly complicated beasts. :)

Agreed, although XHTML is a pretty good hybrid.  XHTML works very well with
something like PLT's xml.ss, which relies on pure XML without worrying about
DTDs.  That's why I thought "fixing" the CDATA issue might be a good
compromise, since it would support conversion from XHTML to
browser-compatible HTML, even with script code present.
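
To make the CDATA point concrete, here is a rough sketch in Python rather
than Scheme.  It does not use xml.ss and says nothing about how xml.ss is
implemented; it only shows the same kind of problem with Python's standard
xml.etree.ElementTree, where a CDATA section is folded into ordinary text on
parsing and comes back out entity-escaped.  The names to_html and
RAW_TEXT_ELEMENTS are made up for illustration, and the serializer ignores
void elements such as BR.

```python
import xml.etree.ElementTree as ET

# XHTML input: the script body is protected by a CDATA section.
XHTML = (
    "<html><head><script>"
    "<![CDATA[ if (a < b && b > 0) { go(); } ]]>"
    "</script></head></html>"
)

# Hypothetical list of elements whose text an HTML browser reads verbatim.
RAW_TEXT_ELEMENTS = {"script", "style"}


def escape_text(s):
    """Escape the characters that are special in XML/HTML character data."""
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def to_html(elem):
    """Serialize a tree as HTML, leaving script/style bodies unescaped.

    Illustrative only: void elements and namespaces are not handled.
    """
    raw = elem.tag.lower() in RAW_TEXT_ELEMENTS
    out = ["<" + elem.tag]
    for name, value in elem.attrib.items():
        out.append(' %s="%s"' % (name, escape_text(value).replace('"', "&quot;")))
    out.append(">")
    text = elem.text or ""
    out.append(text if raw else escape_text(text))
    for child in elem:
        out.append(to_html(child))
        out.append(escape_text(child.tail or ""))
    out.append("</" + elem.tag + ">")
    return "".join(out)


root = ET.fromstring(XHTML)

# Generic XML output: the CDATA wrapper is gone and the script is escaped,
# which is valid XML but not what an HTML browser expects inside SCRIPT.
generic = ET.tostring(root, encoding="unicode")

# HTML-aware output: the script body is written back as-is.
html = to_html(root)

print(generic)
print(html)
```

The second form is the "conversion from XHTML to browser-compatible HTML" I
have in mind: the parser stays pure XML, and only the output side needs to
know which elements carry raw text.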

Anton



Posted on the users mailing list.