[plt-scheme] xml library: getting DTD information
I'm having a little difficulty with the XML library, and I'm not sure
whether this is a bug in the library or in my XML file.
I'm running DrScheme 3.99.0.25-svn19may2008 on Mac OS X 10.5.2.
I have this saved as /Users/cobbe/test.xml:
<?xml version="1.1" encoding="UTF-8"?>
<!DOCTYPE keyboard SYSTEM "file://localhost/System/Library/DTDs/KeyboardLayout.dtd">
<keyboard group="126" id="-2" name="US Extended" maxout="3">
</keyboard>
I don't know if this is relevant, but
/System/Library/DTDs/KeyboardLayout.dtd exists on my machine. (Also
probably irrelevant: the XML file above doesn't match the DTD; it's
missing several required elements inside the 'keyboard' tag. But I get the
same results when I use the full file, of which the above example is a
small excerpt.)
When I read the XML document using the built-in xml library, I'm able to
see everything *except* the DOCTYPE:
> (define p (open-input-file "/Users/cobbe/test.xml"))
> (require xml)
> (define doc (read-xml p))
> (close-input-port p)
> (document-misc doc)
()
This is slightly odd; according to the docs, `document-misc' is supposed to
return a comment or a pcdata, not a list. I don't know if this is relevant,
though.
> (define prolog (document-prolog doc))
> (prolog-misc prolog)
(#<pi>)
> (p-i-target-name (car (prolog-misc prolog)))
xml
> (p-i-instruction (car (prolog-misc prolog)))
"version=\"1.1\" encoding=\"UTF-8\""
The next result is the surprising bit; I'd expect to get some representation
of the DOCTYPE line here:
> (prolog-dtd prolog)
#f
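To make concrete what I was expecting, here is a minimal sketch as a standalone module. It assumes that the struct accessors `document-type-name', `document-type-external', and `external-dtd-system' are exported by the xml library as the docs describe. `dtd-summary' and `sample' are names I made up for this example, and it reads from a string port rather than my test file so that it is self-contained.

```scheme
#lang scheme
(require xml)

;; Hypothetical helper (not part of the xml library): reads a document
;; from the input port `in' and returns a string describing its DOCTYPE,
;; or #f if the prolog carries no DTD information.
(define (dtd-summary in)
  (let* ([doc (read-xml in)]
         [dtd (prolog-dtd (document-prolog doc))])
    (and dtd
         (format "~a ~s"
                 (document-type-name dtd)
                 ;; Assumes the external part is an external-dtd struct
                 ;; whose `system' field holds the SYSTEM identifier.
                 (external-dtd-system (document-type-external dtd))))))

;; Same content as test.xml, inlined so the example needs no file on disk.
(define sample
  (string-append
   "<?xml version=\"1.1\" encoding=\"UTF-8\"?>\n"
   "<!DOCTYPE keyboard SYSTEM "
   "\"file://localhost/System/Library/DTDs/KeyboardLayout.dtd\">\n"
   "<keyboard group=\"126\" id=\"-2\" name=\"US Extended\" maxout=\"3\">\n"
   "</keyboard>\n"))

(printf "~s\n" (dtd-summary (open-input-string sample)))
```

I was hoping for a string naming `keyboard' and the SYSTEM URI; given the `#f' above, I'd expect this to print #f on my setup instead.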
I'm not familiar enough with XML to know what could appear in the misc2
slot, so I don't know if this is the Right Thing:
> (prolog-misc2 prolog)
()
And the rest is what I'd expect:
> (define elt (document-element doc))
> (element-name elt)
keyboard
> (element-attributes elt)
(#<attribute> #<attribute> #<attribute> #<attribute>)
> (element-content elt)
(#<pcdata>)
> (pcdata-string (car (element-content elt)))
"\n"
> (map (lambda (a) (list (attribute-name a) (attribute-value a)))
       (element-attributes elt))
((group "126") (id "-2") (maxout "3") (name "US Extended"))
Now, I'm no XML expert, so I'm quite prepared to believe that the XML file
is ill-formed. I did check the W3C's XML tutorial, though, and the DOCTYPE
declaration does appear to match the examples they give, so it looks good
to me.
I also tried changing the DOCTYPE line to use just an absolute pathname rather
than a URI, and then just the filename "KeyboardLayout.dtd" (which I copied
into the same directory as the XML file), but neither change made any difference.
Is this a bug in the XML library, or did I miss something?
Thanks,
Richard