[plt-scheme] xml.ss and DTDs
Go Felix, go! Submit a bug report :-)
On Sep 18, 2008, at 12:02 PM, Jay McCarthy wrote:
> The xml library will print back the DTD if it is in the prolog struct
> of the document struct. However, there is no parser for DTDs, so
> read-xml will always have #f in the dtd field of the prolog struct.
> This shows up on xml/private/reader.ss:30. I imagine that the skip-dtd
> function therein might (roughly) know how to parse them.
>
> Now, as for "Is this intentional?": I'm responsible for the XML
> library because someone who was once responsible for it was also
> responsible for the web server. I've fixed maybe one bug in my tenure
> and never touched it otherwise. If you submit a bug, I will either
> update the documentation or try to write the parser.
>
> Jay
>
> On Thu, Sep 18, 2008 at 9:50 AM, Felix Klock's PLT scheme proxy
> <pltscheme at pnkfx.org> wrote:
>> PLTers-
>>
>> From the documentation for the xml.ss library:
>>
>>> "The xml library does not provides [sic] Document Type Declaration
>>> (DTD) processing, validation, expanding user-defined entities, or
>>> reading user-defined entities in attributes."
>>
>>
>> Is the phrase "DTD processing" meant to include functionality such as
>> "reading the DOCTYPE declaration given in the input file"?
>>
>> From what I can tell from the observable behavior and the source text
>> of xml.ss, the XML parsing skips over DOCTYPE declarations in the input.
>>
>> If the silent dropping of the DTD declaration is intentional, I think
>> the documentation should be clearer, and I will file a bug report
>> against the docs. If it is not intentional (or a feature waiting to be
>> implemented), then I will file a bug report against the source code.
>> Right now I cannot tell where the bug report belongs.
>>
>> -Felix
>>
>> p.s. Here's an example illustrating what I'm talking about:
>>
>> #lang scheme
>> (require (lib "xml.ss" "xml"))
>>
>> (define source-html-string #<<END
>> <!DOCTYPE html PUBLIC
>> "-//W3C//DTD XHTML 1.0 Transitional//EN"
>> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>> <html xmlns="http://www.w3.org/1999/xhtml"> </html>
>> END
>> )
>>
>> (define source-document
>>   (read-xml (open-input-string source-html-string)))
>>
>> (write-xml source-document)
>> (newline)
>> ;; prints:
>> ;; <html xmlns="http://www.w3.org/1999/xhtml"> </html>
>> ;; and so we've lost information from the input
>>
>> _________________________________________________
>> For list-related administrative tasks:
>> http://list.cs.brown.edu/mailman/listinfo/plt-scheme
>>
>
>
>
> --
> Jay McCarthy <jay at cs.byu.edu>
> Assistant Professor / Brigham Young University
> http://jay.teammccarthy.org
>
> "The glory of God is Intelligence" - D&C 93
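
Jay's description of the dtd field can be checked directly by looking at the prolog of the document that read-xml returns. The following is only a minimal sketch, not part of the original exchange. It assumes that the document and prolog structs export the accessors document-prolog and prolog-dtd, and that (require xml) loads the same library as (lib "xml.ss" "xml"). The names source-string, doc, and dtd are made up for the example.

```scheme
#lang scheme
;; Sketch only: inspect the prolog of a document returned by read-xml.
;; Assumption: (require xml) provides read-xml plus the document and
;; prolog struct accessors used below.
(require xml)

;; Same input as Felix's example, built as an ordinary string.
(define source-string
  (string-append
   "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
   "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
   "<html xmlns=\"http://www.w3.org/1999/xhtml\"> </html>"))

(define doc (read-xml (open-input-string source-string)))

;; The document struct holds a prolog struct, which has a dtd field.
(define dtd (prolog-dtd (document-prolog doc)))

;; Report which case we are in instead of assuming the answer.
(if dtd
    (printf "dtd field is set: ~s\n" dtd)
    (printf "dtd field is #f; the DOCTYPE was dropped on read\n"))
```

If the second branch runs, that matches the behavior described in the thread: write-xml has no DTD to print back because the reader never stored one.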