[plt-scheme] xml.ss and DTDs

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Thu Sep 18 12:09:12 EDT 2008

Go Felix go! Submit a bug report :-)

On Sep 18, 2008, at 12:02 PM, Jay McCarthy wrote:

> The xml library will print back the DTD if it is in the prolog struct
> of the document struct. However, there is no parser for DTDs, so
> read-xml will always have #f in the dtd field of the prolog struct.
> This shows up on xml/private/reader.ss:30. I imagine that the skip-dtd
> function therein might (roughly) know how to parse them.
>
> Now, as far as "Is this intentional?" I'm responsible for the XML
> library, because someone who was responsible for it once was also
> responsible for the web server. I've fixed maybe one bug in my tenure
> and never touched it otherwise. If you submit a bug, I will either
> update the documentation or try to write the parser.
>
> Jay
>
> On Thu, Sep 18, 2008 at 9:50 AM, Felix Klock's PLT scheme proxy
> <pltscheme at pnkfx.org> wrote:
>> PLTers-
>>
>> From the documentation for the xml.ss library:
>>
>>> "The xml library does not provides [sic] Document Type Declaration (DTD)
>>> processing, validation, expanding user-defined entities, or reading
>>> user-defined entities in attributes."
>>
>>
>> Is the phrase "DTD processing" meant to include functionality such as
>> "reading the DOCTYPE declaration given in the input file"?
>>
>> From what I can tell from the observable behavior and the source text of
>> xml.ss, the XML parsing skips over DOCTYPE declarations in the input.
>>
>> If the silent dropping of the DTD declaration is intentional, I think the
>> documentation should be clearer, and I will file a bug report against the
>> docs.  If it is not intentional (or a feature waiting to be implemented),
>> then I will file a bug report against the source code.  Right now I cannot
>> tell where the bug report belongs.
>>
>> -Felix
>>
>> p.s. Here's an example illustrating what I'm talking about:
>>
>> #lang scheme
>> (require (lib "xml.ss" "xml"))
>>
>> (define source-html-string #<<END
>> <!DOCTYPE html PUBLIC
>>  "-//W3C//DTD XHTML 1.0 Transitional//EN"
>>  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>> <html xmlns="http://www.w3.org/1999/xhtml">         </html>
>> END
>>  )
>>
>> (define source-document (read-xml (open-input-string source-html-string)))
>>
>> (write-xml source-document)
>> (newline)
>> ;; prints:
>> ;; <html xmlns="http://www.w3.org/1999/xhtml">         </html>
>> ;; and so we've lost information from the input
>>
>> _________________________________________________
>>  For list-related administrative tasks:
>>  http://list.cs.brown.edu/mailman/listinfo/plt-scheme
>>
>
>
>
> -- 
> Jay McCarthy <jay at cs.byu.edu>
> Assistant Professor / Brigham Young University
> http://jay.teammccarthy.org
>
> "The glory of God is Intelligence" - D&C 93



Posted on the users mailing list.