reading html as sxml (Was: [plt-scheme] Re: plt-scheme Digest, Vol 8, Issue 1)

From: Neil Van Dyke (neil at neilvandyke.org)
Date: Sat Apr 1 20:05:36 EST 2006

One way to read the HTML from Google's home page into SXML, which the
SXML tools distributed by Dmitry Lizorkin can then process, is to use
HtmlPrag:

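    ;; url.ss supplies string->url and get-pure-port; HtmlPrag supplies html->sxml.
    (require (lib "url.ss" "net")
             (planet "htmlprag.ss" ("neil" "htmlprag.plt" 1 3)))

    ;; Fetch the page body (without HTTP headers) and parse it into SXML.
    (html->sxml (get-pure-port (string->url "http://www.google.com/")))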
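
As a rough sketch of what to expect, html->sxml also accepts a string,
so a small literal document shows the shape of the result without
touching the network.  The sample markup here is made up for
illustration, and the result shown is what I would expect rather than
output captured from a run:

    (require (planet "htmlprag.ss" ("neil" "htmlprag.plt" 1 3)))

    ;; Parse a literal HTML string instead of a port.
    (html->sxml
     "<html><head><title>Hi</title></head><body><p>Hello</p></body></html>")
    ;; Expected to be SXML along the lines of:
    ;; (*TOP* (html (head (title "Hi")) (body (p "Hello"))))

The *TOP* wrapper is the SXML document root, which is the form the
SXML tools generally expect as input.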

I'm not familiar with "read-html-as-xml".

Neil

> From: geb a <geb_a at yahoo.com>
> To: plt-scheme at list.cs.brown.edu
> Subject: [plt-scheme] Re: plt-scheme Digest, Vol 8, Issue 1
> Date: Sat, 1 Apr 2006 09:13:03 -0800 (PST)
> 
> Good morning...
> 
> Has anyone used Lizorkin's SSAX and SXML packages?  I
> have been trying to get them to display Google's web
> site for a bit now and for one reason or another keep
> running into roadblocks.
> 
> 
> (sxml:document 
>    (read-html-as-xml 
>       (get-pure-port 
>          (string->url "http://www.google.com"))))
> 
> This is my latest attempt, which fails with:
> 
> string-prefix?: Non-string value (#<struct:pcdata>
> #<struct:element>)
> 
> Thanks very much for your help!


Posted on the users mailing list.