[plt-scheme] Need help with struct:html

From: Brian Beckman (bc.beckman at gmail.com)
Date: Thu Sep 23 19:05:58 EDT 2004

Hello, all -- I've successfully read a (rather large) block of HTML
via code that looks like this:

(require (lib "html.ss" "html"))

(define s (read-html (open-input-file "C:\\Documents and Settings\\bcbeckman\\My Documents\\testData\\myfile.html")))

(define v (struct->vector s))

(define x
  (read-html-as-xml
   (open-input-file "C:\\Documents and Settings\\bcbeckman\\My Documents\\testData\\myfile.html")))

The trouble is that I can't figure out how to get at any of the
information.  If I just try to print s, v, or x, I get the unhelpful
#<struct:html>, #2(struct:html ...), and (#<struct:pcdata>
#<struct:pcdata> #<struct:element> #<struct:pcdata>), respectively.
Looking at the html library documentation in HelpDesk gives me the
following...

--- begin quote ---

Pcdata, Entity, and Attribute are defined in the XML documentation.
 
> Html-content = Html-element | Pcdata | Entity
 
> Html-element = any of the structures below which all inherit from
  (define-struct html-element (attributes)).  Any html tag that may
  include content also inherits from
  (define-struct (html-full struct:html-element) (content))
  without adding any additional fields.
 
A Html is
(make-html (listof Attribute) (listof Contents-of-html))
 
A Contents-of-html is either
  - Body
  - Head
 
...

--- end quote ---

So I tried things like

(html-Html-content s)
(html-content s)
(Html-content s)
(html-element s)
(Html-element s)
(html-Html-element s)
(html-Body s)
(html-Contents-of-html s)
(Contents-of-html s)

and so on ... getting nothing but errors of various kinds, mostly
"undefined identifier".

Obviously, I am very confused.  My uneducated guesswork has gotten me
nowhere.  Any hints for me?
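
One reading of the quoted documentation, offered as a sketch rather than a confirmed answer: names such as Html-content and Contents-of-html look like names of data definitions, not bound identifiers, while define-struct normally generates accessors named <struct>-<field>.  If that holds here, the two base structs quoted above would give html-element-attributes and html-full-content.  The snippet below assumes that the html library exports those accessors and that xml->xexpr from the xml library accepts the items returned by read-html-as-xml; the names page, doc, attrs, kids, xs and to-xexprs are made up for illustration, and a small in-memory string stands in for myfile.html.  It is meant to be run at the top level, and it avoids map and short names like s in case the library's per-tag structs shadow them.

```scheme
(require (lib "html.ss" "html")
         (lib "xml.ss" "xml"))

;; Hypothetical stand-in for the contents of myfile.html.
(define page
  "<html><head><title>T</title></head><body><p>Hello</p></body></html>")

(define doc (read-html (open-input-string page)))

;; Assumed accessors, following define-struct's <struct>-<field> naming
;; applied to (define-struct html-element (attributes)) and
;; (define-struct (html-full struct:html-element) (content)).
(define attrs (html-element-attributes doc)) ; attributes on the <html> tag
(define kids  (html-full-content doc))       ; the Head and Body structs

;; The xml side: convert each parsed item to an x-expression, which
;; prints as an ordinary nested list instead of #<struct:element>.
(define xs (read-html-as-xml (open-input-string page)))

(define (to-xexprs items)
  (if (null? items)
      '()
      (cons (xml->xexpr (car items))
            (to-xexprs (cdr items)))))

(to-xexprs xs)
```

If these assumptions are right, applying html-full-content again to each item in kids should walk further down the tree, and pcdata-string from the xml library should pull the text out of any pcdata leaves.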


Posted on the users mailing list.