> I'm trying to port some TCL web-scraping code to PLT-Scheme as a way
> to gain more understanding of Scheme.
> The TCL code fetches a page on the web, looks for some tidbits in the 
> code and then acts on the tidbits, a very standard behaviour for a
> web-scraping script.


> Or am I doing this all wrong? Maybe I should read the HTML as an Xexp
> and use the underlying structure instead of parsing a flat string.
> (Some of the tidbits I parse for are the external links in the HTML
> page)


Others have already pointed in the right direction, but I just wanted 
to bring out this point:  HTML is a string representation of structured 
data.  Recover the structured data _first_, then operate on that.  It's 
much, much easier to reason correctly about structured data than it is 
about sequences of characters.
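
To make that concrete, here is a minimal sketch of pulling the links
out of a page once it is already an X-expression.  It assumes the usual
element shape, (tag ((attr "value") ...) child ...) with the attribute
list optional, and it deliberately skips the parsing step.  The names
attribute-list?, element-attributes, element-children, links and page
are made up for this illustration, as are the example URLs; none of
them come from a library.

```scheme
;; Sketch only: collect the href of every <a> element in an X-expression.
;; Assumed element shape: (tag ((attr "value") ...) child ...)
;;                    or: (tag child ...)

;; True if x looks like an attribute list: a list of (symbol value) pairs.
(define (attribute-list? x)
  (and (list? x)
       (let loop ((xs x))
         (or (null? xs)
             (and (pair? (car xs))
                  (symbol? (caar xs))
                  (pair? (cdar xs))
                  (loop (cdr xs)))))))

;; True if the element x carries an explicit attribute list.
(define (has-attributes? x)
  (and (pair? (cdr x)) (attribute-list? (cadr x))))

(define (element-attributes x)
  (if (has-attributes? x) (cadr x) '()))

(define (element-children x)
  (if (has-attributes? x) (cddr x) (cdr x)))

;; links : x-expression -> list of strings, in document order
(define (links x)
  (if (and (pair? x) (symbol? (car x)))
      (let* ((href (assq 'href (element-attributes x)))
             (own (if (and (eq? (car x) 'a) href)
                      (list (cadr href))
                      '()))
             (nested (apply append (map links (element-children x)))))
        (append own nested))
      '()))  ; strings and other non-element content carry no links

;; A hand-written page standing in for parsed HTML (illustrative data).
(define page
  '(html
    (body
     (p "See " (a ((href "http://example.com/a")) "one link")
        " and " (a ((href "http://example.org/b") (class "ext")) "another")
        ".")
     (p (a ((name "top")) "an anchor with no href")))))

(links page)
;; => ("http://example.com/a" "http://example.org/b")
```

The whole job is one structural recursion, and there is no regular
expression in sight.  Markup inside the link text, attributes in a
different order, or odd whitespace would all trip up a string search,
but none of them matter here.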

Note that I'm not insisting on X-expressions; you might well be happier 
with other structured representations--say, sxml, which is now bundled 
with the intermediate releases of Dr/MzScheme.

john clements

