Fwd: [plt-scheme] Recommendations for parsing HTML

From: John Clements (clements at brinckerhoff.org)
Date: Thu Dec 4 04:39:03 EST 2008

> From: John Clements <clements at brinckerhoff.org>
> Date: December 4, 2008 12:46:55 AM PST
> To: Patrick Lozzi <patricklozzi at gmail.com>
> Cc: PLT-list List <plt-scheme at list.cs.brown.edu>, neil at neilvandyke.org
> Subject: Re: [plt-scheme] Recommendations for parsing HTML
>
> On Dec 3, 2008, at 9:57 PM, Patrick Lozzi wrote:
>> It appears I'm really at a loss without the planet's htmlprag  
>> library, I tried to set the file to R5RS but it wouldn't allow me  
>> to use the planet's htmlprag library as it came up with "reference  
>> to undefined identifier: require"... mustn't the file be set to  
>> module in order to use require?  That's the impression I'm  
>> getting.  In other words, whenever I received this error in the  
>> past, I realized the current file wasn't set to module, so simply  
>> setting it to module corrected this error... but if I set it back  
>> to module, I'm back at square one with the mutable cons cells  
>> problem that plagues v4 and later versions when combined with htmlprag.
> Because... well, because I had so many *other* things that I should  
> have been doing instead, I took a crack at updating htmlprag to  
> work with 4.0.
> Let me just say... BLECCH!  This is not a comment about this code,  
> per se; it's just that trying to get a handle on the flow of values  
> in a world with mutable pairs is a horrible awful nightmare.
> I tried not to let the mutable pairs bleed into everything too  
> much, but I wouldn't claim the result is anything but a hack job.   
> Perhaps it will convince Neil to take a look at it himself!
> Anyhow, it works, in the sense that it passes the built-in 146 tests.
> For what it's worth, this would have been a twenty-hour project  
> rather than a two-hour project if it weren't for that test suite.
> Attached please find an updated planet package.
> If Neil likes it, he can update the planet server. Until then, you  
> can inject it yourself by putting it in /tmp, say, and running
>
>     planet fileinject neil /tmp/htmlprag.plt 1 4
>
> Finally, you'd then require it by saying something like:
>
>     (require (planet neil/htmlprag:1:4/htmlprag))
>
> Okay, *now* someone can tell me that there's a perfectly good  
> alternative to htmlprag.
> All the best,
> John Clements
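
For anyone reading this in the archive and wondering what the "mutable cons cells problem" looks like in practice, here is a minimal sketch. It is not taken from htmlprag or from the attached package; it only assumes PLT Scheme 4.x with the scheme/mpair library, and the variable names are made up for illustration.

```scheme
#lang scheme
;; Sketch only: assumes PLT Scheme 4.x, where `cons` builds immutable
;; pairs and mutable pairs are a separate type (mcons, mcar, set-mcar!).
(require scheme/mpair)

;; An ordinary list. The module language provides no set-car! for it.
(define immutable (list 1 2 3))

;; A mutable list, the kind R5RS-style code expects to update in place.
(define mutable (mlist 1 2 3))
(set-mcar! mutable 10)

;; The two types do not mix, so values have to be converted wherever
;; mutating code meets ordinary list code.
(define converted (mlist->list mutable))     ; mutable -> immutable copy
(define round-trip (list->mlist immutable))  ; immutable -> mutable copy

(pair? mutable)   ; #f, an mpair is not a pair
(mpair? mutable)  ; #t
(car converted)   ; 10
```

Code written against R5RS tends to assume a single pair type, which is presumably why these conversions end up scattered through a port rather than confined to one place.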
