Fwd: [plt-scheme] Recommendations for parsing HTML

From: John Clements (clements at brinckerhoff.org)
Date: Thu Dec 4 04:39:03 EST 2008

Whoops... bigger than the 50K limit with the attachment.  If my  
original message doesn't get approved, let me know and I'll send you  
the .plt file separately.

John Clements

Begin forwarded message:

> From: John Clements <clements at brinckerhoff.org>
> Date: December 4, 2008 12:46:55 AM PST
> To: Patrick Lozzi <patricklozzi at gmail.com>
> Cc: PLT-list List <plt-scheme at list.cs.brown.edu>, neil at neilvandyke.org
> Subject: Re: [plt-scheme] Recommendations for parsing HTML
>
>
> On Dec 3, 2008, at 9:57 PM, Patrick Lozzi wrote:
>
>> It appears I'm really at a loss without the planet's htmlprag  
>> library. I tried to set the file to R5RS, but then it wouldn't let  
>> me use the planet's htmlprag library: it came up with "reference  
>> to undefined identifier: require"... mustn't the file be set to  
>> module in order to use require?  That's the impression I'm  
>> getting.  In other words, whenever I received this error in the  
>> past, I realized the current file wasn't set to module, so simply  
>> setting it to module corrected this error... but if I set it back  
>> to module, I'm back at square one with the mutable cons cells  
>> problem that plagues v4 and later versions when combined with htmlprag.
>
> Because... well, because I had so many *other* things that I should  
> have been doing instead, I took a crack at updating htmlprag to  
> work with 4.0.
>
> Let me just say... BLECCH!  This is not a comment about this code,  
> per se; it's just that trying to get a handle on the flow of values  
> in a world with mutable pairs is a horrible awful nightmare.
>
> I tried not to let the mutable pairs bleed into everything too  
> much, but I wouldn't claim the result is anything but a hack job.   
> Perhaps it will convince Neil to take a look at it himself!
>
> Anyhow, it works, in the sense that it passes all 146 built-in tests.
>
> For what it's worth, this would have been a twenty-hour project  
> rather than a two-hour project if it weren't for that test suite.
>
> Attached please find an updated planet package.
>
> If Neil likes it, he can update the planet server. Until then, you  
> can inject it yourself by putting it in /tmp, say, and running
>
> planet fileinject neil /tmp/htmlprag.plt 1 4
>
> Finally, you'd then require it by saying something like:
>
> (require (planet neil/htmlprag:1:4/htmlprag))
>
> Okay, *now* someone can tell me that there's a perfectly good  
> alternative to htmlprag.
>
> All the best,
>
> John Clements
>

Posted on the users mailing list.