<div dir="ltr">Even if it isn't that large, you may benefit from PADS, as they have a nice way to describe the data. (Once you get it parsed, though, you could then come back to Racket if you wanted.)<div style>
<br>Robby</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, May 9, 2013 at 3:07 PM, Matthias Felleisen <span dir="ltr">&lt;<a href="mailto:matthias@ccs.neu.edu" target="_blank">matthias@ccs.neu.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
If you are talking about really large, really not quite properly formatted data sets,<br>
you want to look up the PADS project at<br>
<br>
<a href="http://www.padsproj.org" target="_blank">http://www.padsproj.org</a><br>
<br>
It's a product from AT&amp;T Labs (which is a Bell Labs 'baby'), and they apparently used it on their billing data.<br>
<br>
If you are looking at a few megabytes, any of our parser tools will do, perhaps starting with 'parser-tools/'.<br>
<span class="HOEnZb"><font color="#888888"><br>
-- Matthias<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
On May 9, 2013, at 3:47 PM, David Vanderson &lt;<a href="mailto:david.vanderson@gmail.com">david.vanderson@gmail.com</a>&gt; wrote:<br>
<br>
> I've got character-based invoices from old systems that look roughly like (but much bigger):<br>
><br>
> DATE DESC CREDIT DEBIT<br>
> 01/01/2013 SERVICES $1234.50<br>
> 01/01/2013 PAYMENT $1000.00<br>
><br>
> BALANCE $234.50<br>
><br>
><br>
> I don't know exactly how they're formatted, so I'm working from examples. My initial plan was to hand-code a dumb parser with regular expressions, but I suspect there's a better way. In particular, it'd be nice to have some leeway as to exact positions of data, and hopefully some nice error reporting and recovery abilities.<br>
><br>
> Can anyone point me towards a parsing technique that would lend itself to this problem?<br>
><br>
> Thanks,<br>
> Dave<br>
> ____________________<br>
> Racket Users list:<br>
> <a href="http://lists.racket-lang.org/users" target="_blank">http://lists.racket-lang.org/users</a><br>
<br>
</div></div></blockquote></div><br></div>
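For small inputs like the sample invoice quoted above, the regular-expression approach from the original question could be sketched roughly as follows. This is only a minimal illustration, not anything proposed in the thread: the patterns are guesses based on the three sample lines, and the names entry, classify-line, and parse-invoice are hypothetical. Matching fields with \s+ gives some leeway in column positions, and unrecognized lines are collected with their line numbers instead of aborting the parse.

```racket
#lang racket

;; Patterns are assumptions derived from the sample lines in this thread;
;; real invoices will likely need more cases. Using \s+ between fields
;; tolerates varying column positions.
(define header-rx  #px"^\\s*DATE\\s+DESC\\s+CREDIT\\s+DEBIT\\s*$")
(define entry-rx   #px"^\\s*(\\d{2}/\\d{2}/\\d{4})\\s+(.+?)\\s+\\$(\\d+\\.\\d{2})\\s*$")
(define balance-rx #px"^\\s*BALANCE\\s+\\$(\\d+\\.\\d{2})\\s*$")

;; One parsed line item; the amount stays a string to avoid rounding.
(struct entry (date desc amount) #:transparent)

;; classify-line : string -> entry, (list 'balance string), 'skip, or #f
;; Returns #f when no pattern matches, so the caller can report the line.
(define (classify-line line)
  (cond
    [(regexp-match? #px"^\\s*$" line) 'skip]
    [(regexp-match? header-rx line) 'skip]
    [(regexp-match entry-rx line)
     => (lambda (m) (entry (second m) (third m) (fourth m)))]
    [(regexp-match balance-rx line)
     => (lambda (m) (list 'balance (second m)))]
    [else #f]))

;; parse-invoice : (listof string) -> (values entries balance errors)
;; Keeps going after a bad line and records a message with its line number.
(define (parse-invoice lines)
  (define-values (entries balance errors)
    (for/fold ([entries '()] [balance #f] [errors '()])
              ([line (in-list lines)]
               [n (in-naturals 1)])
      (define r (classify-line line))
      (cond
        [(entry? r) (values (cons r entries) balance errors)]
        [(pair? r) (values entries (second r) errors)]
        [(eq? r 'skip) (values entries balance errors)]
        [else (values entries balance
                      (cons (format "line ~a: unrecognized: ~s" n line)
                            errors))])))
  (values (reverse entries) balance (reverse errors)))
```

Calling (parse-invoice (file->lines "invoice.txt")), where the file name is a placeholder, returns three values: the entries, the balance string (or #f if none was found), and the list of error messages. Because whitespace is collapsed, this sketch cannot tell a CREDIT from a DEBIT; recovering that would need the column positions, which is where a data description tool or a proper lexer may be the better fit.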