[plt-scheme] feedback on monadic parser combinators implementation/translation trial and application

From: keydana at gmx.de (keydana at gmx.de)
Date: Wed Jan 6 13:01:36 EST 2010

Previous message: [plt-scheme] Re: bzlib/parseq.plt
Next message: [plt-scheme] feedback on monadic parser combinators implementation/translation trial and application
Messages sorted by: [date] [thread] [subject] [author]

Hi all,

I'd be very grateful for any feedback on an attempt to "translate" some monadic parser combinators from Hutton & Meijer's "Monadic Parser Combinators" article and to use them (plus some of my own) for a simple but perhaps uncommon use case.
(I was originally pointed to this article by YC's blog, but I haven't looked at his newly published library yet; it's just a coincidence that I happen to be finishing now...)

I'm attaching two files: the first contains the translation attempt and some other parsers, the second the specific application.

For the first part, the main question for me was how to represent the parse results, which in the Hutton & Meijer article are lists of tuples. I chose to have lists of a struct parse-result with fields "value" and "remaining".
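
To make the representation concrete, here is a minimal sketch in Python rather than Scheme, so that it stands on its own without the attachments. The primitive names (result, zero, item, bind, sat) follow the article; ParseResult and two_digits are illustrative names and are not taken from the attached files.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class ParseResult:
    value: Any       # what the parser produced
    remaining: str   # the input that has not been consumed yet


# A parser maps an input string to a list of possible results.
Parser = Callable[[str], List[ParseResult]]


def result(v: Any) -> Parser:
    """Succeed with v without consuming any input."""
    return lambda inp: [ParseResult(v, inp)]


def zero() -> Parser:
    """Always fail: the empty list of results."""
    return lambda inp: []


def item() -> Parser:
    """Consume one character, or fail on empty input."""
    return lambda inp: [ParseResult(inp[0], inp[1:])] if inp else []


def bind(p: Parser, f: Callable[[Any], Parser]) -> Parser:
    """Run p, then feed each value to f and run the parser f returns."""
    def parser(inp: str) -> List[ParseResult]:
        out: List[ParseResult] = []
        for r in p(inp):
            out.extend(f(r.value)(r.remaining))
        return out
    return parser


def sat(pred: Callable[[str], bool]) -> Parser:
    """Consume one character if it satisfies pred."""
    return bind(item(), lambda c: result(c) if pred(c) else zero())


# Two digits in a row, returned as one string.
two_digits = bind(sat(str.isdigit),
                  lambda a: bind(sat(str.isdigit),
                                 lambda b: result(a + b)))
```

With these definitions, two_digits applied to "42x" gives a one-element list whose value is "42" and whose remaining input is "x"; applied to "4x" it gives the empty list.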
Another question during coding was whether I'd encounter problems by not using a lazy language - I didn't, but I'm still wondering whether I missed something :-;
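
For reference, one place where strictness can show up is a grammar that refers to itself directly as an argument to a combinator: a lazy language allows that, whereas a strict one evaluates the argument before the name is bound. The sketch below (Python, with results as plain (value, remaining) pairs for brevity) shows one possible workaround, a small delay combinator; delay, seq3 and depth are illustrative names, not something from the article or the attached files.

```python
from typing import Any, Callable, List, Tuple

# A parser maps input to a list of (value, remaining) pairs.
Parser = Callable[[str], List[Tuple[Any, str]]]


def result(v: Any) -> Parser:
    return lambda inp: [(v, inp)]


def bind(p: Parser, f: Callable[[Any], Parser]) -> Parser:
    return lambda inp: [r for (v, rest) in p(inp) for r in f(v)(rest)]


def plus(p: Parser, q: Parser) -> Parser:
    """Non-deterministic choice: all results of p, then all results of q."""
    return lambda inp: p(inp) + q(inp)


def char(c: str) -> Parser:
    return lambda inp: [(c, inp[1:])] if inp[:1] == c else []


def delay(thunk: Callable[[], Parser]) -> Parser:
    """Postpone looking up a parser until input actually arrives."""
    return lambda inp: thunk()(inp)


def seq3(p: Parser, q: Parser, r: Parser) -> Parser:
    """Run p, q, r in sequence and keep only q's value."""
    return bind(p, lambda _: bind(q, lambda v: bind(r, lambda _: result(v))))


# Nesting depth of balanced parentheses. Passing `depth` itself instead of
# delay(lambda: depth) would raise NameError here, because the name is not
# bound yet while its own definition is being evaluated.
depth: Parser = plus(
    bind(seq3(char("("), delay(lambda: depth), char(")")),
         lambda d: result(d + 1)),
    result(0),
)
```

Recursive references that already sit inside the function passed to bind are delayed automatically, which may be why a direct translation of most of the article's combinators works without changes.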

For the second part, my input is a DTA file (see http://de.wikipedia.org/wiki/Datenträgeraustauschverfahren), a file format for transmitting payment transactions (debits, credits) between banks. Unfortunately it's a national format, so it may be unfamiliar here, but its structure is as follows:

1 physical DTA file = 1-n logical DTA files

DTA = part A + listof part C + part E

part C = sequence of defined fields + 0-n extensions (contains information about one payment transaction)

extension = sequence of defined fields

part A = sequence of defined fields (contains general information about the following C records)

part E = sequence of defined fields (contains checksum information about the preceding C records)
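
The structure above can be sketched with combinators as follows (Python, self-contained). The record layout here is invented purely for illustration: one-letter tags "A", "C", "X", "E" and the field widths are assumptions and do not reflect the real DTA field definitions, and the names literal, field and many are not taken from the attached files.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ParseResult:
    value: Any
    remaining: str


def result(v):
    return lambda inp: [ParseResult(v, inp)]


def bind(p, f):
    return lambda inp: [r2 for r1 in p(inp) for r2 in f(r1.value)(r1.remaining)]


def literal(s):
    """Match the exact string s."""
    return lambda inp: [ParseResult(s, inp[len(s):])] if inp.startswith(s) else []


def field(n):
    """Fixed-width field of n characters."""
    return lambda inp: [ParseResult(inp[:n], inp[n:])] if len(inp) >= n else []


def many(p):
    """Zero or more p, longest match only (a deterministic simplification)."""
    def parser(inp):
        values = []
        while True:
            rs = p(inp)
            # Stop on failure, or if p consumed nothing (avoids looping forever).
            if not rs or rs[0].remaining == inp:
                return [ParseResult(values, inp)]
            values.append(rs[0].value)
            inp = rs[0].remaining
    return parser


# Hypothetical layout: "A" + 4-char sender; "C" + 5-char amount + extensions;
# "X" + 3-char text; "E" + 2-char record count. The int() calls assume the
# fields hold digits and raise ValueError otherwise.
part_a = bind(literal("A"), lambda _:
         bind(field(4), lambda sender: result({"sender": sender})))
extension = bind(literal("X"), lambda _: field(3))
part_c = bind(literal("C"), lambda _:
         bind(field(5), lambda amount:
         bind(many(extension), lambda exts:
         result({"amount": int(amount), "extensions": exts}))))
part_e = bind(literal("E"), lambda _:
         bind(field(2), lambda n: result({"count": int(n)})))

# DTA = part A + listof part C + part E
logical_file = bind(part_a, lambda a:
               bind(many(part_c), lambda cs:
               bind(part_e, lambda e: result({"A": a, "C": cs, "E": e}))))

# 1 physical file = 1-n logical files (many accepts 0 as well; a stricter
# version would reject an empty list).
physical_file = many(logical_file)
```

For example, physical_file applied to "ABANKC00150XABCC00275E02" yields one logical file with two C records, the first of which carries one extension.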

For this specific application, I wanted to parse an input file and end up with output suitable for loading the transactions into a program like Excel (for an imaginary accountant to use).
So I didn't want to get a string out, but structured and verified output, which I achieved by possibly "misusing" the bind operator to perform operations other than parsing:
for every DTA file part, I get its fields and build a struct from them; once I have parts A, C and E together, I also verify the checksums using bind, and if all is okay I write the structure to a text file in a CSV-like manner.
It works, but I guess it is in fact a misuse of the parser type, and in a typed language it would not work, because the value part of parse-result changes its type here (in the pure parser part it's a string; here it can become anything).
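
A small sketch of this use of bind (Python, self-contained): one step verifies a checksum and either passes the value on or fails, and the next step turns the value into CSV text. The toy input format ("C" + 3 digits per transaction, then "E" + a 4-digit sum) and the names verify, to_csv and checked_file are made up for illustration, and the CSV is returned as a string instead of being written to a file. As far as the types in the article go, bind is Parser a -> (a -> Parser b) -> Parser b, so a step may return a value of a different type than the one it received; whether side effects such as writing a file belong in such a step is a separate question.

```python
import csv
import io
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ParseResult:
    value: Any
    remaining: str


def result(v):
    return lambda inp: [ParseResult(v, inp)]


def zero():
    return lambda inp: []


def bind(p, f):
    return lambda inp: [r2 for r1 in p(inp) for r2 in f(r1.value)(r1.remaining)]


def digits(n):
    """Exactly n ASCII digits, returned as an int."""
    def parser(inp):
        head = inp[:n]
        ok = len(head) == n and head.isascii() and head.isdigit()
        return [ParseResult(int(head), inp[n:])] if ok else []
    return parser


def tagged(tag, p):
    """Run p on the input that follows the literal tag."""
    return lambda inp: p(inp[len(tag):]) if inp.startswith(tag) else []


def many(p):
    """Zero or more p, longest match only."""
    def parser(inp):
        values = []
        while True:
            rs = p(inp)
            if not rs or rs[0].remaining == inp:
                return [ParseResult(values, inp)]
            values.append(rs[0].value)
            inp = rs[0].remaining
    return parser


def verify(amounts, total):
    """Consumes no input: accepts the amounts only if they add up to total."""
    return result(amounts) if sum(amounts) == total else zero()


def to_csv(amounts):
    """The value changes type here: a list of ints goes in, a string comes out."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["amount"])
    writer.writerows([a] for a in amounts)
    return result(buf.getvalue())


checked_file = bind(many(tagged("C", digits(3))), lambda amounts:
               bind(tagged("E", digits(4)), lambda total:
               bind(verify(amounts, total), to_csv)))
```

Here checked_file applied to "C150C275E0425" produces the CSV text, while "C150C275E0999" fails with an empty result list because the sum does not match.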

I'd be very interested in opinions about this, although of course it must seem like a strange application for this list (I just started it to try to understand the article, and then thought of an application).

Many thanks and best wishes for 2010
Sigrid
-------------- next part --------------
A non-text attachment was scrubbed...
Name: monadic-parser.ss
Type: application/octet-stream
Size: 6813 bytes
Desc: not available
URL: <http://lists.racket-lang.org/users/archive/attachments/20100106/71647982/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dta-parser.ss
Type: application/octet-stream
Size: 6810 bytes
Desc: not available
URL: <http://lists.racket-lang.org/users/archive/attachments/20100106/71647982/attachment-0001.obj>

Posted on the users mailing list.
