[racket] DSL for multi-dimensional datasets?

From: Jay McCarthy (jay.mccarthy at gmail.com)
Date: Mon Nov 5 23:29:56 EST 2012

I would suggest looking into PADS as well:

http://www.padsproj.org/doc.html

On Mon, Nov 5, 2012 at 9:22 PM, Simon Haines
<simon.haines at con-amalgamate.net> wrote:
> As part of my work, I frequently have to 'shape' multi-dimensional datasets.
> This is reasonably easy to do in Racket and I'm thinking about pulling
> together some of the functions I use into a library. Before I do this
> though, I was wondering if there is any similar work I can build upon, or
> perhaps use to guide me.
>
> As an example of what I mean, I'll receive from a colleague a file like
> this:
>
> Date, Site, Total Alkalinity as CaCO3 (mg/L), Carbonate as CaCO3 (mg/L)
> 1-Nov-12, BH1, 120, <5
> 1-Nov-12, BH2, 180, <5
> 1-Nov-12, BH3, 160, <5
> 26-Oct-12, BH1, 150, <1
> 26-Oct-12, BH2, 165, 0
> 26-Oct-12, BH3, 180, <5
>
> (This is a laboratory analysis of water sampled from bore holes).
>
> This file is composed of two datasets (a set each of total alkalinity and
> carbonate), with shared dimensions of 'date' and 'site'. I'll often deal
> with files containing up to 80 datasets.
>
> More often than not, all I'll need to do is 'shape' these datasets into a
> format that can be pulled into a spreadsheet for further analysis/graphing.
> One example is:
>
> "", Total Alkalinity as CaCO3 (mg/L), Carbonate as CaCO3 (mg/L)
> BH1
> 1-Nov-12, 120, <5
> 26-Oct-12, 150, <1
> BH2
> 1-Nov-12, 180, <5
> 26-Oct-12, 165, 0
> BH3
> 1-Nov-12, 160, <5
> 26-Oct-12, 180, <5
>
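
For what it's worth, the first shape above is essentially a group-by on site. A minimal Racket sketch, assuming the file has already been parsed into lists of strings; `rows` and `shape-by-site` are hypothetical names used only for illustration, not part of any existing library:

```racket
#lang racket
;; Hypothetical sketch: `rows` and `shape-by-site` are illustrative names.

;; Each row is (date site alkalinity carbonate). Values stay as strings
;; so that censored readings such as "<5" pass through unchanged.
(define rows
  '(("1-Nov-12"  "BH1" "120" "<5")
    ("1-Nov-12"  "BH2" "180" "<5")
    ("1-Nov-12"  "BH3" "160" "<5")
    ("26-Oct-12" "BH1" "150" "<1")
    ("26-Oct-12" "BH2" "165" "0")
    ("26-Oct-12" "BH3" "180" "<5")))

;; Group rows by site, keeping sites and dates in input order.
;; Result: a list of (site (date value ...) ...) entries.
(define (shape-by-site rows)
  (define sites (remove-duplicates (map second rows)))
  (for/list ([site (in-list sites)])
    (cons site
          (for/list ([r (in-list rows)]
                     #:when (equal? (second r) site))
            ;; Drop the site column, keep the date and the values.
            (cons (first r) (cddr r))))))
```

Here `(shape-by-site rows)` gives one entry per site, for example `("BH1" ("1-Nov-12" "120" "<5") ("26-Oct-12" "150" "<1"))`, which maps directly onto the site heading and the date rows beneath it.
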
> Another example:
>
> "", BH1, BH2, BH3
> Total Alkalinity as CaCO3 (mg/L)
> 1-Nov-12, 120, 180, 160
> 26-Oct-12, 150, 165, 180
> Carbonate as CaCO3 (mg/L)
> 1-Nov-12, <5, <5, <5
> 26-Oct-12, <1, 0, <5
>
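
The second shape is closer to a pivot: one block per parameter, one row per date, one column per site. A rough sketch under the same assumptions (rows already parsed into lists of strings); `shape-by-parameter` and `param-names` are hypothetical names, and missing date/site combinations are filled with an empty string as an arbitrary choice:

```racket
#lang racket
;; Hypothetical sketch: all names here are illustrative.

;; Each row is (date site alkalinity carbonate), values kept as strings.
(define rows
  '(("1-Nov-12"  "BH1" "120" "<5")
    ("1-Nov-12"  "BH2" "180" "<5")
    ("1-Nov-12"  "BH3" "160" "<5")
    ("26-Oct-12" "BH1" "150" "<1")
    ("26-Oct-12" "BH2" "165" "0")
    ("26-Oct-12" "BH3" "180" "<5")))

(define param-names
  '("Total Alkalinity as CaCO3 (mg/L)" "Carbonate as CaCO3 (mg/L)"))

;; Pivot: for each parameter, build (date value-per-site ...) rows.
;; Parameter columns start at index 2, after date and site.
(define (shape-by-parameter rows param-names)
  (define sites (remove-duplicates (map second rows)))
  (define dates (remove-duplicates (map first rows)))
  (for/list ([name (in-list param-names)]
             [col (in-naturals 2)])
    (cons name
          (for/list ([d (in-list dates)])
            (cons d
                  (for/list ([s (in-list sites)])
                    ;; Find the row for this date/site pair, if any.
                    (define r
                      (findf (lambda (r)
                               (and (equal? (first r) d)
                                    (equal? (second r) s)))
                             rows))
                    (if r (list-ref r col) "")))))))
```

With the sample data, the first entry of `(shape-by-parameter rows param-names)` is the alkalinity block with rows `("1-Nov-12" "120" "180" "160")` and `("26-Oct-12" "150" "165" "180")`, in the BH1, BH2, BH3 column order of the report above.
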
> As you can see, the recursive nature of these reports makes them ideal for
> processing with Racket. Although it takes me a little while to get the
> format of a report right, I can usually add the report to my toolbox for
> whenever it's needed later.
>
> So I've started drafting what I think a good DSL for doing this type of task
> might be, something like:
> (define-dataset
>   (date (date "dd-MM-yyyy"))
>   (site (text))
>   (parameter (text)) ...)
>
> (define-report example1
>   (columns (parameter ...))
>   (rows ((site) date)))
>
> I haven't worked out the details yet, and I'm not sure the above will work
> the way I want it to. I've had a quick look at Microsoft's Scientific
> DataSet (http://sds.codeplex.com/), but it lacks the composability I'm used
> to with Racket. Is anyone aware of any similar work that does this, or that
> I could use as a guide?
>
> Thanks,
> Simon.



-- 
Jay McCarthy <jay at cs.byu.edu>
Assistant Professor / Brigham Young University
http://faculty.cs.byu.edu/~jay

"The glory of God is Intelligence" - D&C 93

Posted on the users mailing list.