[racket] DSL for multi-dimensional datasets?

From: Simon Haines (simon.haines at con-amalgamate.net)
Date: Tue Nov 6 18:26:42 EST 2012

Thanks Jay for this link. This is a most comprehensive system that seems to
cover what I need, and a fair bit more as well. It will take me a while to
get through the manual but I hope there are nuggets of
insight contained within.
One thing surprises me, though: there seem to be as many different
approaches to this problem as there are implementations. There doesn't even
seem to be a canonical language for describing multi-dimensional sets (that
I've discovered, at any rate). I'll keep at it and post progress to this
list. Thanks again,
Simon.

On 6 November 2012 15:29, Jay McCarthy <jay.mccarthy at gmail.com> wrote:

> I would suggest looking into PADS as well:
>
> http://www.padsproj.org/doc.html
>
> On Mon, Nov 5, 2012 at 9:22 PM, Simon Haines
> <simon.haines at con-amalgamate.net> wrote:
> > As part of my work, I frequently have to 'shape' multi-dimensional
> > datasets.
> > This is reasonably easy to do in Racket and I'm thinking about pulling
> > together some of the functions I use into a library. Before I do this
> > though, I was wondering if there is any similar work I can build upon, or
> > perhaps use to guide me.
> >
> > As an example of what I mean, I'll receive from a colleague a file like
> > this:
> >
> > Date, Site, Total Alkalinity as CaCO3 (mg/L), Carbonate as CaCO3 (mg/L),
> > 1-Nov-12, BH1, 120, <5
> > 1-Nov-12, BH2, 180, <5
> > 1-Nov-12, BH3, 160, <5
> > 26-Oct-12, BH1, 150, <1
> > 26-Oct-12, BH2, 165, 0
> > 26-Oct-12, BH3, 180, <5
> >
> > (This is a laboratory analysis of water sampled from bore holes).
> >
> > This file is composed of two datasets (a set each of total alkalinity and
> > carbonate), with shared dimensions of 'date' and 'site'. I'll often deal
> > with files containing up to 80 datasets.
> >
> > More often than not, all I'll need to do is 'shape' these datasets into a
> > format that can be pulled into a spreadsheet for further
> > analysis/graphing.
> > One example is:
> >
> > "", Total Alkalinity as CaCO3 (mg/L), Carbonate as CaCO3 (mg/L)
> > BH1
> > 1-Nov-12, 120, <5
> > 26-Oct-12, 150, <1
> > BH2
> > 1-Nov-12, 180, <5
> > 26-Oct-12, 165, 0
> > BH3
> > 1-Nov-12, 160, <5
> > 26-Oct-12, 180, <5
> >
> > Another example:
> >
> > "", BH1, BH2, BH3
> > Total Alkalinity as CaCO3 (mg/L)
> > 1-Nov-12, 120, 180, 160
> > 26-Oct-12, 150, 165, 180
> > Carbonate as CaCO3 (mg/L)
> > 1-Nov-12, <5, <5, <5
> > 26-Oct-12, <1, 0, <5
> >
> > As you can see, the recursive nature of these reports makes them ideal for
> > processing with Racket, and although it takes me a little while to get the
> > format of a report right, I usually can add the report to my toolbox for
> > whenever it's needed later.
> >
> > So I've started drafting what I think a good DSL for doing this type of task
> > might be, something like:
> > (define-dataset
> >   (date (date "dd-MM-yyyy"))
> >   (site (text))
> >   (parameter (text)) ...)
> >
> > (define-report example1
> >   (columns (parameter ...))
> >   (rows ((site) date)))
> >
> > I haven't worked out the details yet, and I'm not sure the above will work
> > the way I want it to. I've had a quick look at Microsoft's Scientific
> > DataSet (http://sds.codeplex.com/), but it lacks the composability I'm used
> > to with Racket. Is anyone aware of any similar work that does this, or that
> > I could use as a guide?
> >
> > Thanks,
> > Simon.
> >
> > ____________________
> >   Racket Users list:
> >   http://lists.racket-lang.org/users
> >
>
>
>
> --
> Jay McCarthy <jay at cs.byu.edu>
> Assistant Professor / Brigham Young University
> http://faculty.cs.byu.edu/~jay
>
> "The glory of God is Intelligence" - D&C 93
>
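
The two report shapes in the quoted message can be produced with a few list
operations, which may help when deciding what a DSL should abstract over. The
following is only a minimal sketch in plain Racket, not the proposed
define-dataset / define-report forms. The names parse-row, distinct-values,
report-by-site and report-by-parameter are hypothetical, the sample rows are
hard-coded from the example file, and every value is kept as a string so that
censored readings such as "<5" pass through unchanged.

```racket
#lang racket
;; Sketch only: all function names below are hypothetical helpers,
;; not part of Racket or of any existing library.

;; Parameter (dataset) names, in file column order after Date and Site.
(define params
  '("Total Alkalinity as CaCO3 (mg/L)" "Carbonate as CaCO3 (mg/L)"))

;; Data lines copied from the example file in the quoted message.
(define raw-lines
  '("1-Nov-12, BH1, 120, <5"
    "1-Nov-12, BH2, 180, <5"
    "1-Nov-12, BH3, 160, <5"
    "26-Oct-12, BH1, 150, <1"
    "26-Oct-12, BH2, 165, 0"
    "26-Oct-12, BH3, 180, <5"))

;; Split one line on commas and trim each field.
(define (parse-row line)
  (map string-trim (string-split line ",")))

;; Each row is (list date site value ...).
(define rows (map parse-row raw-lines))

;; Distinct values of one column, in order of first appearance.
(define (distinct-values rows accessor)
  (remove-duplicates (map accessor rows)))

;; First report: one block per site, one line per date,
;; one column per parameter.
(define (report-by-site rows params)
  (append
   (list (string-join (cons "\"\"" params) ", "))
   (append*
    (for/list ([site (distinct-values rows second)])
      (cons site
            (for/list ([r rows] #:when (equal? (second r) site))
              (string-join (cons (first r) (cddr r)) ", ")))))))

;; Second report: one block per parameter, one line per date,
;; one column per site. A missing (date, site) pair yields "".
(define (report-by-parameter rows params)
  (define sites (distinct-values rows second))
  (define dates (distinct-values rows first))
  (append
   (list (string-join (cons "\"\"" sites) ", "))
   (append*
    (for/list ([param params]
               [i (in-naturals 2)]) ; values start at column index 2
      (cons param
            (for/list ([date dates])
              (string-join
               (cons date
                     (for/list ([site sites])
                       (define hit
                         (findf (λ (row)
                                  (and (equal? (first row) date)
                                       (equal? (second row) site)))
                                rows))
                       (if hit (list-ref hit i) "")))
               ", ")))))))

;; Print both reports, separated by a blank line.
(for-each displayln (report-by-site rows params))
(newline)
(for-each displayln (report-by-parameter rows params))
```

Both functions return a list of output lines rather than printing directly,
so the same result can be written to a file or composed with further steps.
The two reports differ only in which dimension forms the blocks, the rows and
the columns, which is roughly the choice the (columns ...) and (rows ...)
clauses of the draft DSL would have to express.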

Posted on the users mailing list.