Thanks, Asumu, for these links. Although I found the code in the paper hard to follow because I'm not familiar with R, it has given me a good insight: datasets need to be described in terms of dimensions and variables. I think the library presented in the paper conflates the structure of the data as read (from a CSV file, say) with the logical structure of the dataset as a whole. (I may be wrong on this point, but that is my reading.) I think these concepts should be separated, and similarly the structure of a report is separate again.<div>
However, having given it a little thought after reading the paper, I think there's a good way forward by describing datasets as dimensions and variables, and then incorporating relational algebra primitives, particularly σ (selection), π (projection) and G (group by). I'll brush up on the Codd model and see if that gives me any further insights.</div>
<div>Thanks again,</div><div>Simon.<br><div><br></div></div><div><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 6 November 2012 15:32, Asumu Takikawa <span dir="ltr">&lt;<a href="mailto:asumu@ccs.neu.edu" target="_blank">asumu@ccs.neu.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 2012-11-06 15:22:49 +1100, Simon Haines wrote:<br>
> As part of my work, I frequently have to 'shape' multi-dimensional<br>
> datasets. This is reasonably easy to do in Racket and I'm thinking<br>
> about pulling together some of the functions I use into a library.<br>
> Before I do this though, I was wondering if there is any similar work I<br>
> can build upon, or perhaps use to guide me.<br>
><br>
</div>> [...]<br>
<div class="im">><br>
> I haven't worked out the details yet, and I'm not sure the above will<br>
> work the way I want it to. But I've had a quick look at Microsoft's<br>
</div>> Scientific DataSet ([1]<a href="http://sds.codeplex.com/" target="_blank">http://sds.codeplex.com/</a>), but it lacks the<br>
<div class="im">> composability I'm used to with Racket. Is anyone aware of any similar<br>
> work that does this, or that I could use as a guide?<br>
<br>
</div>I don't know about Racket, but have you seen the 'reshape' library in R?<br>
It's very flexible and is probably one of the state-of-the-art designs<br>
in this space.<br>
<br>
Here's a journal article describing its design:<br>
<a href="http://www.jstatsoft.org/v21/i12/paper" target="_blank">http://www.jstatsoft.org/v21/i12/paper</a><br>
<br>
and its website:<br>
<a href="http://had.co.nz/reshape/" target="_blank">http://had.co.nz/reshape/</a><br>
<br>
Cheers,<br>
Asumu<br>
</blockquote></div><br></div>