<div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I normally just lurk on this list (I'm only a novice with Scheme), but<br>
I figured I'd crawl out of hiding for this topic. (I tried to post a<br>
similar message last night but didn't realize I couldn't post through<br>
Google groups without being subscribed directly to the list.)</blockquote><div><br>Thanks for joining in. <br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
><br>
> [...] arrays have an order, which may be 'row or 'column.<br>
><br>
<br>
When you implement an array transpose for more than two dimensions<br>
(basically swapping two axes), and if you don't require a copy of the<br>
array, you can easily end up with arrays that are neither 'row nor<br>
'column major.</blockquote><div><br>Some operations do require a copy, and I know of no way around that.<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Similarly, you'll probably want to support striding or reversing an<br>
array without making a copy under the hood. For example, to pull out<br>
a sub-array, you might create a "take" operator like such:<br>
<br>
(take some-array #:start 3000 #:stop 1000 #:step -2) <br></blockquote><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
The actual syntax could be part of your array-ref function, and the<br>
actual parameters could be "to" and "from" and "by" etc., but in any<br>
case the result will be a copy or it won't be either row or column<br>
major. Also note that a one dimensional array is *both* row and<br>
column major (Fortran and C will both be happy with it).</blockquote><div><br>I was thinking of a syntax like (&lt;start&gt; [&lt;stop&gt; [&lt;step&gt;]]), so (array-ref some-array '((* * 3) 4)) would be the 5th column of every 3rd row of some 2-D array. I don't have a problem with negative steps. I'll have to play around and see how negative strides etc. work out, but I expect some copies will be required. Yes, for 1-dimensional arrays the order doesn't matter, and such an array is its own transpose.<br>
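<br>
To make the copy-versus-share question concrete, here is a minimal sketch of one possible representation, not the collection's actual code. The view struct and every function name below are made up for illustration. A 2-D array is a flat data vector plus an offset and one stride per axis, so a transpose or a strided (even reversed) row selection only changes the bookkeeping:<br>

```racket
#lang racket/base
;; Hypothetical sketch: none of these names come from the array collection.
;; A 2-D view = flat data vector + offset + shape + one stride per axis.
(struct view (data offset rows cols row-stride col-stride))

;; A fresh row-major view over a flat vector.
(define (make-row-major data rows cols)
  (view data 0 rows cols cols 1))

;; Element (i, j) lives at offset + i*row-stride + j*col-stride.
(define (view-ref v i j)
  (vector-ref (view-data v)
              (+ (view-offset v)
                 (* i (view-row-stride v))
                 (* j (view-col-stride v)))))

;; Transposing swaps the two axes; no data is copied.
(define (view-transpose v)
  (view (view-data v) (view-offset v)
        (view-cols v) (view-rows v)
        (view-col-stride v) (view-row-stride v)))

;; Select `count` rows: start, start+step, ...; step may be negative.
(define (view-take-rows v start count step)
  (view (view-data v)
        (+ (view-offset v) (* start (view-row-stride v)))
        count (view-cols v)
        (* step (view-row-stride v)) (view-col-stride v)))

(define a (make-row-major (build-vector 12 values) 3 4)) ; 3x4, elements 0..11
(define t (view-transpose a))                            ; 4x3, shares a's data
(define r (view-take-rows a 2 2 -2))                     ; rows 2 and 0 of a
```

Here r ends up with a row stride of -8, so it is neither row- nor column-major in the contiguous sense, yet it still shares the data vector with a. An operation that needs contiguous storage (handing the data to C or Fortran, say) would be one place where a copy is forced.<br>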
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
One thing that always frustrated me about NumPy is that you don't<br>
really know whether you've gotten a copy or a view for many<br>
operations. Sometimes it is safe to modify the copy, but other times<br>
you're trashing the original data in a view, and the same piece of<br>
code can do either based on runtime arguments to the function you're<br>
calling. I believe one correct thing to do is to implement copy-on-<br>
write for any shape changing operation, but that's trickier. Treating<br>
arrays as immutable is attractive (especially in a functional<br>
programming language), and an efficient implementation for that is<br>
probably even more complicated. Noel Welsh mentioned the SAC<br>
language, and it looks like they would be a good place from which to<br>
get ideas.</blockquote><div><br>I am leaning toward immutability, so it shouldn't matter if it is a copy or not. <br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Another frustrating thing in NumPy is the topic of zero-dimensional<br>
arrays (scalars) and how they integrate with the rest of the language<br>
and libraries. Does subscripting an N-dimensional array by N<br>
subscripts give you a Scheme numeric value, or a zero dimensional<br>
array? If it's a standard Scheme value, can you broadcast (or "new<br>
axis") a Scheme scalar value back to an array? Will it change its<br>
type when you do this? Can you subscript with zero subscripts?</blockquote><div><br>My current implementation has a schizophrenic array-ref: if a reference is fully qualified (i.e., values are specified for all the dimensions), it returns a Scheme scalar value; otherwise it returns a new array object for the referenced subarray (slice). The subarray shares the parent array's data if possible; otherwise a new data vector is made for the subarray. I will use broadcasting, so a scalar can look like an array of any shape. <br>
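<br>
As a rough illustration of that behavior (a sketch under my own assumptions, not the current implementation; the arr struct and arr-ref name are hypothetical stand-ins), indexing can consume one stride per subscript and then decide what to return based on whether any axes are left:<br>

```racket
#lang racket/base
;; Hypothetical sketch, not the collection's actual array-ref.
;; An array = flat data vector + offset + shape list + strides list.
(struct arr (data offset shape strides))

;; Row-major strides for a shape, e.g. '(2 3 4) -> '(12 4 1).
(define (row-major-strides shape)
  (if (null? shape)
      '()
      (let ([tail (row-major-strides (cdr shape))])
        (cons (if (null? tail) 1 (* (cadr shape) (car tail)))
              tail))))

(define (make-arr data shape)
  (arr data 0 shape (row-major-strides shape)))

;; Index with a list of integers, one per leading axis.
(define (arr-ref a indices)
  (let loop ([off (arr-offset a)]
             [shape (arr-shape a)]
             [strides (arr-strides a)]
             [idx indices])
    (cond
      ;; consume one subscript and one axis
      [(pair? idx)
       (loop (+ off (* (car idx) (car strides)))
             (cdr shape) (cdr strides) (cdr idx))]
      ;; every axis consumed: return the Scheme value itself
      [(null? shape) (vector-ref (arr-data a) off)]
      ;; axes left over: return a subarray sharing the data vector
      [else (arr (arr-data a) off shape strides)])))

(define a (make-arr (build-vector 24 values) '(2 3 4)))
(define sub (arr-ref a '(1)))   ; 3x4 slice that shares a's data
```

In this sketch, zero subscripts on an array of rank one or more just give back a view of the whole array, and only a zero-dimensional array (empty shape) would yield a scalar that way. Broadcasting is left out here.<br>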
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
It looks like this is also covered in the SAC language, and they seem<br>
to have a really nice way to do things, but to completely fit that way<br>
into Scheme might require that Scheme think of its numeric values as<br>
zero dimensional arrays. A special PLT language could integrate<br>
arrays into the numeric tower, but that's probably not what you had in<br>
mind for your library. There is an unfortunate compromise to make in<br>
there somewhere.</blockquote><div><br>Since this is a library, I can't control the Scheme side of things. [But I've found the PLT Scheme implementors like to do the 'right thing' and they do listen.] So, compromises at the boundary of the array collection and Scheme will be required.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
<br>
<br>
Similarly, when you do an element-wise multiplication of two arrays,<br>
you'll probably promote to a common super type (multiplying "signed 16<br>
bit" by "32 bit floating point" should probably give you "32 bit<br>
floating point"). What should you promote to when you multiply an<br>
array by a Scheme scalar floating point or integer value? Will you<br>
support in-place multiplication, and will it give you the same type?</blockquote><div><br>One of the compromises I am making is that all computations will be done in native Scheme code using the Scheme numeric tower. I'll only coerce things at the array storage level. Indeed, the default array type stores Scheme objects, and for it the array library won't do any coercions at all. The operations don't even have to be numeric.<br>
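<br>
A small sketch of what "coerce only at the storage level" could look like. The type symbols follow the list quoted earlier in the thread, but coerce-for-storage and scale are hypothetical names, and a plain vector stands in for typed storage:<br>

```racket
#lang racket/base
;; Hypothetical sketch: arithmetic uses ordinary Scheme numbers; a value is
;; coerced (or rejected) only when it is written into typed storage.
(define (coerce-for-storage type x)
  (case type
    [(object) x]                          ; no coercion at all
    [(f64) (exact->inexact x)]            ; store as a flonum
    [(s16) (if (and (exact-integer? x) (<= -32768 x 32767))
               x
               (error 'coerce-for-storage "not an s16 value: ~e" x))]
    [else (error 'coerce-for-storage "unknown type: ~e" type)]))

;; Multiply every element by k using generic `*`, then store the results
;; in a fresh vector whose elements are coerced for `type`.
(define (scale type vec k)
  (for/vector #:length (vector-length vec) ([x (in-vector vec)])
    (coerce-for-storage type (* x k))))

(define as-objects (scale 'object (vector 1 2 3) 1/2)) ; exact rationals kept
(define as-f64     (scale 'f64    (vector 1 2 3) 1/2)) ; flonums
```

Whether an out-of-range s16 result should raise an error, wrap, or saturate is a policy question this sketch answers arbitrarily (it raises an error).<br>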
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
<br>
There are lots of questions like that, and different people will give<br>
you different answers about the "right" way to do things... I have my<br>
own opinions, but they aren't more authoritative than any other.</blockquote><div><br>Let's try shared authority and collaboration at this point.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
><br><div class="im">
> The types are object, u8, u16, u32, u64, s8, s16, s32, s64, f32, f64, c64, and c128.<br>
><br>
<br>
</div>In addition to the types you list above, I would personally benefit<br>
from complex integer formats. We use NumPy at work, and it is very<br>
inconvenient that we have to declare an N-by-2 array (translated to<br>
your syntax):<br>
<br>
(make-array (list N 2) #:type s16)<br>
<br>
When what we really want is a one dimensional array of N complex 16<br>
bit integers:<br>
<br>
(make-array (list N) #:type cs16)</blockquote><div><br>Most of the type names above (and certainly the style of them) are from the SRFI 4 library. One of my next questions to the list was whether to keep this or use the C-vector style. I like this one for conciseness. I also like your suggestion below for cf32 and cf64 instead of c64 and c128. It does allow extending to things like cs32, for example.<br>
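<br>
For what it's worth, one way a cs16 element type might be laid out is as interleaved real/imaginary pairs in a flat buffer, which is the same memory as the N-by-2 array but presented as N complex values. This is only a sketch: the function names are invented, a plain vector stands in for s16 storage, and no range checking is done:<br>

```racket
#lang racket/base
;; Hypothetical cs16 storage: N complex values as 2N interleaved integers
;; (real part at index 2i, imaginary part at index 2i+1).
(define (make-cs16 n) (make-vector (* 2 n) 0))

;; Read element i back as an exact Scheme complex number.
(define (cs16-ref v i)
  (make-rectangular (vector-ref v (* 2 i))
                    (vector-ref v (+ 1 (* 2 i)))))

;; Split z into its parts when storing element i.
(define (cs16-set! v i z)
  (vector-set! v (* 2 i) (real-part z))
  (vector-set! v (+ 1 (* 2 i)) (imag-part z)))

(define v (make-cs16 3))
(cs16-set! v 1 3-4i)
```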
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
If your library takes off, my guess is that someone will ask you for<br>
quaternions and arrays of structures (RGBA pixels for instance)<br>
too. :-)</blockquote><div><br>I was already planning on quaternions (since I do a lot of simulation work). Of course you can have arrays of anything you want using object. We should make it extensible by exposing the vtype abstraction in a useful manner.<br>
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
(As a silly style suggestion, I think you should call your complex<br>
floating point types cf32 and cf64 instead of c64 and c128.)</blockquote><div><br>I rather like this and will do that now.<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I think other items will eventually crop up too... For instance,<br>
you'll probably want to be able to read data off disk or serialize it<br>
across a network. In this case, you'll need to keep track of the<br>
endianness. (Thankfully VAX floating point is mostly dead. :-)</blockquote><div><br>I'll probably extend the packed-binary package (<a href="http://planet.plt-scheme.org/display.ss?package=packed-binary.plt&owner=williams">http://planet.plt-scheme.org/display.ss?package=packed-binary.plt&owner=williams</a>) for that. <br>
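<br>
Independent of packed-binary (whose API is not shown here), the byte-order issue itself can be sketched with the built-in integer->integer-bytes and integer-bytes->integer, which take an explicit big-endian? flag. The two helper names are made up for this example:<br>

```racket
#lang racket/base
;; Serialize a list of signed 16-bit integers with an explicit byte order.
(define (s16s->bytes xs big-endian?)
  (apply bytes-append
         (for/list ([x (in-list xs)])
           (integer->integer-bytes x 2 #t big-endian?))))

;; Read them back; the caller must know which byte order was used.
(define (bytes->s16s bstr big-endian?)
  (for/list ([i (in-range 0 (bytes-length bstr) 2)])
    (integer-bytes->integer bstr #t big-endian? i (+ i 2))))

(define big    (s16s->bytes '(1 -2) #t)) ; bytes 0 1 255 254
(define little (s16s->bytes '(1 -2) #f)) ; bytes 1 0 254 255
```

Reading data with the wrong flag silently gives wrong numbers (the big-endian bytes for 1 read back as 256 in little-endian order), which is why the byte order probably needs to travel with the stored array.<br>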
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
This thread has already started discussing how the memory for typed<br>
arrays could be handled specially with respect to the garbage<br>
collector. Someone will want to process huge arrays with memory<br>
mapped files (larger than can possibly fit in RAM). I think you'll<br>
find that eventually you want to implement your arrays on top of a<br>
generic "buffer of bytes" type like R6RS bytevectors.</blockquote><div><br>But, that is one for the future, I think. <br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I sincerely hope that none of this discourages you! I just wanted to<br>
point out a few details I think you'll run into sooner or later.<br>
Maybe that will spare you some of the backwards compatibility baggage<br>
that other array libraries are stuck with.</blockquote><div><br>I was aware of most of them, but it helps having someone to keep me honest. <br></div></div><br>