FWIW the <a href="http://software-carpentry.org">http://software-carpentry.org</a> team have done some excellent work on teaching scientists how to code. 'Data carpentry' seems to be a current topic for them, judging by their Twitter feed.<div>
<br></div><div><span></span><br><div><div><br>On Tuesday, 13 May 2014, Konrad Hinsen &lt;<a href="mailto:konrad.hinsen@fastmail.net">konrad.hinsen@fastmail.net</a>&gt; wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Matthias Felleisen writes:<br>
<br>
> > Note however that I didn't look at performance, which is not<br>
> > really important for most of what I do.<br>
><br>
> In hindsight that is obvious from your use of Python :-) It should<br>
> have clicked in me, but I am just so used to think "scientific<br>
> computation ~ simulations of nuclear bombs, aircraft wings, oil<br>
> platforms, and such" and that's when performance is the overriding<br>
> concern.<br>
<br>
That's a very common misconception. High-performance computing is the<br>
most visible part of scientific computing, but not what most<br>
scientists write code for. Mundane tasks such as file format<br>
conversion take much more of our time.<br>
<br>
That said, it's interesting to look at why Python became such a<br>
popular language in science. Python code rarely has great performance,<br>
but Python makes interfacing to C and Fortran code very easy. Most<br>
domain-specific scientific libraries for Python have a C library at<br>
their core. One reason for this simplicity of interfacing is Python's<br>
reference-counting approach to garbage collection. In many scientific<br>
applications, the bulk of the data is held in NumPy arrays, each of which is<br>
just a C/Fortran array plus some bookkeeping information for Python.<br>
Both sides can work on the data, with no copying and no access<br>
restrictions due to garbage collection.<br>
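<br>
A minimal sketch of the zero-copy sharing described above, not taken from<br>
any particular library. It assumes only the Python standard library:<br>
array.array stands in for a NumPy array, a ctypes view stands in for the<br>
pointer a C routine would receive, and scale_in_place is a hypothetical<br>
placeholder for such a routine.<br>
<br>
```python
import array
import ctypes

# Stand-in for a NumPy array: one contiguous block of C doubles
# owned by a Python object (stdlib only, an assumption of this sketch).
data = array.array("d", [1.0, 2.0, 3.0, 4.0])

# View the same memory as a C double[4]. from_buffer shares the
# buffer instead of copying it.
c_view = (ctypes.c_double * len(data)).from_buffer(data)

# Both objects point at the same address.
address, _length = data.buffer_info()
assert ctypes.addressof(c_view) == address


def scale_in_place(c_array, n, factor):
    """Hypothetical placeholder for a C routine taking (double *, n)."""
    for i in range(n):
        c_array[i] *= factor


# The "C side" modifies the data; the Python side sees the change.
scale_in_place(c_view, len(data), 2.0)
print(data.tolist())
```
<br>
After the call, data holds the doubled values even though only the ctypes<br>
view was written to, because both names refer to the same block of memory.<br>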
<br>
The price we pay for this is, of course, the dangers of C and its<br>
explicit memory management. I'd really love to get away from this (and<br>
I know I am not alone), but there is no GC-based language yet (as far<br>
as I know) that is convenient enough for scientific computing. The<br>
most frequent problem (also in Racket) is GC ruining performance for<br>
short-lived small data items (think of complex numbers or points in 3D<br>
space). Even if the GC overhead itself is small with a good<br>
generational GC, GC tends to prevent other code optimizations<br>
that are crucial for performance.<br>
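<br>
To make the "small data items" case concrete, here is a rough sketch that<br>
contrasts two layouts for points in 3D space. It assumes CPython and the<br>
standard library only; centroid_boxed and centroid_flat are hypothetical<br>
names. The first layout allocates one heap object per point, the second<br>
keeps all coordinates in a single contiguous block, which is the style<br>
that array-based code tends to fall back on.<br>
<br>
```python
import array


def centroid_boxed(points):
    """points: list of (x, y, z) tuples, one heap object per point."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    return (sx / n, sy / n, sz / n)


def centroid_flat(coords):
    """coords: array.array('d') laid out as x0, y0, z0, x1, y1, z1, ..."""
    n = len(coords) // 3
    return (
        sum(coords[0::3]) / n,
        sum(coords[1::3]) / n,
        sum(coords[2::3]) / n,
    )


# Same three points in both layouts.
boxed = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0), (4.0, 8.0, 12.0)]
flat = array.array("d", [c for p in boxed for c in p])

print(centroid_boxed(boxed))
print(centroid_flat(flat))
```
<br>
Both calls return the same centroid. The flat version gives the memory<br>
manager a single object to track, at the cost of losing the point as a<br>
first-class value in the code.<br>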
<br>
Konrad.<br>
____________________<br>
Racket Users list:<br>
<a href="http://lists.racket-lang.org/users" target="_blank">http://lists.racket-lang.org/users</a><br>
</blockquote></div></div></div>