[plt-scheme] Re: A segmentation fault on OS X Intel
At Sun, 13 Jan 2008 21:11:59 +0000, "Noel Welsh" wrote:
> On Jan 13, 2008 7:32 PM, Noel Welsh <noelwelsh at gmail.com> wrote:
> > I don't know enough about the internals to know whether i) the BLAS
> > routines are indeed using SSE (though they really should) and ii) if
> > I'm looking at the right ABI (perhaps it should be the 64-bit one?)
> > I'll do a bit more digging.
>
> I can't find any mention that the BLAS routines use SSE (or any other
> SIMD extensions). In fact the header file suggests they don't.
>
> The 64-bit ABI requires 8-byte alignment, but I cannot find any
> mention that 8-byte alignment is required for 32-bit code. In fact
> the docs explicitly state this is not the case.
>
> So I currently have no idea why the BLAS code would fail without
> 8-byte alignment.

My guess is that it's a bug in the BLAS code, i.e., an assumption,
perhaps unintentional, that its arguments are aligned. Mac OS X's
malloc() always produces 16-byte-aligned results, so if it is a bug,
the problem wouldn't normally show up.
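
One way to test that guess is to check the alignment of a pointer just before it is handed to a BLAS routine. The following is only a minimal sketch: `is_aligned` is a hypothetical helper, not part of BLAS or of the MzScheme sources, and it assumes that converting a pointer to `uintptr_t` yields its address, which holds on the usual flat-memory platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if the address in p is a
   multiple of align. align must be a nonzero power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power of two, the low bits of the address are the
       remainder modulo align. */
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

If `is_aligned(v, 8)` is false for a vector `v` that crashes and true for the ones that work, that would support the alignment explanation. It would not by itself show where in the BLAS code the assumption is made.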

Meanwhile, I've changed the 3m GC's default to provide 8-byte alignment
on all platforms. It doesn't seem to cost much (1-2% of DrScheme's
initial footprint), and it seems healthier to have aligned `double's.

I considered setting the GC to 16-byte alignment on those platforms
where malloc() guarantees it, which includes Windows and Mac OS X. But
16-byte alignment increases DrScheme's initial footprint by almost 10%,
which seems too expensive.

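
To illustrate why a stricter alignment costs more memory, here is a rough sketch of padding each object's size up to the alignment. It is not the 3m GC's actual allocation code, and it assumes that padding is where the extra footprint comes from. `align_up` and `padded_total` are hypothetical names, and the sizes in the note below are made up rather than measured from DrScheme.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds size up to the next multiple of align.
   align must be a nonzero power of two. */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Total bytes used by n objects when each one is padded so that
   the next object starts on an align-byte boundary. */
static size_t padded_total(const size_t *sizes, size_t n, size_t align)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += align_up(sizes[i], align);
    return total;
}
```

For four objects of 4, 12, 20 and 24 bytes, the total is 60 bytes with 4-byte alignment, 72 with 8-byte alignment and 96 with 16-byte alignment. The relative cost depends on how many small objects the heap holds, so these figures say nothing about the 1-2% and 10% measured above.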
Matthew