[plt-scheme] FFI and pointer manipulations

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sun Feb 11 00:06:56 EST 2007

At Sat, 10 Feb 2007 23:49:11 -0500, Eli Barzilay wrote:
> On Feb 11, Jens Axel Søgaard wrote:
> > Eli Barzilay wrote:
> > 
> > >On Feb  9, Jens Axel Søgaard wrote:
> > >  
> > >It's possible to do arbitrary conversions using the foreign interface,
> > >just like you do in C -- except that you have to put something in a
> > >malloced space, and reference it as something else.  No "typecast"
> > >function yet.  (Partly because foreign types are not completely first
> > >class, which is yet to be hacked.)  But:
> > >
> > I'm not sure I follow. Here is an example of what I want to do:
> > 
> > (let ([bs (bytes 0 1 2 3 4 5)])
> >   (md5 <cpointer-to-bs+2> 3))
> > 
> > should calculate the md5 of the 3 bytes 2 3 4.
> > 
> > I can't figure out what to write instead of <cpointer-to-bs+2>.
> 
> Here's an example of what you want -- I have this function in x.so:
> 
>   void foo(char *a) { printf("received %d: \"%s\"\n", (int)a, a); }
> 
> and I do this interaction in MzScheme which should make it clear how
> to play with pointers:
> 
>   > (define p (get-ffi-obj "foo" "~/tmp/x.so" (_fun _pointer -> _void)))
>   > (define buf #"0123456789")
>   > (p buf)
>   received 180307632: "0123456789"
>   > (define tmp (malloc (max (ctype-sizeof _int) (ctype-sizeof _pointer))))
>   > (ptr-set! tmp _pointer buf)
>   > (ptr-ref tmp _int)
>   180307632
>   > (ptr-set! tmp _int (+ 3 (ptr-ref tmp _int)))
>   > (ptr-ref tmp _int)
>   180307635
>   > (ptr-ref tmp _bytes)
>   #"3456789"
>   > (p (ptr-ref tmp _pointer))
>   received 180307635: "3456789"

To be clear, the above can cause a crash. A GC can happen after a
pointer to the byte string bound to `buf' is installed into `tmp', in
which case that byte string is likely to move while `tmp' still holds
the old address.

At best, `(ptr-ref tmp _bytes)' prints garbage. At worst, the location
formerly occupied by the string has become unmapped, so that a
`(ptr-ref tmp _bytes)' attempts to access unmapped memory and crashes.

(The pointer arithmetic is irrelevant in this case. Things can go bad
merely because the GC doesn't traverse the memory allocated for `tmp'.
If you fix that problem somehow, then the pointer arithmetic causes
problems.)
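
The failure mode can be imitated in plain C. This is only a rough
sketch, not how the collector is implemented: two static arrays stand
in for the old and new locations of a moved object, and the names
`from_space', `to_space', `simulate_move' and
`stale_address_sees_garbage' are made up for the illustration. The
point is that an address stashed as a plain integer is not updated
when the object is relocated.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the memory an object occupies before and
   after a moving collection. */
static char from_space[16];
static char to_space[16];

/* Imitate a move: copy the object to its new home, then overwrite the
   old location as if that memory had been reused. Returns the new
   address. The final byte (the NUL terminator) is left in place so the
   old location can still be read as a C string in this demo. */
char *simulate_move(char *obj, size_t size) {
    assert(size >= 1 && size <= sizeof to_space);
    memcpy(to_space, obj, size);
    memset(obj, '?', size - 1);
    return to_space;
}

/* Returns 1 when the moved object is intact at its new address while
   the address saved as an integer now refers to garbage. */
int stale_address_sees_garbage(void) {
    strcpy(from_space, "0123456789");
    uintptr_t saved = (uintptr_t)from_space;  /* like storing the pointer and reading it back as an integer */
    char *moved = simulate_move(from_space, 11);
    return strcmp(moved, "0123456789") == 0
        && strcmp((const char *)saved, "0123456789") != 0;
}
```

In the real system the old location may also be unmapped rather than
merely overwritten, which is what turns garbage into a crash.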

> The fact that C doesn't store the pointer is not too relevant -- you
> have to be very careful for everything that is done in Scheme code,
> because as long as you're there a GC can happen -- and in 3m this
> means that pointers are likely to change.

Right. The bottom line is that you cannot pass a C function a pointer
into the middle of a GCable object right now. Even if the foreign
function performs no memory operations, the process of calling a
foreign function does allocate. So, the offset would have to be applied
at the last minute before the foreign function is called, which means
that the offset operation has to be built into foreign-function
application.
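
Until something like that is built in, one possible workaround is to
apply the offset on the C side instead. The following is a minimal
sketch under assumptions, not part of the foreign interface:
`sum_bytes' is a made-up stand-in for the md5 function in the original
question, and `sum_bytes_at' is a hypothetical shim that you would
compile into your own shared library. Scheme passes the start of the
byte string plus an integer offset, and the interior pointer is formed
only after the call has entered C. This assumes the collector does not
run while the foreign function executes, for example because the
function does not call back into Scheme.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for md5: just sums the bytes it is given. */
static unsigned long sum_bytes(const unsigned char *data, size_t len) {
    unsigned long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += data[i];
    }
    return total;
}

/* Shim: takes a pointer to the START of the buffer and an offset, and
   computes base + offset here, on the C side, so no interior pointer
   ever exists on the Scheme side. */
unsigned long sum_bytes_at(const unsigned char *base, size_t offset, size_t len) {
    assert(base != NULL);
    return sum_bytes(base + offset, len);
}
```

For the bytes 0 1 2 3 4 5 from the original question, calling
`sum_bytes_at' with offset 2 and length 3 processes the bytes 2 3 4.
The shim does no bounds checking, so the caller has to make sure that
offset plus length stays within the buffer.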

> Moreover, according to what Matthew told me (which might not be true
> now), it can even make the GC crash because something it thinks is a
> pointer to a block of memory is actually pointing to the middle of a
> real object.

This is still true, and it's unlikely to change.

Matthew



Posted on the users mailing list.