[plt-scheme] change to `scheme/foreign'

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Wed Nov 4 14:28:52 EST 2009

As of v4.2.2.6 (now in SVN), the `_pointer' type in the FFI has changed
to mean "pointer to an address that is *not* managed by the GC". A new
`_gcpointer' type takes over the old meaning of `_pointer', which was
"pointer to an address that may or may not be managed by the GC".

This change could break some existing uses of the FFI, but I think it's
likely to repair many more existing uses than it breaks.


At first glance, `_gcpointer' is more convenient than `_pointer'. The
GC can just look at the address in a `_gcpointer' wrapper and decide
whether the address is within one of the pages of memory that the GC
manages. Also, given our past reliance on conservative GC, defining
`_pointer' to allow references to GCable memory seemed like the obvious
choice.

The problem is that a memory page used by one allocator, such as
malloc()/free(), can sometimes be released by that allocator and picked
up by another allocator, such as the GC. For example, suppose that
`make-encrypted-in', `make-encrypted-out', and `close-encrypted' are
supplied by a foreign library:

 (define-struct connection (in out))

 ....
 (let ([c (make-connection (make-encrypted-in ....)
                           (make-encrypted-out ....))])
   ....
   (close-encrypted (connection-in c))
   (close-encrypted (connection-out c)))

It's likely that `make-encrypted-in' and `make-encrypted-out' allocate
their results with malloc() and `close-encrypted' releases its argument
with free(). In that case, on some platforms, the above program is
broken. It could happen that

   1. `(close-encrypted (connection-in c))' frees memory with free();

   2. free() releases the page that used to contain the address
      referenced by `(connection-in c)';

   3. before continuing with `(close-encrypted (connection-out c))',
      another Scheme thread swaps in;

   4. the other thread allocates enough that the GC looks for a new
      page of memory, and takes the one just released by free();

   5. the other thread continues to allocate and forces a GC; and

   6. the GC crashes, because the pointer in `(connection-in c)' still
      refers to the address of memory that was free()ed, and now that
      address points into the middle of a GC-managed block of memory.

The problem could be fixed with

 (let ([c (make-connection (make-encrypted-in ....)
                           (make-encrypted-out ....))])
   ....
   (let ([in (connection-in c)]
         [out (connection-out c)])
     (close-encrypted in)
     (close-encrypted out)))

This works because space safety ensures that `c' and `in' are no longer
referenced by the time the second `close-encrypted' call starts. But
this fine-grained level of reachability is obviously difficult to
reason about.

The new meaning of `_pointer' avoids the above problem. The GC ignores
the address stored in the Scheme representation of the pointer.


Meanwhile, it seems that `_gcpointer' is rarely needed. When getting a
pointer back from a foreign library, it almost never refers to GCable
memory, because the GC would not have been able to track the pointer
(and update it when data is moved by the GC) within the foreign
library. In those cases, then, plain `_pointer' works. For pointer
values going the other direction --- from Scheme to a foreign library
--- `_pointer' and `_gcpointer' are the same.
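
As a rough sketch of the common case, here is how a result from a
foreign allocator might be declared with plain `_pointer'. The binding
names `c-malloc' and `c-free' are made up for illustration, and the
sketch assumes a Unix-like platform where `(ffi-lib #f)' exposes the C
library and `_ulong' matches size_t:

```scheme
#lang scheme
(require scheme/foreign)
(unsafe!)

;; Assumption: (ffi-lib #f) gives access to libc's malloc() and free(),
;; and `_ulong' has the same size as size_t on this platform.
(define libc (ffi-lib #f))

;; The result is memory owned by malloc(), not by the GC, so plain
;; `_pointer' is the right result type.
(define c-malloc (get-ffi-obj "malloc" libc (_fun _ulong -> _pointer)))

;; For an argument going from Scheme to the foreign library, `_pointer'
;; and `_gcpointer' behave the same.
(define c-free (get-ffi-obj "free" libc (_fun _pointer -> _void)))

(define p (c-malloc 16))
(c-free p)
```

Even if `p' stays reachable after `(c-free p)', the GC never inspects
the stale address.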

There is an implicit use of `_gcpointer' when calling the `make-'
procedure bound by a `define-cstruct'. Since that's implicit, though,
no existing code needs to change to use it.
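
For example, in a sketch like the following (where `_point' is a
hypothetical struct type used only for illustration), nothing in the
source mentions `_gcpointer', and nothing needs to:

```scheme
#lang scheme
(require scheme/foreign)
(unsafe!)

;; `_point' is a made-up struct type with two int fields.
(define-cstruct _point ([x _int] [y _int]))

;; `make-point' allocates the struct in GC-managed memory, so its
;; result is implicitly handled like a `_gcpointer'.
(define pt (make-point 1 2))

(point-x pt) ; => 1
```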

Similarly, there's an implicit use of `_gcpointer' when allocating data
with `malloc' in modes other than 'raw. Again, the use is implicit, so
no conversion is necessary.

Another change to `malloc' is that the mode now defaults to 'atomic
when a type based on `_pointer' is provided. The 'nonatomic mode is
used only when a given type is based on `_gcpointer' or `_scheme'.
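
The following sketch shows how the `malloc' modes line up with the two
pointer types. The variable names are made up, and the comments about
default modes just restate the rules above rather than anything the
code checks:

```scheme
#lang scheme
(require scheme/foreign)
(unsafe!)

;; 'raw memory comes from the C allocator; the GC never manages it,
;; and it must be released explicitly with `free'.
(define raw (malloc 16 'raw))
(free raw)

;; No mode given, type based on `_pointer': the mode now defaults to
;; 'atomic, so the GC does not scan the stored addresses.
(define addrs (malloc _pointer 4))

;; No mode given, type based on `_gcpointer': the mode is 'nonatomic,
;; so the GC does trace the stored references.
(define refs (malloc _gcpointer 4))

;; A mode can still be given explicitly.
(define ints (malloc _int 4 'atomic))
(ptr-set! ints _int 0 42)
(ptr-ref ints _int 0) ; => 42
```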


For all of those reasons, I'm pretty sure that the change to `_pointer'
is a good idea, but let me know if it causes any trouble.



Posted on the users mailing list.