[plt-scheme] concurrency bug in tcp-connect

From: Anton van Straaten (anton at appsolutions.com)
Date: Thu Jun 5 13:38:53 EDT 2003

I came across a bug in MzScheme's tcp-connect, on Windows, when used
concurrently from multiple threads.  The problem is that lookups of
hostnames interfere with each other, so that some threads end up connecting
to the wrong host, i.e. to a host specified in a different thread.

Here's a little test program that demonstrates the problem:

(define thrs
  (list
    (thread (lambda ()
              (tcp-connect "www.plt-scheme.org" 80)))
    (thread (lambda ()
              (tcp-connect "schematics.sourceforge.net" 80)))
    (thread (lambda ()
              (tcp-connect "lambda.weblogs.com" 80)))))

Using netstat after running this will most likely show three connections to
www.plt-scheme.org (or its IP address, 155.98.63.210), and no connection to
either of the other two hosts.  (Note: as it happens, I'm running these
tests on a dual-CPU box, but I don't think that would affect this particular
issue - see below.)

A bit of investigation suggests that the problem is in the function
MZ_GETHOSTBYNAME in mzscheme/src/network.c.  There's a static buffer,
ghbn_hostname, which each caller writes to before acquiring ghbn_lock, so
one thread can overwrite the hostname that another thread has already
stored.  Moving the copy down into the synchronized section, after
"ghbn_lock = 1", fixes the problem, and has worked so far in my tests.
I've appended a diff.
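
To illustrate the ordering issue, here is a minimal single-threaded model
in C.  It is only a sketch under stated assumptions: shared_hostname,
lock_held and all of the helper functions are hypothetical stand-ins, not
the actual code in network.c, and the "interleaving" of two callers is
written out by hand rather than produced by real threads.

```c
#include <assert.h>
#include <string.h>

#define HOSTNAME_MAX 256

/* Hypothetical stand-ins for the shared static buffer and its lock flag. */
static char shared_hostname[HOSTNAME_MAX];
static int lock_held = 0;

/* Copy a name into the shared buffer; returns -1 if it does not fit. */
static int stage_name(const char *name) {
  if (strlen(name) >= HOSTNAME_MAX)
    return -1;
  strcpy(shared_hostname, name);
  return 0;
}

static void acquire_lock(void) { assert(!lock_held); lock_held = 1; }
static void release_lock(void) { assert(lock_held); lock_held = 0; }

/* What the lookup worker would read once its caller holds the lock. */
static const char *name_seen_by_lookup(void) {
  assert(lock_held);
  return shared_hostname;
}

/* Copy-before-lock ordering: both callers stage their name before either
   one holds the lock.  Returns 1 if caller A's lookup sees A's name. */
int unsynchronized_order_is_correct(const char *a, const char *b) {
  int ok;
  stage_name(a);    /* caller A copies, then waits for the lock */
  stage_name(b);    /* caller B runs meanwhile and overwrites the buffer */
  acquire_lock();   /* caller A finally gets the lock */
  ok = strcmp(name_seen_by_lookup(), a) == 0;
  release_lock();
  return ok;
}

/* Copy-after-lock ordering: each caller stages its name only while it
   holds the lock.  Returns 1 if both lookups see their own name. */
int synchronized_order_is_correct(const char *a, const char *b) {
  int ok;
  acquire_lock();
  stage_name(a);
  ok = strcmp(name_seen_by_lookup(), a) == 0;
  release_lock();   /* released whether or not the copy succeeded */

  acquire_lock();
  stage_name(b);
  ok = ok && strcmp(name_seen_by_lookup(), b) == 0;
  release_lock();
  return ok;
}
```

With two different names, the first function reports a mismatch and the
second does not.  One thing to watch when the length check sits after the
lock is taken: an early return for an overlong name would presumably need
to release the lock first, otherwise later lookups could block.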

--Anton

Index: network.c
===================================================================
RCS file: /cvs/plt/src/mzscheme/src/network.c,v
retrieving revision 1.98
diff -c -r1.98 network.c
*** network.c	1 Jun 2003 21:51:41 -0000	1.98
--- network.c	5 Jun 2003 17:31:17 -0000
***************
*** 550,566 ****
    long th;
    DWORD id;

-   if (strlen(name) < 256)
-     strcpy(ghbn_hostname, name);
-   else
-     return NULL;
-
    rec = MALLOC_ONE_ATOMIC(GHBN_Rec);
    rec->done = 0;

    scheme_block_until(ghbn_lock_avail, NULL, NULL, 0);

    ghbn_lock = 1;

    th = _beginthreadex(NULL, 5000,
  		      (MZ_LPTHREAD_START_ROUTINE)gethostbyname_in_thread,
--- 550,566 ----
    long th;
    DWORD id;

    rec = MALLOC_ONE_ATOMIC(GHBN_Rec);
    rec->done = 0;

    scheme_block_until(ghbn_lock_avail, NULL, NULL, 0);

    ghbn_lock = 1;
+
+   if (strlen(name) < 256)
+     strcpy(ghbn_hostname, name);
+   else
+     return NULL;

    th = _beginthreadex(NULL, 5000,
  		      (MZ_LPTHREAD_START_ROUTINE)gethostbyname_in_thread,
Posted on the users mailing list.