[racket-dev] atomic file update by write & rename... not!

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Thu Jan 13 16:13:14 EST 2011

A common trick to atomically update a file "x" on a Unix filesystem is
to write a file with a temporary name like "tmp-x-7687564" (making up a
fresh name each time) in the same directory as "x" and then use
`rename-file-or-directory' to move "tmp-x-7687564" to "x".
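
For concreteness, here is a minimal sketch of the trick, written in Python
rather than Racket; `os.replace' stands in for the rename call, and the
helper name `atomic_write' is made up for illustration. It shows only the
write-then-rename pattern itself, not a portable solution.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to a fresh temp file next to `path`, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory as the target, so the
    # rename never has to cross a filesystem boundary.
    fd, tmp = tempfile.mkstemp(
        prefix="tmp-" + os.path.basename(path) + "-", dir=directory
    )
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        # The step that is atomic on Unix; this is the call that can fail
        # on Windows while a reader has `path` open.
        os.replace(tmp, path)
    except BaseException:
        # If anything failed before the rename took effect, drop the temp file.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Calling `atomic_write("x", b"new contents")' leaves either the old "x" or
the new one in place on Unix, never a partially written file.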

Unfortunately, this trick does not work on Windows. Don't use it for
anything that needs to be portable.


Here's why the trick works for Unix, at least when there's a single
writer and multiple readers: If any process has "x" open while it is
being replaced, the process continues to read from the old file, even
though the file doesn't correspond to the file that is named "x" on the
filesystem. The original file continues to exist until it is closed by
all readers, while new readers of "x" get the new file. Furthermore,
Unix guarantees that the rename is atomic; there's no time between "x"
ceasing to refer to the old file and starting to refer to the new file.
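
The Unix behavior described above can be observed with a small experiment.
This is a sketch in Python, assuming a POSIX filesystem; the function name
`read_across_replace' is hypothetical. On Windows, the `os.replace' call
would be expected to fail instead, for the reasons given later in this
message.

```python
import os
import tempfile


def read_across_replace(directory):
    """Open "x", replace it by rename, then compare what old and new readers see."""
    x = os.path.join(directory, "x")
    tmp = os.path.join(directory, "tmp-x-7687564")
    with open(x, "w") as f:
        f.write("old contents")
    with open(tmp, "w") as f:
        f.write("new contents")

    reader = open(x)       # a reader that opened "x" before the update
    os.replace(tmp, x)     # the name "x" now refers to the new file
    with open(x) as f:     # a reader that opens "x" after the update
        new_seen = f.read()
    old_seen = reader.read()   # still reads the original file
    reader.close()
    return old_seen, new_seen
```

On Unix, `read_across_replace(tempfile.mkdtemp())' returns
("old contents", "new contents"): the early reader keeps the old file,
and the late reader gets the new one.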

There are two problems with this approach under Windows. First, file
rename is not guaranteed to be atomic (although it does seem to be
"atomic enough" in everything I've tried, for whatever that's worth).
Second, and more significantly in practice, while a file is opened for
reading, you cannot use its name for a new file. You can delete a file
while it's being read, but not only does the deleted file continue to
exist, the deleted file's *name* remains occupied until all uses of the
file are closed. So, if "x" is opened by any readers, you cannot rename
"tmp-x-7687564" to "x" --- not even if you use the flag on the Win32
call to say that it's ok to replace the target file.

Oddly enough, you're allowed to rename a Windows file while it is open.
You can even delete the renamed file and then use the old name for a
new file, even if the original file remains open by a reader who
accessed it by the original name. Unfortunately, that possibility
doesn't help to implement atomic update; if you follow a move of "x" to
"old-x-7687564" with a move of "tmp-x-7687564" to "x", then for a
significant period of time, there would be no "x" file.
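
To make the gap concrete, here is a sketch of the two-step move in Python,
with a hypothetical `observe' callback invoked between and after the two
renames. It assumes "x" and "tmp-x-7687564" already exist in the directory;
the function name is made up for illustration.

```python
import os
import tempfile


def two_step_replace(directory, observe):
    """Move "x" aside, then move the temp file into place, reporting whether "x" exists."""
    x = os.path.join(directory, "x")
    old = os.path.join(directory, "old-x-7687564")
    tmp = os.path.join(directory, "tmp-x-7687564")

    os.replace(x, old)            # step 1: the name "x" is now free
    observe(os.path.exists(x))    # the window in which no "x" file exists
    os.replace(tmp, x)            # step 2: the name "x" refers to the new file
    observe(os.path.exists(x))
```

Passing `seen.append' as `observe' (with `seen' an empty list) records
False and then True: any reader that tries to open "x" between the two
steps finds nothing there.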

(BTW, I'm not clear on which constraints and behaviors are technically
"Windows" and which are technically "NTFS", but it hardly matters.)


The solution? I haven't been able to find a solution that lets you just
replace `rename-file-or-directory' with something else. The relatively
new support for transactions in NTFS doesn't help unless all I/O is
converted to use transactions, and transaction failures must be handled
explicitly; it's way too painful. The only solutions I can find involve
restructuring programs to add some new kind of synchronization.

To support a parallel build, for example, Kevin had to change `raco
setup' so that multiple processes do not try to compile the same file.
(As it turns out, there were reasons to make the change independent of
problems with atomic update, but the atomic-update problem finally
forced us to make the change.)

We haven't yet fixed the preferences file. I'll post a separate message
on that one.

Is there anything else in the main distribution that uses
`rename-file-or-directory' for atomic update?


Posted on the dev mailing list.