[racket-dev] Symlink trouble

From: Tobias Hammer (tobias.hammer at dlr.de)
Date: Wed Apr 17 06:06:29 EDT 2013

Hi,

I am currently implementing an application that relies heavily on Racket's
great serialize functionality to exchange data between Racket processes on
different computers. That worked well until I stumbled over a very
confusing behavior of Racket's filesystem and module path resolution.

I will first explain what I observed and then why it causes some
trouble:
* relative (module) paths are resolved with something like
  (or (current-load-directory) (current-directory))
* collection paths are resolved with
  (find-executable-path (find-system-path 'exec-file)
                        (find-system-path 'collects-dir))
  for the system collection and with the given path for the others
* you can require a module both by relative path and via a collection; if
  both resolve to the same name, there is no error

serialize stores the module path and symbol where the deserialize function
can be found. It's interesting how this module path is determined:
* If the file containing the deserialize identifier (if implemented by
hand, or the file where e.g. serializable-struct is used) is loaded via a
collection, then the serialized stream contains a collection path
(determined via identifier binding and mpi magic)
* If this file is loaded by relative path, the fallback method with
current-(load)-directory is used

Nothing special so far, but the fun starts with how current-directory is  
initialized. It uses (on *nix systems) getcwd() but this function returns  
the path with all symbolic links resolved (getcwd is only a thin  
OS-wrapper, and the OS provides nothing else).
This little detail can easily break the serialization framework (and maybe  
other things too).
The scenario is a file whose path contains a symlink and that is also
reachable through the current collections, e.g.
/abc/symlink/more/def/file.rkt
and PLTCOLLECTS="/abc/symlink/more:"
and file.rkt contains a serializable-struct definition.

Now one Racket process loads "file.rkt" by relative path, serializes a
struct instance and sends it to another Racket process. The other process
loads def/file via the collection and deserializes the struct. The receiver
now has a struct that is of a different type and that it can't access.
This fails because the serialized data contains the absolute symlink-free
path, which differs from the path the receiver used to load file.rkt
(because symlinks are not resolved for collection dirs).

The same happens of course when the data is sent to another computer that
has a symlink in the path to file.rkt, even if both load it the same way.
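
The getcwd behavior described above is not specific to Racket, so it can be
reproduced in any language that wraps the OS call. The following is only an
illustrative sketch in Python (not Racket code); the directory names "real"
and "symlink" and the function name are made up for the demonstration. It
enters a directory through a symlink and shows that the path reported by
getcwd is not the path that was entered.

```python
import os
import tempfile


def show_getcwd_resolves_symlinks():
    """Enter a directory through a symlink and report what getcwd returns.

    Returns (path_as_entered, path_from_getcwd).
    """
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as base:
        # Canonicalize the temp dir so that only our own symlink matters.
        base = os.path.realpath(base)
        real_dir = os.path.join(base, "real")
        link_dir = os.path.join(base, "symlink")
        os.mkdir(real_dir)
        os.symlink(real_dir, link_dir)
        try:
            os.chdir(link_dir)      # the user "sees" .../symlink
            seen = os.getcwd()      # the OS reports .../real
        finally:
            os.chdir(old_cwd)       # leave before the temp dir is removed
    return link_dir, seen


if __name__ == "__main__":
    entered, seen = show_getcwd_resolves_symlinks()
    print("entered:", entered)
    print("getcwd: ", seen)
```

Two strings that name the same directory but differ textually are exactly
what makes the sender's and receiver's module paths disagree.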

The confusing thing is that from the user's point of view everything is
consistent. The working directory and the collections all point to the
same location.

It is clear that this behavior is by no means limited to Racket, as nearly
all programming languages use getcwd internally. A quick Google search for
getcwd and symlinks gives a lot of results...

I came up with a few solutions but I would like to get some feedback on
them. They all more or less rely on the shell keeping track of the 'real'
(better: visible) working directory. Most *nix shells set 'PWD' in the
environment, but this is not guaranteed and it can of course be altered by
the user.

- The quick and very dirty hack is to set the current-directory before any
user code is executed:
racket -e '(current-directory (or (getenv "PWD") (current-directory)))' program.rkt
Too ugly to really use it...

- A better fix would be to change how the current-directory parameter is
initialized during startup. It could be some heuristic that tries to
use the env variable if it is a complete and existing path and falls back
to getcwd otherwise. As far as I can tell this won't break anything,
because after this one time at startup the C side's cwd and Racket's
parameter are completely decoupled.
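
A minimal sketch of such a heuristic, written in Python purely for
illustration (it is not Racket's startup code, and the function name
initial_directory is hypothetical). It uses PWD only if it is an absolute,
existing path. As an extra safety check that goes beyond the proposal as
stated, it also requires PWD to name the same directory that getcwd
reports, so a stale or altered PWD is ignored.

```python
import os


def initial_directory(environ=None):
    """Pick the startup directory: prefer $PWD when it looks trustworthy,
    otherwise fall back to getcwd()."""
    environ = os.environ if environ is None else environ
    physical = os.getcwd()          # symlink-free path from the OS
    logical = environ.get("PWD")    # path as the shell tracked it, if any

    # PWD must be set and be a complete (absolute) path.
    if not logical or not os.path.isabs(logical):
        return physical

    try:
        # Assumption added for safety: only trust PWD if it refers to the
        # directory the process is really in.
        if os.path.samefile(logical, physical):
            return logical
    except OSError:
        pass                        # PWD does not exist or is inaccessible

    return physical


if __name__ == "__main__":
    print(initial_directory())
```

With this check a wrong PWD can never move the process to a different
directory; it can only change how the same directory is spelled.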

- A more conservative solution would be a command line argument to racket  
to set the initial value for current-directory. One could then populate it  
with env's PWD or from `pwd` or whatever suits.

I would appreciate any feedback on how I can work around this behavior
(except don't use symlinks ...) or on whether I missed something obvious.
If not, would either of the two real solutions be viable? They shouldn't be
too hard to implement; I could create a patch if one of them seems ok.

Tobias



-- 
---------------------------------------------------------
Tobias Hammer
DLR / Robotics and Mechatronics Center (RMC)
Muenchner Str. 20, D-82234 Wessling
Tel.: 08153/28-1487
Mail: tobias.hammer at dlr.de

Posted on the dev mailing list.