[plt-scheme] v299: process* file args, paths, strings, and byte-strings
At Tue, 19 Oct 2004 21:55:27 -0400, John Clements wrote:
> IIUC, MzScheme should 'know' when it gets a path that can't be
> expressed in the locale's encoding, right? So it should be possible to
> have a 'path->string/conservative' that signals an error if the path
> can't be expressed as a string?
Here's an implementation:

  (define (path->string/conservative p)
    (bytes->string/locale (path->bytes p)))
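
A minimal sketch of how a caller might use this, assuming that `bytes->string/locale' raises an exception matching `exn:fail?' when the bytes cannot be decoded in the current locale. The helper name `path->string/or-false' is hypothetical, and the result for the second call depends on the locale in effect.

```scheme
;; Definition from the message, repeated so the sketch is self-contained.
(define (path->string/conservative p)
  (bytes->string/locale (path->bytes p)))

;; Hypothetical helper: return the string form of a path, or #f when the
;; path's bytes cannot be decoded in the current locale.
;; Assumption: the decoding failure is an exception matching exn:fail?.
(define (path->string/or-false p)
  (with-handlers ([exn:fail? (lambda (e) #f)])
    (path->string/conservative p)))

;; Plain ASCII bytes decode in ASCII-compatible locales.
(path->string/or-false (bytes->path #"plain.txt"))

;; Byte \377 is never valid UTF-8, so under a UTF-8 locale this should
;; produce #f; under a single-byte locale it may decode instead.
(path->string/or-false (bytes->path #"bad\377.txt"))
```

Returning #f rather than letting the exception escape is only one choice; a caller that wants the error can use `path->string/conservative' directly.
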
> I admit that this is a peculiar corner case, but it _would_ seem
> possible to have two independent file-system entities (A & B, say)
> whose paths mapped to the same string, a string that is interpreted by
> a string-expecting system call to refer to B. So references to A would
> wind up (in system call) affecting B instead. Is my reading correct?
This is not a problem for system calls under Unix or OS X, because
those system calls take bytes, so a path's bytes are passed through
unchanged.
It's also not a problem for Windows system calls, which take strings.
You can construct a bad path using `bytes->path' where the bytes are
not a valid encoding, but that's where a new MzScheme hack takes over.
When MzScheme converts a path to a UTF-16 string for a Windows system
call, invalid encoding bytes are converted to "\t" --- which is not
allowed in a Windows path, so the file definitely won't exist. (The
UTF-16 is only for the system call, so "\t" doesn't show up in any
MzScheme error message.)
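
The substitution can be illustrated at the Scheme level. This is only a sketch, not MzScheme's internal conversion: it assumes the path's bytes are meant to be UTF-8, and it uses the optional error-character argument of `bytes->string/utf-8' to stand in for the conversion to a system-call string. The name `path->system-string/sketch' is hypothetical.

```scheme
;; Hypothetical helper, not MzScheme's internal code: decode a path's
;; bytes as UTF-8, substituting a tab for each byte that is not part of
;; a valid encoding.
(define (path->system-string/sketch p)
  (bytes->string/utf-8 (path->bytes p) #\tab))

;; Valid bytes come through unchanged.
(path->system-string/sketch (bytes->path #"ok.txt"))

;; The invalid byte \377 is replaced by a tab, which Windows does not
;; allow in a file name, so no existing file can match the result.
(path->system-string/sketch (bytes->path #"bad\377.txt"))
```
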
That's why byte strings are a reliable representation of paths within
MzScheme, and why I didn't follow Java's lead by representing paths as
strings. When you try to talk to another application through strings,
though, a lot can go wrong, and MzScheme does the best that I can figure
out.
Matthew