[plt-scheme] 352.8

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Fri Oct 20 10:52:58 EDT 2006

MzScheme and MrEd are now version 352.8 in the SVN repository trunk.

The changes are related to obscure corner cases for filesystem paths. I
offer the following new recommendations for the two cases where I
previously would have recommended `path->bytes' and `bytes->path':

 * Use `path->bytes' and `bytes->path' for path marshaling only. Don't
   manipulate bytes from `path->bytes' to adjust a path.

 * Use `path-element->bytes' and `bytes->path-element' for
   manipulating path content, along with `split-path' and `build-path'.

   To manipulate non-ASCII characters in path names, convert
   path-element bytes to a string using `bytes->string/locale', but
   beware that the conversion may lose information (due to limitations
   of locale-based encodings).
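
As a rough sketch of the second recommendation, the following renames
the last element of a path by going through `split-path',
`path-element->bytes', `bytes->path-element', and `build-path'. The
helper name `add-suffix' is made up for illustration, and the sketch
assumes that the last element is an ordinary name (not "." or "..").

```scheme
;; Hypothetical helper (not part of MzScheme): append `suffix' (a byte
;; string) to the final element of the path `p'.
(define (add-suffix p suffix)
  (let-values ([(base name must-be-dir?) (split-path p)])
    ;; `name' is 'up or 'same for ".." and "."; those have no bytes
    ;; to edit, so this sketch rejects them.
    (unless (path? name)
      (error "add-suffix: last element is not an ordinary name"))
    (let ([new-name (bytes->path-element
                     (bytes-append (path-element->bytes name) suffix))])
      ;; `base' is a path when there is a directory part; otherwise it
      ;; is 'relative (or #f for a root), and the element stands alone.
      (if (path? base)
          (build-path base new-name)
          new-name))))

;; Example use:
(add-suffix (build-path "a" "b.txt") #".bak")
```

The byte manipulation happens only on the single element, so any
encoding prefix that the full path needs is left to `build-path'.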

More details:

Under Unix, (split-path "a/~") now returns the path form of "./~" as
its second result, instead of just "~", since "~" is an absolute
path. To keep things symmetric, (build-path "a" "./~") produces the
path form of "a/~". In other words, "./" is used as a kind of encoding
prefix for relative paths that start with "~". Under Windows, there are
many, many more encoding examples, such as "\\\\?\\REL\\a " to
represent the name of a file that ends in a space (since trailing
spaces are normally ignored in Windows paths).

The problem with these encodings, if you stick with `path->bytes', is
that they get in the way of certain path manipulations. For example,
it's awkward to delete, from a directory, every file whose name is
three ASCII characters long. If you use `path->bytes' on the path that
represents the relative filename "~ab", you get #"./~ab", which looks
like it's five characters long instead of three.

The new `path-element->bytes' procedure produces just #"~ab" for the
path that represents the relative filename "~ab". Convert #"~ab" back
to a path using `bytes->path-element'. These path-element functions
always work in terms of literal relative path elements instead of path
encodings (though there is still a layer of locale-specific encoding of
characters in terms of bytes).
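
Here is a minimal sketch of the three-character example using the
path-element functions. The names `three-ascii-name?' and
`delete-three-ascii-files' are invented for this illustration, and the
sketch assumes that `dir' names an existing, readable directory.

```scheme
;; Hypothetical predicate: is `name' (a single path element) exactly
;; three bytes long, with every byte in the ASCII range?
(define (three-ascii-name? name)
  (let ([bs (path-element->bytes name)])
    (and (= 3 (bytes-length bs))
         (andmap (lambda (b) (< b 128)) (bytes->list bs)))))

;; Hypothetical helper: delete each plain file in `dir' whose name
;; satisfies `three-ascii-name?'. Directories are left alone.
(define (delete-three-ascii-files dir)
  (for-each (lambda (name)
              (let ([full (build-path dir name)])
                (when (and (three-ascii-name? name)
                           (file-exists? full))
                  (delete-file full))))
            (directory-list dir)))
```

The length check sees #"~ab" rather than #"./~ab", so a file named
"~ab" is counted as three characters.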


The new `path->directory-path' procedure fills another small gap: it
creates a path that refers to a directory (e.g., "a/b/"), as opposed
to a path that could refer to a file (e.g., "a/b").
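
A small sketch of the difference, using the third result of
`split-path' to check whether a path is syntactically a directory.
The name `directory-syntax?' is made up for this example.

```scheme
;; Hypothetical helper: does `p' syntactically refer to a directory?
;; The third result of `split-path' reports this.
(define (directory-syntax? p)
  (let-values ([(base name must-be-dir?) (split-path p)])
    must-be-dir?))

(define file-ish (build-path "a" "b"))
(define dir-ish (path->directory-path file-ish))

(directory-syntax? file-ish) ; the path could name a file
(directory-syntax? dir-ish)  ; the path must name a directory
```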

Finally, `simplify-path' is now more aggressive about normalizing
paths, especially in the mode where it does not consult the filesystem.
It removes redundant path separators, converts Windows paths away from
"\\\\?\\" when possible, and so on. The `simplify-path' procedure still
doesn't normalize case, though, so continue to combine it with
`normal-case-path' as needed.
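
One way to combine the two, as a sketch: the helper name
`normalize-path-syntax' is invented here, and passing #f as the second
argument to `simplify-path' is meant to select the mode that does not
consult the filesystem.

```scheme
;; Hypothetical helper: normalize a path syntactically, without
;; touching the filesystem, and then normalize its case.
(define (normalize-path-syntax p)
  (normal-case-path (simplify-path p #f)))

;; Example use, with a redundant separator and a "." element:
(normalize-path-syntax "a//b/./c")
```

Under Unix, `normal-case-path' leaves the path as it is, so the case
step only matters on platforms with case-insensitive paths.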


Matthew



Posted on the users mailing list.