[racket-dev] plausible hash function for s16vectors

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Wed Nov 27 08:39:30 EST 2013

At Tue, 26 Nov 2013 13:39:31 -0800, John Clements wrote:
> My question: is there an accepted hash function for an s16vector, or
> more generally, for a big block of memory?

No, not currently.

> Taking a look at the behavior of vectors, though, it looks like *every* 
> element is considered in computing the hash. [...]
> Which suggests that every change to the vector changes the result of
> the hash function. This seems... really expensive!

The hashing functions are generally linear in the size of the value
being hashed. That's not currently documented, as it should be.

To hash a list, array, transparent structure, hash table, etc., each
element is hashed recursively, but there is currently a limit on the
depth of recursive hash calls to 128 so that the hash function doesn't
have to detect cycles. (Lists are treated differently from non-list
pairs in that hashing the `rest` doesn't count against the depth.)

> My current guess is that Racket 
> uses a highly optimized (a.k.a. no safety checks) hash function that works 
> over arbitrary blocks of data, 

No. In the case of a vector, for example, the hash function is called
recursively on each element of the vector.

> Questions:
> 
> 1) Am I guessing right?

Mostly.

> 2) Is this documented somewhere?

No, and I'll fix that.

> 3) Is there a generic memory-hash function in the unsafe interface somewhere?

Not currently.

> 4) Does the hash function affect the time taken by 'equal?' -- i.e.,
> the hash value is cached for faster equal? checking ?

No, `equal?` doesn't hash its arguments.


Posted on the dev mailing list.