eq and hashing (was Re: [plt-scheme] How to make unit functors?)

From: andrew cooke (andrew at acooke.org)
Date: Thu May 29 10:46:50 EDT 2003

[warning - copied as email; you may want to reply to the list.  what's
correct protocol on this list?]

Matthew Flatt <mflatt at cs.utah.edu> writes:
> Again, it sounds like you had to keep the item (such as 'a) in the
> tree, so there's no bug.

  (if i understand correctly) they're using the hash to locate the
  item in the tree.  the item itself would only be used once you got
  to the leaf (since two items could have the same hash value they'd
  be located at the same leaf)

> The underlying principle here is that you're going to compare the items
> eventually (since a hash key does not uniquely determine an item), so
> the hash key will work correctly, by definition.

  they only do the comparison on the subset (often empty) of items
  that have the same hash code.
 
  surely this problem also occurs with traditional hashed dictionaries
  (using arrays + some kind of clash resolution), not just trees?  you
  have a (key, value) pair, take the hash of the key and store the
  value at "that index" (with some local resolution over a restricted
  subset of items that map to the same index in the table).  if you
  then use the key to recover the item, agin you start with the hash
  code to find the index and do comparisons over a small subset.

  to make this explicit, consider storing "foo" with key 'j and (for
  simplicity) a hashtable so big that array indices are hash values.
  at the time of storage the hash code of 'j is 42, so we put "foo" at
  index 42.  later, after garbage collection, we want to recover the
  value associated with 'j.  so we get the hashcode of 'j, which is
  now 666, and don't find anything.  that's not right, surely?

  apologies if i've missed some scheme-specific point (i'm just
  lurking here).  maybe eq-hash should only be used to test
  (memory-address based) equality and there are other hash functions
  that should be used when using the hash in data structures?

  andrew

-- 
http://www.acooke.org



Posted on the users mailing list.