Subject: Re: RelCache (aka ELF prebinding) news
To: Bang Jun-Young <junyoung@netbsd.org>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: tech-userlevel
Date: 12/01/2002 23:12:16
On Mon, Dec 02, 2002 at 12:42:58PM +0900, Bang Jun-Young wrote:
> On Sun, Dec 01, 2002 at 10:20:34PM -0500, Thor Lancelot Simon wrote:
> > 
> > So, the checksum is effectively just used as a "magic number", AFAICT.  Since
> > it is not actually checked at runtime, it's really hard to see what benefit
> > this has over either a simpler checksum that _is_ checked or a random number
> > that need never be computed from the (potentially quite large) input at all.
> 
> In the future, it might be used to check if the file is modified from the
> distribution.

If you think you may want to actually check the hash in the future, how
about adding a "checksum type" field to your data structure and reserving
some values for future use?  That seems a lot more sensible than doing all
the work of taking a cryptographic hash of the object files when you are
really just using it as a 128-bit unique identifier.
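
As a rough sketch of what such a field might look like (every name and
value below is made up for illustration and is not taken from the
RelCache code), the identifier could carry its own type tag, with some
tag values reserved for digests that may be verified later:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical identifier kinds; names and values are illustrative only. */
enum relcache_id_type {
	RELCACHE_ID_RANDOM = 0,	/* opaque random bits, never recomputed */
	RELCACHE_ID_MD5    = 1,	/* reserved: MD5 digest of the object file */
	RELCACHE_ID_SHA1   = 2	/* reserved: SHA-1 digest of the object file */
};

/* Hypothetical on-disk identifier record. */
struct relcache_id {
	uint32_t type;		/* one of enum relcache_id_type */
	uint32_t len;		/* number of valid bytes in data[] */
	uint8_t  data[32];	/* room for up to 256 bits */
};

/* Two identifiers match only if type, length and valid bytes all agree. */
static int
relcache_id_equal(const struct relcache_id *a, const struct relcache_id *b)
{
	return a->type == b->type &&
	    a->len == b->len &&
	    a->len <= sizeof(a->data) &&
	    memcmp(a->data, b->data, a->len) == 0;
}
```

With a layout like this, today's code can write RELCACHE_ID_RANDOM and
compare identifiers as opaque bytes, while a later version can start
writing and verifying a real digest without changing the format.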

> > You're concerned about collisions between 128-bit random numbers?  I'm not
> > sure if you're serious, but if you are, I'll point out that you can easily
> > enough add some bits -- a property that MD5 doesn't really have.
> 
> At first I decided to use CRC32, but some people pointed out that it's
> too weak to be used as a hash. If CRC32 were used, could you explain how to
> avoid hash collisions between files? Although collisions would occur very
> rarely, people want a perfect one.

I don't understand why you're using the term "hash" here.  Can you explain?

As far as I can tell, all you actually need is a unique identifier for an
object file.  A simple 128-bit random number (256 bits, if you prefer)
will serve the purpose _just as well_ as the MD5 hash of the actual data;
with enough bits, you won't see collisions, and it's much cheaper to grab
some more random bits than it is to take the MD5 of every object file.
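
To put a number on "enough bits": for n identifiers drawn uniformly at
random from 128 bits, the chance of any collision is roughly
n^2 / 2^129, so even 2^32 object files give about 2^-65.  A minimal
sketch of generating such an identifier follows; it assumes the system
provides /dev/urandom, and the function and macro names are
hypothetical, not part of any existing interface:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define OBJ_ID_BYTES 16		/* 128 bits; use 32 for 256 bits */

/*
 * Fill id[] with random bytes read from /dev/urandom (assumed to exist).
 * Returns 0 on success, -1 if the device cannot be opened or read.
 */
static int
obj_id_generate(uint8_t id[OBJ_ID_BYTES])
{
	FILE *fp = fopen("/dev/urandom", "rb");
	size_t n;

	if (fp == NULL)
		return -1;
	n = fread(id, 1, OBJ_ID_BYTES, fp);
	fclose(fp);
	return n == OBJ_ID_BYTES ? 0 : -1;
}

/* Identifiers are opaque: equality is a plain byte comparison. */
static int
obj_id_equal(const uint8_t a[OBJ_ID_BYTES], const uint8_t b[OBJ_ID_BYTES])
{
	return memcmp(a, b, OBJ_ID_BYTES) == 0;
}
```

The cost here is one 16-byte read per object file, independent of the
file's size, whereas an MD5 digest has to read and process every byte
of the file.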

-- 
 Thor Lancelot Simon	                                      tls@rek.tjls.com
   But as he knew no bad language, he had called him all the names of common
 objects that he could think of, and had screamed: "You lamp!  You towel!  You
 plate!" and so on.              --Sigmund Freud