Subject: Re: RelCache (aka ELF prebinding) news
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Bang Jun-Young <junyoung@netbsd.org>
List: tech-userlevel
Date: 12/02/2002 14:27:39
On Sun, Dec 01, 2002 at 11:12:16PM -0500, Thor Lancelot Simon wrote:
> On Mon, Dec 02, 2002 at 12:42:58PM +0900, Bang Jun-Young wrote:
> > On Sun, Dec 01, 2002 at 10:20:34PM -0500, Thor Lancelot Simon wrote:
> > > 
> > > So, the checksum is effectively just used as a "magic number", AFAICT.  Since
> > > it is not actually checked at runtime, it's really hard to see what benefit
> > > this has over either a simpler checksum that _is_ checked or a random number
> > > that need never be computed from the (potentially quite large) input at all.
> > 
> > In the future, it might be used to check whether the file has been
> > modified since it was distributed.
> 
> If you think you may want to actually check the hash in the future, how
> about adding a "checksum type" field to your data structure and reserving
> some values for future use?  That seems a lot more sensible than doing all
> the work of taking a cryptographic hash of the object files when you are
> really just using it as a 128-bit unique identifier.

Sounds like a good idea.
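
A rough sketch of what such a record could look like, purely for
illustration. Nothing below is taken from the actual RelCache code: the
names ChecksumType, pack_id and unpack_id, the type values, and the
20-byte layout are all hypothetical.

```python
import enum
import struct


class ChecksumType(enum.IntEnum):
    # Hypothetical values; not part of any real RelCache format.
    NONE = 0    # identifier only, never verified
    RANDOM = 1  # 128 random bits used purely as a unique ID
    MD5 = 2     # 128-bit MD5 digest of the object file
    # 3 and up would be reserved for future checksum types


# Assumed layout: 4-byte little-endian type field, then 16 bytes of ID.
RECORD = struct.Struct("<I16s")


def pack_id(kind, ident):
    """Serialize a (type, 16-byte identifier) pair into a 20-byte record."""
    if len(ident) != 16:
        raise ValueError("identifier must be exactly 16 bytes")
    return RECORD.pack(int(kind), ident)


def unpack_id(blob):
    """Parse a 20-byte record back into (ChecksumType, identifier)."""
    kind, ident = RECORD.unpack(blob)
    return ChecksumType(kind), ident
```

With a type field in place, a reader that only needs a unique ID can
ignore it, while a later version could verify the ID when the type says
it is a real digest.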

> 
> > > You're concerned about collisions between 128-bit random numbers?  I'm not
> > > sure if you're serious, but if you are, I'll point out that you can easily
> > > enough add some bits -- a property that MD5 doesn't really have.
> > 
> > At first I decided to use CRC32, but some people pointed out that it's
> > too weak to be used as a hash. If CRC32 were used, could you explain how
> > to avoid hash collisions between files? Collisions would occur only very
> > rarely, but people want a perfect one.
> 
> I don't understand why you're using the term "hash" here.  Can you explain?

Because the same value must be reproducible from the same data, so that
it can be used to check the validity of the file. That wouldn't be possible
with a random number generator.
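
To make the distinction concrete, here is a minimal sketch, assuming the
identifier is a plain MD5 digest of the file contents. The function
names content_id, random_id and is_unmodified are made up for this
example and are not from RelCache.

```python
import hashlib
import os


def content_id(data):
    """Same bytes in, same 16-byte MD5 digest out, every time."""
    return hashlib.md5(data).digest()


def random_id():
    """16 fresh random bytes, unrelated to any file contents."""
    return os.urandom(16)


def is_unmodified(data, stored_id):
    """Recompute the digest and compare it with the stored one.

    This is only meaningful when stored_id came from content_id();
    a random ID cannot be recomputed from the data.
    """
    return content_id(data) == stored_id
```

A stored content_id() can be checked again later against the file, while
a stored random_id() can only be compared with another copy of itself.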

> 
> As far as I can tell, all you actually need is a unique identifier for an
> object file.  A simple 128-bit random number (256 bits, if you prefer)
> will serve the purpose _just as well_ as the MD5 hash of the actual data;
> with enough bits, you won't see collisions, and it's much cheaper to grab
> some more random bits than it is to take the MD5 of every object file.
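
For reference, the collision argument can be put in numbers with the
usual birthday approximation. This is only a sketch: it assumes the
identifiers behave like uniformly random bit strings (which a CRC32 of
real files does not strictly do), and new_object_id and
collision_probability are illustrative names.

```python
import math
import os


def new_object_id(nbytes=16):
    """Draw a random identifier; 16 bytes gives 128 bits."""
    return os.urandom(nbytes)


def collision_probability(n_objects, bits):
    """Birthday approximation for at least one collision among n_objects.

    p ~= 1 - exp(-n * (n - 1) / 2**(bits + 1)), assuming uniformly
    random identifiers. expm1 keeps precision when p is tiny.
    """
    exponent = -n_objects * (n_objects - 1) / 2.0 ** (bits + 1)
    return -math.expm1(exponent)
```

Under that assumption, a million objects with 128-bit identifiers give a
collision probability on the order of 1e-27, whereas a hundred thousand
objects with 32-bit identifiers give a probability well over one half.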

Jun-Young

-- 
Bang Jun-Young <junyoung@netbsd.org>