Subject: Re: RelCache (aka ELF prebinding) news
To: None <tls@rek.tjls.com>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-userlevel
Date: 12/03/2002 17:02:32

Thor,

hey, thanks for the clear technical summary.

First: Adler-32 is a poor choice for small files, or for medium-sized
files (a few tens to hundreds of kbytes) dominated by a single octet
value (e.g., lots of nulls).  I checked all the shared libs on my
handiest NetBSD box, which happens to be an i386 running the
1.6-release userland.

On that box, I see several very small files (less than 10k). Adler-32
is therefore a poor choice. The line of reasoning is the same as in
[Stone 2001], and the email to tsvwg which someone else turned into
RFC3309.  (If pushed, I can quantify just how poor next week; I need
to dust off some tools from a couple of years back).
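
To illustrate the small-file concern, here is a rough Python sketch. It
assumes zlib's Adler-32 (as exposed by Python's zlib module); the helper
names adler32_parts and max_low_sum are made up for this example. It
shows two things: for very short inputs the low 16-bit sum cannot even
reach the modulus, so part of the 32-bit space is never used, and for
null-filled inputs the checksum depends only on the length. This is an
illustration of the kind of weakness, not the quantification offered
above.

```python
import zlib

MOD_ADLER = 65521  # modulus used by Adler-32 (largest prime below 2**16)


def adler32_parts(data: bytes) -> tuple[int, int]:
    """Split Adler-32 into its two 16-bit sums: (A = low half, B = high half)."""
    value = zlib.adler32(data) & 0xFFFFFFFF
    return value & 0xFFFF, value >> 16


def max_low_sum(length: int) -> int:
    """Upper bound on A before any modular reduction: 1 + 255 * length."""
    return 1 + 255 * length


# Short inputs: even all-0xFF data cannot push A up to the modulus,
# so the low half of the checksum only covers part of its 16-bit range.
for length in (16, 64, 256):
    a, b = adler32_parts(b"\xff" * length)
    print(length, a, b, max_low_sum(length) < MOD_ADLER)

# Null-filled inputs: A stays at 1 and B is just the length mod 65521,
# so two runs of nulls whose lengths differ by 65521 collide.
print(adler32_parts(bytes(4096)))
print(zlib.adler32(bytes(4096)) == zlib.adler32(bytes(4096 + MOD_ADLER)))
```

The same bound applies to any input shorter than 257 octets, whatever
its contents.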

Second: a CRC is not a particularly good point in the cost/computation
tradeoff space for Bang's purpose.

>
>The sole purpose of this identifier is to ensure that ld.so does not
>mistake one legitimate .so file for another.  Deliberate attempts to
>generate hash collisions are beyond the scope; 

I hear what you're saying, but personally I disagree about the scope.
I can think of some attacks which I'd _really_ rather not fall victim
to.


Can we make the assumption that (for now):
	1. we want a hash-value which is a *strong hint* about the
	`identity' of a purported shared-library,

	2. that if the hint is questionable, there are often good
	ways of validating that identity short of recomputing
	the hash itself (e.g., checking device/inumber/generation values,
	and whatever else Jason suggested)?

Then (assuming we decide to use a hash at all) I see a clear case for
MD4 or MD5.  But ...  why not use just the filesystem metadata,
and do away with the hash altogether?
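
As a rough sketch of what a metadata-only identity check might look
like (in Python for brevity; this is not ld.so code), the following
records device, inode number and generation for a library and later
compares them. The names FileIdentity, file_identity and same_library
are hypothetical, and including size and mtime is an assumption of this
example rather than something proposed in this thread. The generation
field (st_gen) exists only on some systems, so it falls back to None
where it is missing.

```python
import os
from typing import NamedTuple, Optional


class FileIdentity(NamedTuple):
    """Metadata snapshot used as a cheap identity for a file (hypothetical)."""
    device: int
    inode: int
    generation: Optional[int]  # st_gen; only available on some systems
    size: int                  # assumption: not named in the thread
    mtime_ns: int              # assumption: not named in the thread


def file_identity(path: str) -> FileIdentity:
    """Collect the identifying metadata for the file at `path`."""
    st = os.stat(path)
    return FileIdentity(
        device=st.st_dev,
        inode=st.st_ino,
        generation=getattr(st, "st_gen", None),
        size=st.st_size,
        mtime_ns=st.st_mtime_ns,
    )


def same_library(recorded: FileIdentity, path: str) -> bool:
    """Return True if `path` still has the recorded identity."""
    try:
        return file_identity(path) == recorded
    except OSError:
        # A missing or unreadable file cannot be the recorded one.
        return False
```

A cache would store the FileIdentity when the library is first
processed and call same_library before trusting the cached data; a
mismatch would simply mean falling back to the slow path.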