Subject: Re: RelCache (aka ELF prebinding) news
To: Bang Jun-Young <junyoung@netbsd.org>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: tech-kern
Date: 12/03/2002 23:02:41

On Wed, Dec 04, 2002 at 12:30:46PM +0900, Bang Jun-Young wrote:
> > 
> > Even a perfect 32-bit identifier isn't good enough, by itself.  I strongly
> > suspect that a 32-bit identifier stamped into the file, plus information
> > from the metadata, probably is, in the real world.  The 64-bit identifier
> > that Bang is using now, consisting of the CRC-32 of the file followed by
> > the Adler-32 of the file, is probably good enough all by itself but it
> > seems silly to not use the file size and metadata to further reduce the
> > chance of collisions no matter how the identifier is chosen.
> 
> Last night I must have been too sleepy (and you and Jason were right ;-) Okay, I
> will use the file size as well.
> 
> So I will use the following values for identification:
> 
>  - 32 bit CRC32
>  - 32 bit Adler32
>  - file size
>  - base address (determined by ld.elf_so for each process)

So, just to summarize:

1) Only root can write to the relocation cache area.  This eliminates
   the security concerns raised by Mouse.

2) You prefer not to include certain filesystem metadata that would
   invalidate cache entries if a library were moved or renamed.

Thus yielding the current approach.

The only thing left that I don't quite grasp is why you aren't using
the ELF object name, in addition to the base address and file size.
That would reduce the set of possible collisions enormously.
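
To make that concrete, here is a minimal sketch of a cache key that
carries the object name alongside the values listed above.  The names
(relcache_key, relcache_key_equal), the field widths, and the
RC_NAME_MAX limit are all hypothetical; none of them are taken from
the actual RelCache code.

```c
#include <stdint.h>
#include <string.h>

#define RC_NAME_MAX 64          /* hypothetical limit on the object name */

/* Hypothetical cache key: the proposed values plus the ELF object name. */
struct relcache_key {
    uint32_t crc32;             /* CRC-32 of the file */
    uint32_t adler32;           /* Adler-32 (or Fletcher) sum of the file */
    uint64_t file_size;         /* size of the file in bytes */
    uint64_t base_addr;         /* base address chosen by ld.elf_so */
    char     objname[RC_NAME_MAX]; /* ELF object name, NUL-terminated */
};

/* Returns nonzero only if every component of the two keys matches.
 * Fields are compared one by one so struct padding is never examined. */
int
relcache_key_equal(const struct relcache_key *a, const struct relcache_key *b)
{
    return a->crc32 == b->crc32 &&
        a->adler32 == b->adler32 &&
        a->file_size == b->file_size &&
        a->base_addr == b->base_addr &&
        strncmp(a->objname, b->objname, RC_NAME_MAX) == 0;
}
```

With the name in the key, two different libraries would have to
collide on both sums, the size, and the base address *and* share a
name before a stale cache entry could be mistaken for a valid one.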

I'll point out once more two possibly minor things:

1) As Jonathan has pointed out, a Fletcher sum is probably better than
   an Adler sum for this purpose.  If you want an implementation, I'm
   sure he or I can send you one (or you can write one yourself in a
   couple of minutes; the Fletcher checksum is *really* simple).

2) If, instead of using hashes at all, you used the dev, ino, gen
   triple for the library, plus the ctime and mtime, you'd have to
   re-prebind after restoring, but you could at least be sure that
   if the _kernel_ thought it was the same file, so would you.  As
   a few people have pointed out, nobody actually moves shared
   libraries around with any frequency, and other Unix prebinding
   systems all require re-prebinding if you do, so maybe that's not
   the worst approach in the world.
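
To show how little code point 1 involves, here is a minimal sketch of
a Fletcher sum.  It assumes the 32-bit variant (two 16-bit running
sums modulo 65535 over little-endian 16-bit words, with an odd
trailing byte zero-padded); the thread does not say which variant was
meant, and the function name is only illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fletcher-32 over a byte buffer, reading 16-bit little-endian words.
 * Reducing modulo 65535 at every step is slow but keeps the code obvious. */
uint32_t
fletcher32(const uint8_t *data, size_t len)
{
    uint32_t sum1 = 0, sum2 = 0;
    size_t i;

    assert(data != NULL || len == 0);

    for (i = 0; i + 1 < len; i += 2) {
        uint32_t word = (uint32_t)data[i] | ((uint32_t)data[i + 1] << 8);
        sum1 = (sum1 + word) % 65535;
        sum2 = (sum2 + sum1) % 65535;
    }
    if (i < len) {
        /* Odd trailing byte: treat the missing high byte as zero. */
        sum1 = (sum1 + data[i]) % 65535;
        sum2 = (sum2 + sum1) % 65535;
    }
    return (sum2 << 16) | sum1;
}
```

For example, fletcher32((const uint8_t *)"abcde", 5) yields
0xF04FC729.  A production version would defer the modulo reduction
across many words for speed.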

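The metadata-only approach in point 2 can be sketched as follows.
This is an assumption-laden illustration, not a proposed interface:
the struct and function names are hypothetical, and st_gen is read
only on NetBSD (it is a BSD extension to struct stat, and is
typically reported as zero to unprivileged callers).

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical identity record built purely from filesystem metadata. */
struct file_ident {
    dev_t    dev;        /* device containing the file */
    ino_t    ino;        /* inode number */
    uint64_t gen;        /* inode generation; 0 where unavailable */
    time_t   ctime_sec;  /* last inode change time */
    time_t   mtime_sec;  /* last data modification time */
};

/* Fills *out from stat(2).  Returns 0 on success, -1 on failure. */
int
file_ident_get(const char *path, struct file_ident *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->dev = st.st_dev;
    out->ino = st.st_ino;
#ifdef __NetBSD__
    out->gen = (uint64_t)st.st_gen;
#else
    out->gen = 0;        /* no generation number on this platform */
#endif
    out->ctime_sec = st.st_ctime;
    out->mtime_sec = st.st_mtime;
    return 0;
}

/* Nonzero if the kernel would consider both records the same, unchanged file. */
int
file_ident_equal(const struct file_ident *a, const struct file_ident *b)
{
    return a->dev == b->dev && a->ino == b->ino && a->gen == b->gen &&
        a->ctime_sec == b->ctime_sec && a->mtime_sec == b->mtime_sec;
}
```

No file contents are read at all here, which is the attraction; the
cost is that a restored or copied library gets a new inode and so
always looks like a different file.
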
Anyway, that's really all I have to say about the subject.  Thank
you _very_ much for spending so much time listening to suggestions,
and for doing the work in the first place.

Thor