Subject: Re: Don't use UFS_DIRHASH
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: David Malone <dwmalone@maths.tcd.ie>
List: tech-kern
Date: 07/15/2006 10:36:09

On Wed, Jul 12, 2006 at 12:31:22PM -0400, Thor Lancelot Simon wrote:
> Removing UFS_DIRHASH from our kernel configuration made the problem go
> away.  Though it is possible that there is an underlying problem of some
> kind in one of the allocators that is simply particularly badly exposed
> by UFS_DIRHASH, it seems more likely that there is a problem (which we
> haven't found yet) in UFS_DIRHASH itself.  The code has a history of
> similar problems on FreeBSD which seem to have ended only when the entire
> kernel synchronization scheme in FreeBSD was reworked in FreeBSD 5.

FWIW, in FreeBSD I think we've tracked down all outstanding memory
corruption bugs in which DIRHASH was implicated and they all turned
out to be problems in other subsystems that used memory allocations
of the same size. DIRHASH works fine both in FreeBSD 4, where we have
spl-style synchronisation, and in 5+, where the synchronisation is
mutex-based. It is quite possible that we've missed something, but
we're not seeing any evidence of it in FreeBSD right now.
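
To illustrate why a bug in an unrelated subsystem can look like a DIRHASH
bug, here is a minimal sketch of a size-bucketed allocator. It is a toy
model only, not the FreeBSD or NetBSD kernel allocator; the class
BucketAllocator, the function demo and the 64-byte size are all
hypothetical. One "subsystem" frees a chunk but keeps a stale offset, a
second one gets the same chunk back because it asks for the same size, and
a late write through the stale offset then damages the second one's data.

```python
class BucketAllocator:
    """Toy allocator: a freed chunk is reused by the next request of the same size."""

    def __init__(self, arena_size=1024):
        self.arena = bytearray(arena_size)  # stands in for kernel memory
        self.next_unused = 0                # bump pointer for fresh chunks
        self.free_lists = {}                # size -> list of freed offsets

    def alloc(self, size):
        bucket = self.free_lists.get(size)
        if bucket:
            # Reuse the most recently freed chunk of exactly this size.
            return bucket.pop()
        offset = self.next_unused
        if offset + size > len(self.arena):
            raise MemoryError("toy arena exhausted")
        self.next_unused += size
        return offset

    def free(self, offset, size):
        self.free_lists.setdefault(size, []).append(offset)


def demo():
    heap = BucketAllocator()

    # Hypothetical buggy subsystem: frees its 64-byte chunk but keeps the offset.
    stale = heap.alloc(64)
    heap.free(stale, 64)

    # Hypothetical victim (think of a hash table): same size, so same chunk.
    table = heap.alloc(64)
    heap.arena[table:table + 64] = b"\x11" * 64

    # The buggy subsystem writes through its stale offset (use after free).
    heap.arena[stale:stale + 4] = b"\xde\xad\xbe\xef"

    same_chunk = (stale == table)
    corrupted = heap.arena[table:table + 64] != b"\x11" * 64
    return same_chunk, corrupted
```

In this model the victim is the code that notices the damage, even though
the faulty write came from elsewhere, which is the pattern described above
for allocations of the same size.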

The most recent one that we've plugged was one related to IPv6 neighbour
discovery:

	http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/nd6.c#rev1.63

The DEBUG_MEMGUARD option was quite useful in tracking this down - I
wonder if something similar might help narrow down what's happening in
your case?

	David.