Subject: Please audit pool use in your code!
To: None <tech-kern@netbsd.org>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: tech-kern
Date: 07/19/2006 13:22:59
Last week I sent a message to this list about UFS_DIRHASH and kernel memory
corruption.  Since then it has become clear that we have more serious issues,
at least on the 3.0 branch; removing UFS_DIRHASH has made our systems run
for significantly longer without crashing, but when they do crash, we see
the same basic symptom: pool-allocated objects are overwritten with bogus
data, often leading to a panic when a bad pointer is followed out of such
a structure.

One likely cause of this problem is allocation from a pool in interrupt
context.  Any such allocation *or free* (pool_get/pool_put) *must* be
protected with spl such that no other code allocating from that pool can
be entered while the operation is in progress (e.g. by the same interrupt
occurring again, or by another interrupt leading a different code path to
allocate from that pool).
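
As a rough illustration of the bracketing described above, here is a
userland mock, not kernel code.  The names pool_get, pool_put, splnet, splx
and PR_NOWAIT mirror the kernel interfaces, but their bodies here are
stand-ins that only record whether the priority level was raised.  struct
foo, foo_pool, foo_alloc and foo_free are hypothetical, and the choice of
splnet as the right level is an assumption; real code must pick the level
that blocks every interrupt able to reach the pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- Mock stand-ins for the kernel interfaces (NOT the real ones) ---- */
#define PR_NOWAIT 0            /* placeholder value for this mock only */

struct pool { size_t pr_size; };

static int cur_ipl;            /* 0 = interrupts open, 1 = blocked */
static int unprotected_ops;    /* pool operations seen while open */

static int splnet(void) { int s = cur_ipl; cur_ipl = 1; return s; }
static void splx(int s) { cur_ipl = s; }

static void *pool_get(struct pool *pp, int flags)
{
    (void)flags;
    if (cur_ipl == 0)
        unprotected_ops++;     /* an interrupt could have re-entered here */
    return malloc(pp->pr_size);
}

static void pool_put(struct pool *pp, void *item)
{
    (void)pp;
    assert(item != NULL);
    if (cur_ipl == 0)
        unprotected_ops++;
    free(item);
}

/* ---- The pattern itself (hypothetical subsystem "foo") ---- */
struct foo { int f_val; };
static struct pool foo_pool = { sizeof(struct foo) };

static struct foo *foo_alloc(void)
{
    struct foo *f;
    int s;

    s = splnet();              /* block interrupts that use foo_pool */
    f = pool_get(&foo_pool, PR_NOWAIT);
    splx(s);                   /* restore the previous level */
    return f;                  /* may be NULL with PR_NOWAIT */
}

static void foo_free(struct foo *f)
{
    int s;

    s = splnet();              /* the free needs the same protection */
    pool_put(&foo_pool, f);
    splx(s);
}
```

Note that the free path is bracketed exactly like the allocation path; in
this mock, unprotected_ops stays at zero only if both are.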

A quick grep through src/sys/net for PR_NOWAIT (which is a pretty strong
hint that the code in question may be reached from an interrupt) found some
problems in the SACK code, which Kentaro fixed.  However, there is a huge
amount of code in the kernel which allocates from pools, and some of it
does so only conditionally from interrupt context (e.g. by deriving the
flags from a "waitok" argument passed to the calling function, so that my
grep would not have found it).
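
To make the "waitok" case concrete, here is a small userland mock of the
shape that hides from a grep for PR_NOWAIT.  It is a sketch under
assumptions: bar_pool and bar_alloc are hypothetical, the pool_get body is
a stand-in, the PR_WAITOK value is a placeholder, and the caller is assumed
to spell the no-wait case as a bare 0.  It deliberately shows only the flag
handling; a call like this still needs spl protection if it can be reached
from an interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- Mock stand-ins (NOT the real kernel definitions) ---- */
#define PR_WAITOK 0x02         /* placeholder value for this mock only */

struct pool { size_t pr_size; };

static int last_flags = -1;    /* records what the mock pool_get was given */

static void *pool_get(struct pool *pp, int flags)
{
    assert(pp != NULL);
    last_flags = flags;
    return malloc(pp->pr_size);
}

/* ---- Hypothetical subsystem "bar" ---- */
struct bar { int b_val; };
static struct pool bar_pool = { sizeof(struct bar) };

/*
 * No literal PR_NOWAIT appears here, so a grep for it finds nothing,
 * yet any caller passing waitok == 0 may be running in interrupt context.
 */
static struct bar *bar_alloc(int waitok)
{
    return pool_get(&bar_pool, waitok ? PR_WAITOK : 0);
}
```

Finding these by hand means following each such function back to its
callers and checking what they pass.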

I ask all developers to _please_ look at any code in which they have
used the pool allocator and double-check that any uses of pool_put/pool_get
which could be reached from interrupt context are bracketed by the correct
spl/splx calls to block such interrupts.

We haven't seen this problem with 3.0, but we see it with 3.0_STABLE.  If
anyone can think of a pullup which might include such problematic code,
please, please let me know.

This is why the Project's build servers keep crashing, and thus why
autobuilds are proceeding at a glacial pace. :-/

-- 
  Thor Lancelot Simon	                                     tls@rek.tjls.com

  "We cannot usually in social life pursue a single value or a single moral
   aim, untroubled by the need to compromise with others."      - H.L.A. Hart