Subject: Re: kern/33076: reproducable pool free list corruption
To: None <,,>
From: Martin Husemann <>
List: netbsd-bugs
Date: 03/16/2006 14:05:06
The following reply was made to PR kern/33076; it has been noted by GNATS.

From: Martin Husemann <>
Subject: Re: kern/33076: reproducable pool free list corruption
Date: Thu, 16 Mar 2006 15:04:25 +0100

 Ok, with help from Frank van der Linden and Chuck Silvers I have examined
 this a bit more.
 The original problem happened with an SMP kernel, and core dumps there are
 quite fragile, so I never managed to get one.
 One suspicion was that pool operations on mbpool would happen without proper
 IPL - so I added a panic in pool_put and pool_get that would trigger if the
 pool was mbpool and current protection level < IPL_VM. This did not fire.
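
 For illustration, the condition tested by such a check can be sketched in
 userland C as below. This is not the actual patch: struct pool is reduced
 to a stand-in, the IPL_* values are made up, and cur_ipl is a hypothetical
 variable replacing the port-specific way of reading the current level. In
 the kernel, pool_get and pool_put would call panic() where this returns true.

```c
#include <stdbool.h>

/* Stand-ins for kernel definitions; the numeric values are illustrative only. */
enum { IPL_NONE = 0, IPL_VM = 4, IPL_HIGH = 7 };

struct pool {
	const char *pr_wchan;	/* pool name, unused by the check itself */
};

/* Hypothetical: the interrupt priority level the CPU is currently running at. */
static int cur_ipl = IPL_NONE;

/*
 * Returns true when an operation on pp would run with too little
 * protection, i.e. pp is the mbuf pool and the level is below IPL_VM.
 */
static bool
pool_ipl_violation(const struct pool *pp, const struct pool *mbpool_p)
{
	return pp == mbpool_p && cur_ipl < IPL_VM;
}
```

 Only the pool of interest is checked, so other pools that are legitimately
 used at lower levels do not trigger it.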
 During testing, a second variant of the corruption occurred, in the form of a
 kernel page fault inside pool_prime_page (called from pool_get). The pointer
 dereferenced was 0xffffffffffff. So I added options QUEUEDEBUG, and this
 catches the same corruption slightly earlier at the LIST_INSERT_HEAD.
 Still it does not point out where the corruption occurs.
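
 As a rough sketch of why the check fires only at insert time: as far as I
 understand it, the QUEUEDEBUG variant of LIST_INSERT_HEAD verifies that the
 back pointer of the current first element still refers to the list head.
 The userland code below mimics that test with the <sys/queue.h> macros;
 struct item and list_head_is_sane are hypothetical names, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

struct item {
	LIST_ENTRY(item) link;	/* le_next / le_prev linkage */
};
LIST_HEAD(item_list, item);

/*
 * True when the list is empty or the first element's le_prev still
 * points back at the head's lh_first field, as it must in an intact list.
 */
static bool
list_head_is_sane(struct item_list *head)
{
	struct item *first = head->lh_first;

	return first == NULL || first->link.le_prev == &head->lh_first;
}
```

 A check like this notices the damage only when the list is next touched,
 which is long after the stray write itself.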
 Now finally I repeated the experiment with a uniprocessor kernel and had the
 same result - this time, however, I was able to get a crash dump (on second
 try, so I'm not sure how correct the backtrace in it will be).
 I've uploaded all relevant pieces to
 There is the kernel config file (MARTINS.UP), the kernel core and netbsd.gdb,
 as well as the small patch to subr_pool.c that I used to verify the pool_put/get
 protection level.
 If I had to guess, I would say something in the network stack is writing
 a 0xffffffffffffffff somewhere out of bounds.