Subject: Re: 3.1_STABLE and SMP
To: Andrew Doran <ad@netbsd.org>
From: Stephen Borrill <netbsd@precedence.co.uk>
List: port-i386
Date: 04/24/2007 10:08:37
On Fri, 20 Apr 2007, Andrew Doran wrote:
> On Fri, Apr 20, 2007 at 12:46:53PM +0100, Stephen Borrill wrote:
>
>>> Do you have a DDB stack backtrace, assuming you can get into DDB from
>>> the hung state?
>>
>> I got emailed a screenshot of one:
>> http://projects.precedence.co.uk/netbsd/ddb1.jpg
>
> I have seen reports of something similar. In addition to what Greg
> mentioned, it's possible that:
>
> - this CPU holds the kernel lock and is spin waiting on the pmap lock
> - another cpu holds the pmap lock, has taken an interrupt, and is spin
>  waiting on the kernel lock
>
> That should not happen unless there is a bug somewhere. How many CPUs does
> the machine have?

Two: it's a dual-core P4.
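
For anyone following along, the scenario Andrew describes is a lock-order inversion: each CPU holds the lock the other is spinning on. Below is a toy model of that wait cycle, not NetBSD code. The function name, the dictionary layout and the lock labels "kernel_lock" and "pmap_lock" are made up for illustration, and the sketch assumes each CPU holds at most one lock and spins on at most one.

```python
# Toy model of the suspected deadlock; this is NOT NetBSD kernel code.
# Each CPU is described by the lock it holds and the lock it spins on.
# All names here are hypothetical and chosen only for illustration.

def find_deadlock(cpus):
    """Return the CPUs forming a wait cycle, or [] if there is none.

    cpus maps a CPU name to a (held_lock, wanted_lock) tuple;
    either element may be None.
    """
    # Map each held lock to the CPU that holds it.
    owner = {held: cpu for cpu, (held, _) in cpus.items() if held}

    for start in cpus:
        seen = []
        cpu = start
        # Follow "spins on a lock held by" edges until they run out
        # or we revisit a CPU.
        while cpu is not None and cpu not in seen:
            seen.append(cpu)
            wanted = cpus[cpu][1]
            cpu = owner.get(wanted)  # who holds the lock we spin on?
        if cpu == start:
            # We came back to where we started: a cycle of waiters.
            return seen
    return []


# The two-CPU case from the quoted message.
state = {
    "cpu0": ("kernel_lock", "pmap_lock"),   # holds kernel lock, wants pmap
    "cpu1": ("pmap_lock", "kernel_lock"),   # holds pmap lock, wants kernel
}
print(find_deadlock(state))
```

If the two backtraces show one CPU spinning on the pmap lock and the other on the kernel lock, that would match this pattern. The dump of held locks is what would confirm which CPU owns which.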

> If it happens again, could you ask the custy to do a:
>
> mach cpu 0
> tr
> mach cpu 1
> tr
>
> A dump of the held simplelocks would be good to get too

I've given them the instructions and installed a LOCKDEBUG kernel for
them, so we're just waiting for the next crash that lets them get into
DDB (apparently not every hang does).

Thanks,

-- 
Stephen