On Mon, Oct 06, 2008 at 12:35:33PM -0700, Bill Stouder-Studenmund wrote:
> On Thu, Oct 02, 2008 at 12:19:07PM +0100, David Brownlee wrote:
> > I've been testing a current revivesa kernel (built from
> > sources updated today) on several NetBSD 4.0_STABLE installs,
> > and everything seems to be happy on the single core boxes
> > (firefox3, gnash, pidgin, ktorrent3, apache, etc), but I've
> > seen several hard lockups when running on an MP box (Thinkpad
> > T60p Intel Core 2 T7400).
>
> Have you tried this with the -current libpthread and update recipe?
>
> A lot of stuff changed in -current, and the MP hangs could be unrelated to
> SA specifically. Or it could be some of the code not in kern_sa.c.

If I can't reproduce it locally, we may need to do some printf debugging.
You can also turn on SA kernel debugging and see if anything pops up before
the hang.

So I think I've made some progress on what all is wrong with the UP case.

I commented earlier about how the problem seems to be a double-unblock. I
now don't think that's it. Looking at the logs:

   476      2 firefox-bin SAU   unblocked, event=[<ctx=0xb87ffcf8, id=2, cpu=0>], intr=[<ctx=0xbfbfdcc0, id=4, cpu=0>]
   476      2 firefox-bin RET   sa_yield JUSTRETURN
   476      2 firefox-bin CALL  getcontext(0xb87ff918)
   476      2 firefox-bin RET   getcontext 0
   476      3 firefox-bin SAU   blocked, event=[<ctx=0xb87ffcf8, id=3, cpu=0>]
   476      4 firefox-bin SAU   blocked, event=[<ctx=0xb89ffcf8, id=4, cpu=0>]
   476      4 firefox-bin CALL  sa_yield
   476      4 firefox-bin SAU   unblocked, event=[<ctx=0xb89ffcf8, id=4, cpu=0>, <ctx=0xbfbfdc80, id=2, cpu=0>], intr=[<ctx=0xb87ff9b0, id=3, cpu=0>]

There was a double-block event: two blocked upcalls in a row. That's really
not supposed to happen, but it did.

I think the original problem with the double-unblock was that the contexts,
at least as shown above, are in the wrong order. It looks like the second
event frame for the double unblock is the interrupted frame for the unblock
further up, and the interrupted frame for the double unblock is the second
blocked event.
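
A rough way to spot this pattern in a longer trace is to scan for two
"SAU blocked" records from the same process with no "SAU unblocked" record
between them. The Python sketch below is only an illustration and is not
part of the revivesa work. It assumes the record layout seen in the excerpt
above (pid, LWP id, command, record type), it ignores CALL/RET records, and
the name find_double_blocks is made up for this example. Treating "two
blocked records in a row" as the definition of a double-block is also an
assumption.

```python
import re

# Matches records such as:
#   476  3 firefox-bin SAU  blocked, event=[<ctx=0xb87ffcf8, id=3, cpu=0>]
# The field layout is assumed from the excerpt, not taken from a format spec.
SAU_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+SAU\s+(blocked|unblocked),")


def find_double_blocks(lines):
    """Return (first_lwp, second_lwp) pairs for back-to-back blocked upcalls."""
    hits = []
    last_blocked = {}  # pid -> LWP id of the latest blocked upcall not yet followed by an unblock
    for line in lines:
        m = SAU_RE.match(line)
        if m is None:
            continue  # CALL/RET and anything else is skipped (an assumption)
        pid, lwp, kind = int(m.group(1)), int(m.group(2)), m.group(4)
        if kind == "blocked":
            if pid in last_blocked:
                hits.append((last_blocked[pid], lwp))
            last_blocked[pid] = lwp
        else:
            # Any unblocked upcall for this process clears the pending state.
            last_blocked.pop(pid, None)
    return hits


# The records quoted in the excerpt, event details left intact.
TRACE = """\
476 2 firefox-bin SAU unblocked, event=[<ctx=0xb87ffcf8, id=2, cpu=0>], intr=[<ctx=0xbfbfdcc0, id=4, cpu=0>]
476 2 firefox-bin RET sa_yield JUSTRETURN
476 2 firefox-bin CALL getcontext(0xb87ff918)
476 2 firefox-bin RET getcontext 0
476 3 firefox-bin SAU blocked, event=[<ctx=0xb87ffcf8, id=3, cpu=0>]
476 4 firefox-bin SAU blocked, event=[<ctx=0xb89ffcf8, id=4, cpu=0>]
476 4 firefox-bin CALL sa_yield
476 4 firefox-bin SAU unblocked, event=[<ctx=0xb89ffcf8, id=4, cpu=0>, <ctx=0xbfbfdc80, id=2, cpu=0>], intr=[<ctx=0xb87ff9b0, id=3, cpu=0>]
""".splitlines()

print(find_double_blocks(TRACE))  # prints [(3, 4)]
```

On the excerpt it flags LWPs 3 and 4, the two consecutive blocked records.
It says nothing about why the second block was delivered.
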

Looking at the ktrace you sent me off-list, the problem there also was a
double-block.

So the thing now is to figure out what's causing it and what can easily be
done about it.

Take care,

Bill