Subject: Bummer - NetBSD 2.1 panic - si_refs - update
To: None <port-macppc@netbsd.org>
From: Donald Lee <MacPPC2@caution.icompute.com>
List: port-macppc
Date: 01/29/2007 14:19:14
Update on this....

I've looked over the kernel source to see whether I can determine what's wrong,
and what I might do to work around the problem.  I went to 2.1 to
get a stable server, and *really* don't want to upgrade again
so soon.

It appears that the code in arch/powerpc/powerpc/softintr.c is almost
unchanged between NetBSD 2.1 and current.

The assertion that failed (si->si_refs > 0) is still at roughly line 116 in the code.

si_refs is still only used within the softintr.c module.
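
For anyone following along, here is a rough sketch in plain C of what a
"counter must be > 0" check of this kind generally guards against.  This
is NOT the code from softintr.c; the struct and function names are made up
for illustration, and I am not claiming to know exactly what si_refs
counts in the real module.

```c
#include <assert.h>

/* Hypothetical record with a reference counter; not the NetBSD struct. */
struct demo_counted {
    int refs;      /* outstanding references (illustrative only) */
    int releases;  /* number of successful releases */
};

/* Take one reference. */
static void demo_acquire(struct demo_counted *c)
{
    c->refs++;
}

/*
 * Drop one reference.  Returns 0 on success, or -1 if the counter was
 * already zero (or negative), i.e. the invariant "refs > 0" is broken.
 * A kernel built with diagnostic assertions would panic at this point
 * instead of returning an error.
 */
static int demo_release(struct demo_counted *c)
{
    if (c->refs <= 0)
        return -1;
    c->refs--;
    c->releases++;
    return 0;
}
```

In this toy version the check only trips when a release happens without a
matching acquire, or when something else has overwritten the counter.
Those are the same two possibilities I'm weighing for the real panic.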

The functions declared in softintr.c are only used in a few places, mainly
clock interrupts, serial ports and network "soft" interrupts.

There appear to be two conclusions:

1. The bug is probably still present in -current.  I can't reproduce it at will,
but it looks like a bug within the softintr.c module itself, and that code
has not changed much.  (Caveat: it could also be memory corruption - yuck!)

2. I can greatly reduce the usage of the code in softintr.c by not running
dial-in PPP.  If this becomes an issue for me, I can move my dial-in work
to a separate machine.


I've only seen one crash so far, after running about a week.  Let's hope
I don't see any more.  I'll post again when/if I learn more.

-dgl-

At 2:10 PM -0600 1/29/07, Donald Lee wrote:
>My shiny new server panic'd this evening.
>
>Panic: kernel diagnostic assert "si->si_refs > 0" failed: file "....
>	arch/powerpc/powerpc/softintr.c" line 116
>
>
>I looked in the CVS repository, and si_refs looks pretty local to this file,
>and this file does not appear to have been changed appreciably since its
>initial creation.
>
>Anyone seen this bug, or have any ideas about it?  It is apparently
>complaining that the soft interrupt queue(s) are corrupt.
>
>????
>
>-dgl-