Subject: Re: Unpredictable reboots.
To: None <port-i386@NetBSD.org>
From: Peter Seebach <seebs@plethora.net>
List: port-i386
Date: 03/05/2005 08:24:48
In message <d0c1op$oci$1@colwyn.zhadum.de>, Matthias Scheler writes:
>In article <200503041442.j24EgmgM000055@guild.plethora.net>,
>	seebs@plethora.net (Peter Seebach) writes:
>> Anyone else seeing anything like this?
>
>No:

Interesting.  I put in a new power supply, just for luck, and went back to a
DIAGNOSTIC kernel.  This morning, it was wedged at KASSERT(to_ticks >= 0).
Backtrace said it was executing sendmail, and the backtrace was through
tcp_output.

I then saw the exact same panic FIVE more times.  Always within about ten
seconds of starting sendmail.

So, I did the obvious thing: made a new kernel which prints to_ticks and sets
it to zero when it's negative, instead of asserting.
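
Roughly, the workaround amounts to the following.  This is a userland sketch,
not the actual kernel diff: the function name clamp_to_ticks is made up for
illustration, and in the real kernel the check sits wherever the
KASSERT(to_ticks >= 0) lives.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the patched check.  Where the DIAGNOSTIC
 * kernel would assert to_ticks >= 0, this logs the offending value and
 * clamps it to zero instead of panicking.
 */
static int
clamp_to_ticks(int to_ticks)
{
	if (to_ticks < 0) {
		/* Same format as the log lines tallied further down. */
		printf("to_ticks: %d\n", to_ticks);
		to_ticks = 0;
	}
	return to_ticks;
}
```

Non-negative values pass through untouched, so the only behavior change is on
the path that used to panic.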

It's hit that 31 times in 11 minutes.

I think it's safe to say I can reproduce this.  What this doesn't leave me
with is any clue how to fix it, or how to get better debugging info.  But at
least I have the machine running again in the meantime.  Of course, as you'd
expect, this is a production server, and I can't make it happen on anything
else.

   5 to_ticks: -150
   6 to_ticks: -200
   6 to_ticks: -250
   9 to_ticks: -300
   3 to_ticks: -350

Not sure if the distribution of values means anything.  I suppose next up would
be adding stack traces and adding debugging code to whatever's calling this.

And yes, I applied the patches indicated in the PR (29134).

-s