Subject: Re: Unpredictable reboots.
To: Peter Seebach <seebs@plethora.net>
From: Brian Buhrow <buhrow@lothlorien.nfbcal.org>
List: port-i386
Date: 03/07/2005 09:25:43
	Hello Peter.  I have a NetBSD-2.0 machine with a P4 hyperthreaded
processor in it.  It is also one of our production mail servers.  When I
enable SMP in the kernel, I get the same panic you get within about the
same amount of time.  The process which is running when the panic occurs is
always sendmail as well.
	When the machine runs with a uni-processor kernel, all is well.
So, whatever the problem is, it has something to do with multiprocessor
functionality and networking.
That's probably not very much news, but I hope it helps.
-Brian
On Mar 5,  8:24am, Peter Seebach wrote:
} Subject: Re: Unpredictable reboots.
} In message <d0c1op$oci$1@colwyn.zhadum.de>, Matthias Scheler writes:
} >In article <200503041442.j24EgmgM000055@guild.plethora.net>,
} >	seebs@plethora.net (Peter Seebach) writes:
} >> Anyone else seeing anything like this?
} >
} >No:
} 
} Interesting.  I put in a new power supply, just for luck, and went back to a
} DIAGNOSTIC kernel.  This morning, it was wedged at KASSERT(to_ticks >= 0).
} Backtrace said it was executing sendmail, and the backtrace was through
} tcp_output.
} 
} I then saw the exact same panic FIVE more times.  Always within about ten
} seconds of starting sendmail.
} 
} So, I did the obvious thing; made a new kernel which prints to_ticks and sets
} it to zero.
} 
} It's hit that 31 times in 11 minutes.
} 
} I think it's safe to say I can reproduce this.  What this doesn't leave me
} with is any clue how to fix it, or get better debugging info.  But at least I
} have the machine running again in the meantime.  Of course, as you'd expect,
} this is a production server, and I can't make it happen on anything else.
} 
}    5 to_ticks: -150
}    6 to_ticks: -200
}    6 to_ticks: -250
}    9 to_ticks: -300
}    3 to_ticks: -350
} 
} Not sure if the distribution of times means anything.  I suppose next up would
} be adding stack traces and adding debugging code to whatever's calling this.
} 
} And yes, I applied the patches indicated in the PR (29134).
} 
} -s
>-- End of excerpt from Peter Seebach
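
For anyone who wants to try the same stopgap Peter describes (print to_ticks
and set it to zero rather than letting the KASSERT fire), here is a minimal
userland sketch of the idea.  The actual kernel change is not shown in this
thread, so the names clamp_to_ticks and neg_ticks_seen are hypothetical, and
fprintf stands in for whatever the kernel would use to log.  It only
illustrates the clamp-and-log logic, not the real NetBSD code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter: how many negative values have been clamped. */
static int neg_ticks_seen;

/*
 * Hypothetical helper, not NetBSD's code.  Instead of asserting on a
 * negative tick count, log the offending value and clamp it to zero
 * so the caller can carry on.
 */
static int
clamp_to_ticks(int to_ticks)
{
	if (to_ticks < 0) {
		/* Record the bad value so it can be tallied later. */
		fprintf(stderr, "to_ticks: %d\n", to_ticks);
		neg_ticks_seen++;
		to_ticks = 0;
	}

	/* The invariant the original KASSERT checked now holds. */
	assert(to_ticks >= 0);
	return to_ticks;
}
```

This hides the symptom only.  Whatever is computing a negative timeout is
still doing so, which is why the logged values are worth keeping.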