Re: HEADS UP: panic behaviour changed
On Sun, 1 Feb 2009, Robert Elz wrote:
> As I have said before, I don't really care which way the default
> for ddb.onpanic is set, but ...
> | 1) Not everyone runs X. Most servers use a serial console.
> Forget X when discussing this issue, X isn't an argument for anything,
> one way or the other. By the time X gets anywhere near the system,
> sysctl.conf has run, and the local system owner can trivially decide
> which behaviour works for them and insert the relevant line into
If you check the original email, X was the justification for making this
change. It's a bogus justification, but I don't think we can ignore it.
> | 3) There is a period of time between loading the kernel and when the rc
> | scripts run where you can't tweak the ddb_onpanic value.
> Yes, this is why the kernel needs a default value, and why we can
> sensibly discuss what that default value should be. By itself it
> doesn't say anything about which particular value should be the
> default however.
The only time the default setting of this value matters is between the
time the power is turned on and the time the rc script runs that can
change the sysctl value. Once that happens it is no longer the default
behavior but whatever the sysadmin managing the system desires.
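For reference, a minimal sketch of the override that the rc scripts would apply, assuming the usual /etc/sysctl.conf mechanism and the ddb.onpanic variable named earlier in this thread. The value shown is only an illustration, not a recommendation:

```shell
# /etc/sysctl.conf fragment (config, not a script); applied by the rc
# scripts at boot, so it has no effect on a panic that happens earlier.
# Non-zero: drop to the db> prompt on panic.  0: do not enter ddb.
ddb.onpanic=1
```

Until that line is processed, the kernel uses its compiled-in default, which a NetBSD kernel config sets with "options DDB_ONPANIC".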
> | 4) If the machine panics early, say during device configuration due to
> | broken hardware, you don't really want it to attempt to reboot, since
> | it will result in an infinite reboot loop.
> Yes, perhaps - it depends upon the cause of the panic, but this
> can certainly happen. But I'm not sure this is any worse (or
> better) than the infinite loop the kernel is sitting in waiting
> for a reply to the db> prompt. Both require user interaction,
> and nothing proceeds until a user has done something to alter the
> state of the system.
Well, no. If the system drops to the db> prompt, then it requires user
intervention. Presumably, all the information about the cause of the
panic is also sitting there on the screen and has not scrolled off so the
admin can make an intelligent decision about what the corrective action
If, on the other hand, the system is left to attempt to dump core and then
try an automatic reboot, a lot of potentially destructive operations
could happen.
Each time the system tries to reboot there will be a set of resets and
possibly power-cycles. Excessive resets or power-cycling can potentially
damage integrated circuits through thermal cycling or disks through
> As an alternative, if the system panics due to a corrupted filesystem
> that was incorrectly marked clean, then ddb is of no practical use
> and a reboot will detect the unclean filesys and fsck (and either
> fix, or at least tell the user what the problem is).
We are talking very early in the boot process. I have never seen the case
where a filesystem is so corrupt that fsck is able to clean it but the
kernel still takes a panic after fsck runs. It used to be that if fsck
fixed certain problems in the root filesystem the rc scripts would
automatically reboot the system. I assume that's still the case and a
reboot won't stop at the db> prompt.
OTOH, if you keep running fsck only part way on the filesystem, you may
end up doing irreparable damage to it.
And if the system manages to mount the filesystem and run savecore each
time before it gets to the panic, you end up filling up the root
filesystem with a series of useless coredumps.
Finally, if the system is stuck in a panic loop, how do you diagnose the
problem? The system boots, prints a panic message, and then it resets
itself and starts printing the firmware messages which cause the panic
message to scroll off the screen. I suppose if you're lucky and you can
convince the machine to get into single-user mode, you can manually set
ddb.onpanic=1 and then switch to multi-user mode to continue diagnosis.
But if you can't get to the single-user shell you are SOL and probably
won't be able to figure out what's causing the problem let alone how to