Re: sparse dumps (was: WAPL panic)
On Nov 9, 2012, at 08:01, Chuck Silvers <chuq%chuq.com@localhost> wrote:
> On Wed, Nov 07, 2012 at 02:22:49PM +0100, Edgar Fu wrote:
>>> Try to get a sparse dump via machdep.sparse_dump=1
>> How long is that supposed to take?
>> It said "dump", paused for a few seconds, then counted from 44 down to 38
>> then nothing happened for minutes. Until I hit the virtual reset button.
> I tried triggering a sparse dump (with "reboot -qd") on amd64
> and after a number of tries I did see the hang during the dump.
> but even when it doesn't hang, the resulting sparse dump is not valid:
> savecore: kvm_read: invalid translation (invalid level 4 PDE)
> sparse dumps appear to be a bit too sparse.
> after I fixed that (and the problem that causes the kernel to spew
> "pmap_kenter_pa: mapping already present"), the next problem was that savecore
> generates a useless kernel image file, so you need to ignore the one
> from savecore and use the kernel image you actually booted. this isn't
> specific to sparse dumps, it happens with both normal and sparse dumps.
> but once I get past all that, sparse dumps work for me on amd64.
> ... I later tried triggering a dump from ddb with "reboot 0x104"
> to make sure that my fix for the "mapping already present" thing
> would work in this context as well (since the last attempt to fix that
> resulted in a different hang), and I found that rebooting from ddb
> currently always hangs. I traced it as far as cpu_shutdown(),
> and it's not surprising that the xcalls from that also cause problems.
> I'm inclined to have pmf_system_shutdown() return without doing anything
> if panicstr is set, since the context in which this is called could cause
> a hang for any driver shutdown hook. does anyone have any other ideas
> on what to do about this?
> the attached patch fixes the amd64 kernel problems with sparse dumps for me,
> could you give that a try?
I have tested your patches for NetBSD-current on VMware Fusion (under Mac OS X).
Breaking into ddb and entering "reboot 0x104" results in a good core dump. As
you note, the kernel copy is invalid.
Thanks for the patches! I cannot remember the last time I was able to get a
workable core dump on amd64.
PS "vmstat -M netbsd.0.core -N /netbsd" results in
vmstat: can't dereference kptr 0x7f7fffffd780
vmstat: invalid translation (invalid level 4 PDE)
Adding specific options, e.g., -e, works fine.
PPS Is it now safe to enable core dumps on systems where the dump partition is
a sub-partition of a raidframe RAID 1 partition? This used to be warned against
in the old raidframe documentation but the warnings are gone in recent versions.
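
For reference, the sequence discussed in this thread can be sketched roughly as
below. This is only an illustration, not a tested procedure: the helper names
(run, enable_sparse_dump, trigger_dump, inspect_core) and the DRY_RUN variable
are made up for this sketch, and DRY_RUN defaults to 1 so that the commands are
printed rather than executed. The commands themselves (the
machdep.sparse_dump sysctl, "reboot -qd", and vmstat with -M/-N/-e) are the
ones mentioned above.

```shell
#!/bin/sh
# Sketch of the sparse-dump workflow from this thread (NetBSD/amd64).
# Hypothetical helper names; DRY_RUN=1 (the default) only prints commands.
# Set DRY_RUN=0 as root on a scratch machine to really run them.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: ask the kernel to write a sparse dump instead of a full one.
enable_sparse_dump() {
    run sysctl -w machdep.sparse_dump=1
}

# Step 2: force a crash dump and reboot ("reboot -qd" as used above).
trigger_dump() {
    run reboot -qd
}

# Step 3: after the reboot, once savecore has written the core file,
# inspect it against the kernel that was actually booted, not the
# kernel copy written by savecore.
# $1 = core file, $2 = booted kernel image
inspect_core() {
    run vmstat -M "$1" -N "$2" -e
}
```

With the defaults, "enable_sparse_dump; trigger_dump; inspect_core netbsd.0.core /netbsd"
just prints the three command lines. Passing /netbsd as the second argument
reflects the point made above that the kernel image saved by savecore is not
usable.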