Subject: Re: Problems with NetBSD-current kernels after 2005-10-04
To: Chuck Silvers <chuq@chuq.com>
From: Patrick Welche <prlw1@newn.cam.ac.uk>
List: current-users
Date: 10/23/2005 15:57:37
On Sat, Oct 22, 2005 at 01:54:34PM -0700, Chuck Silvers wrote:
> hi,
>
> On Sat, Oct 22, 2005 at 08:18:07PM +0100, Patrick Welche wrote:
> > The info says - it's named, so not what you have been seeing:
> >
> > Oct 22 19:23:02 quartz syslogd: Exiting on signal 15
> > uvm_fault(0xc0537b80, 0xdeadb000, 0, 1) -> 0xe
> > kernel: supervisor trap page fault, code=0
> > Stopped in pid 317.1 (named) at 0xdeadbeef: invalid address
> > db{1}> bt
> > acpi_softc(cdad564c,c32bf800,cdb13f9c,c02d5d8b,0) at 0xdeadbeef
> > sa_switchcall(cdad564c,2b,2b,2b,2b) at netbsd:sa_switchcall+0x44
> > db{1}> sync
> > syncing disks... panic: TLB IPI rendezvous failed (mask 1)
> > Stopped in pid 317.1 (named) at netbsd:breakpoint+0x4: leave
> > db{1}> sync
> >
> > dump to dev 4,1 not possible
> > rebooting...
> >
> > This happened with today's cvs, while shutting down. I remember
> > restarting named when I had the freeze mentioned in the original
> > email, but also had a build going at the same time, so it wasn't
> > obviously the named restart...

I can reproduce this at will simply with "/etc/rc.d/named restart".

> ok, that does look like a bug in the sa change. we're jumping through
> a function pointer that has 0xdeadbeef as the value. most likely the
> sau is being freed before we try to use it, but I don't see where.
>
> next time you see this, try "reboot 0x104" instead of "sync",
> that's more likely to succeed in getting a dump.
>
> ...on second thought, the "not possible" message is because either
> no dump device is configured or the device is too small to hold a dump.

Bother: I tried boot -a, with the dump device set to sd1e, but got:

kernel: supervisor trap page fault, code=0
Stopped in pid 337.1 (named) at 0xdeadbeef: invalid address
db{0}> bt
acpi_softc(cda7464c,c0513100,cd81ff9c,c02d0813,0) at 0xdeadbeef
sa_switchcall(cda7464c,2b,2b,2b,2b) at netbsd:sa_switchcall+0x44
db{0}> sync
panic: TLB IPI rendezvous failed (mask 2)
Stopped in pid 337.1 (named) at netbsd:breakpoint+0x4: leave
db{0}> sync
dump to dev 4,12 not possible
rebooting...

The dump partition doesn't *have* to be swap, does it?

#        size    offset     fstype [fsize bsize cpg/sgs]
 e:   8401200   1218174     4.2BSD   1024  8192 46168  # (Cyl.    239*-   1893*)
(it wasn't mounted)
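
As a rough sanity check on the numbers above, here is a small Python sketch. It assumes 512-byte sectors and 8 partitions per disk unit (the usual i386 layout); neither value appears in the output above, and the helper names are made up for illustration. Under those assumptions, minor 12 in "dev 4,12" decodes to unit 1, partition e, which matches sd1e, and the partition comes to roughly 4 GiB.

```python
# Sketch only: both constants are assumptions, not taken from the output above.
SECTOR_BYTES = 512   # assumed sector size
MAXPARTITIONS = 8    # assumed number of partitions per disk unit (i386 layout)


def decode_minor(minor, maxpartitions=MAXPARTITIONS):
    """Split a disk minor number into (unit, partition letter)."""
    unit, part = divmod(minor, maxpartitions)
    return unit, chr(ord("a") + part)


def partition_bytes(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a disklabel size in sectors to bytes."""
    return sectors * sector_bytes


if __name__ == "__main__":
    # Minor numbers from the two "dump to dev 4,N not possible" messages.
    for minor in (1, 12):
        unit, letter = decode_minor(minor)
        print(f"minor {minor} -> unit {unit}, partition {letter}")

    # Size of partition e from the disklabel line.
    size = partition_bytes(8401200)
    print(f"partition e: {size} bytes, about {size / 2**30:.2f} GiB")
```

If the machine has more RAM than that, the partition would be too small to hold a full dump, which is one of the two causes of the "not possible" message mentioned above.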
I'll try the reboot 0x104 next...
Cheers,
Patrick