Subject: Re: Problems with NetBSD-current kernels after 2005-10-04
To: Chuck Silvers <chuq@chuq.com>
From: Patrick Welche <prlw1@newn.cam.ac.uk>
List: current-users
Date: 10/23/2005 16:24:44
On Sat, Oct 22, 2005 at 01:54:34PM -0700, Chuck Silvers wrote:
> hi,
> 
> On Sat, Oct 22, 2005 at 08:18:07PM +0100, Patrick Welche wrote:
> > The info says - it's named, so not what you have been seeing:
> > 
> > Oct 22 19:23:02 quartz syslogd: Exiting on signal 15
> > uvm_fault(0xc0537b80, 0xdeadb000, 0, 1) -> 0xe
> > kernel: supervisor trap page fault, code=0
> > Stopped in pid 317.1 (named) at 0xdeadbeef:     invalid address
> > db{1}> bt
> > acpi_softc(cdad564c,c32bf800,cdb13f9c,c02d5d8b,0) at 0xdeadbeef
> > sa_switchcall(cdad564c,2b,2b,2b,2b) at netbsd:sa_switchcall+0x44
> > db{1}> sync
> > syncing disks... panic: TLB IPI rendezvous failed (mask 1)
> > Stopped in pid 317.1 (named) at netbsd:breakpoint+0x4:  leave
> > db{1}> sync
> > 
> > dump to dev 4,1 not possible
> > rebooting...
> > 
> > This happened with today's cvs, while shutting down.  I remember
> > restarting named when I had the freeze mentioned in the original
> > email, but also had a build going at the same time, so it wasn't
> > obviously the named restart...
> 
> ok, that does look like a bug in the sa change.  we're jumping through
> a function pointer that has 0xdeadbeef as the value.  most likely the
> sau is being freed before we try to use it, but I don't see where.
> 
> next time you see this, try "reboot 0x104" instead of "sync",
> that's more likely to succeed in getting a dump.
> 
> ...on second thought, the "not possible" message is because either
> no dump device is configured or the device is too small to hold a dump.

reboot 0x104 gets me an instant reboot without even an attempt to dump :-(
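
For the archives, here is how I read the 0x104 argument. This assumes the
flag values are what I remember from sys/reboot.h (RB_NOSYNC = 0x004,
RB_DUMP = 0x100), so please check your own tree. The helper names below are
made up for illustration and are not kernel functions:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, as I remember them from NetBSD's <sys/reboot.h>. */
#define RB_NOSYNC 0x004	/* don't sync filesystems before rebooting */
#define RB_DUMP   0x100	/* write a crash dump before rebooting */

/* Hypothetical helpers: test individual bits of a "howto" argument. */
bool howto_wants_dump(int howto) { return (howto & RB_DUMP) != 0; }
bool howto_skips_sync(int howto) { return (howto & RB_NOSYNC) != 0; }

/* Build the value suggested above and check that it really is 0x104. */
int howto_dump_nosync(void)
{
	int howto = RB_DUMP | RB_NOSYNC;

	assert(howto == 0x104);
	return howto;
}
```

If those values are right, "reboot 0x104" asks for a dump and skips the sync
(the step that panicked above), which would explain why it was suggested as
more likely to get a dump than "sync".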

Patrick