Subject: port-i386/13288: savecore causes a double panic
To: None <gnats-bugs@gnats.netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: netbsd-bugs
Date: 06/23/2001 14:14:37
>Number:         13288
>Category:       port-i386
>Synopsis:       savecore causes a double panic
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-i386-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jun 23 11:13:00 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Greg A. Woods
>Release:        2001/06/19
>Organization:
Planix, Inc.; Toronto, Ontario; Canada
>Environment:

System: NetBSD proven 1.5W NetBSD 1.5W (PROVEN) #2: Tue Jun 19 21:48:56 EDT 2001 woods@proven:/work/woods/NetBSD-src/sys/arch/i386/compile/PROVEN i386
Architecture: i386
Machine: i386

>Description:

	savecore causes a double panic as it tries to read the crash
	dump from the dump device.  If the system is configured to
	reboot on panic, this sends it into an endless
	crash-reboot-crash cycle.

	Here's the console output from one:

	       Checking for core dump...
	       dumplo = 318001152 (621096 * 512)
	       panic: getblk: block size invariant failed
	       Begin traceback...
	       getblk(d02aabec,f7720,800,0,0) at getblk+0xd0
	       breadn(d02aabec,f7720,800,d0a8ce38,d0a8ce3c) at breadn+0x2b
	       spec_read(d0a8ce80,d0a8ce94,c01cbe3e,d0a8ce80,0) at spec_read+0x23f
	       ufsspec_read(d0a8ce80,0,1000,d0290600,d0a8ce80) at ufsspec_read+0x2d
	       vn_read(d0290600,d0a8cf34,d0a8cecc,c08a3f00,0) at vn_read+0xba
	       dofileread(d0286ab4,8,d0290600,8189ff0,1000) at dofileread+0x93
	       sys_pread(d0286ab4,d0a8cf80,d0a8cf78) at sys_pread+0xe8
	       syscall_plain(1f,1f,1f,1f,1000) at syscall_plain+0x98
	       End traceback...
	       syncing disks... panic: lockmgr: locking against myself
	       Begin traceback...
	       lockmgr(d02aac70,10012,d02aabec,d0a8cc98,c01cc3b3) at lockmgr+0x556
	       genfs_lock(d0a8cc8c) at genfs_lock+0x18
	       vn_lock(d02aabec,10012,d02aabec,d0a8ccd8,d0a8ccfc) at vn_lock+0x63
	       vget(d02aabec,10012) at vget+0xbe
	       ffs_sync(c0923000,2,c08a3f00,d0286ab4) at ffs_sync+0x9c
	       sys_sync(d0286ab4,0,0,100,c0389440) at sys_sync+0x56
	       vfs_shutdown(d0a8cd88,d0a8cd7c,c01aadb5,100,0) at vfs_shutdown+0x64
	       cpu_reboot(100,0,0,c4a14488,800) at cpu_reboot+0x3b
	       panic(c0389440,d0286ab4,800,200,0) at panic+0xf5
	       getblk(d02aabec,f7720,800,0,0) at getblk+0xd0
	       breadn(d02aabec,f7720,800,d0a8ce38,d0a8ce3c) at breadn+0x2b
	       spec_read(d0a8ce80,d0a8ce94,c01cbe3e,d0a8ce80,0) at spec_read+0x23f
	       ufsspec_read(d0a8ce80,0,1000,d0290600,d0a8ce80) at ufsspec_read+0x2d
	       vn_read(d0290600,d0a8cf34,d0a8cecc,c08a3f00,0) at vn_read+0xba
	       dofileread(d0286ab4,8,d0290600,8189ff0,1000) at dofileread+0x93
	       sys_pread(d0286ab4,d0a8cf80,d0a8cf78) at sys_pread+0xe8
	       syscall_plain(1f,1f,1f,1f,1000) at syscall_plain+0x98
	       End traceback...
	       
	       dumping to dev 4,1 offset 621096
	       dump 191 190 ^]
	       telnet> send brk
	       189 Stopped in pid 113 (savecore) at        cpu_Debugger+0x4:       leave
	       db> 

	Note that even interrupting the dump as I did above, which
	normally leaves an invalid dump image that savecore will refuse
	to load, doesn't prevent another crash.  The mere attempt by
	savecore to access the dump device seems to cause the panic.
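
	For anyone reading the trace, here is a small sketch that decodes
	the numbers in the console output above.  It assumes the usual
	512-byte disk block (as the "(621096 * 512)" in the dumplo line
	implies) and assumes the second and third words printed for
	getblk() are the block number and the requested size in hex, and
	that the 1000 in the dofileread() and ufsspec_read() frames is the
	read length.  The traceback words are raw stack contents, so that
	reading is a guess, not something the trace confirms.  The
	variable names are made up for illustration.

```python
# Decode the figures from the console output (assumptions noted inline).

DEV_BSIZE = 512  # bytes per disk block, implied by "(621096 * 512)"


def blocks_to_bytes(blocks, bsize=DEV_BSIZE):
    """Convert a count of disk blocks to a byte offset."""
    return blocks * bsize


# "dumplo = 318001152 (621096 * 512)"
dumplo_blocks = 621096
dumplo_bytes = blocks_to_bytes(dumplo_blocks)

# Assumed reading of "getblk(d02aabec,f7720,800,0,0)":
# second word = block number, third word = requested size, both hex.
getblk_blkno = int("f7720", 16)
getblk_size = int("800", 16)

# Assumed reading of the 1000 in the dofileread()/ufsspec_read() frames:
# the length of the read, in hex.
pread_len = int("1000", 16)

print("dumplo bytes:", dumplo_bytes)
print("getblk block:", getblk_blkno)
print("getblk size :", getblk_size)
print("pread length:", pread_len)
```

	Under those assumptions the dumplo value is self-consistent, and
	the size handed to getblk() is half the length of the read
	request.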

>How-To-Repeat:

	Not sure.  The original panic had long since scrolled off the
	top of my console terminal's screen buffer, since the machine
	had been in a loop of rebooting and crashing for over six
	hours.

	Try causing a crash dump and then running 'savecore'.  :-)

	I don't really relish the idea of trying to repeat this, at
	least not until I get another test machine running.

>Fix:

	Unknown.

	As a temporary work-around, set "savecore=NO" in /etc/rc.conf,
	and/or run 'sysctl -w ddb.onpanic=1' so that a panic drops into
	ddb instead of rebooting.  :-)

>Release-Note:
>Audit-Trail:
>Unformatted: