Subject: kern/29400: Syncing disk fails
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <kessi@teles.de>
List: netbsd-bugs
Date: 02/16/2005 15:05:01
>Number:         29400
>Category:       kern
>Synopsis:       Syncing disk fails
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Feb 16 15:05:00 +0000 2005
>Originator:     Jens Kessmeier
>Release:        2.0
>Organization:
TELES AG
>Environment:
NetBSD iswitch2.teles.de 2.0 NetBSD 2.0 (GENERIC) #0: Wed Dec  1 10:58:25 UTC 2004  builds@build:/big/builds/ab/netbsd-2-0-RELEASE/i386/200411300000Z-obj/big/builds/ab/netbsd-2-0-RELEASE/src/sys/arch/i386/compile/GENERIC i386

>Description:
Some times server kernel software crashes the system and this
results in a kernel core dump. This nice feature, the kernel core dump,
helps to find the faulty code. So it is very, very useful.
On our scsi machines there is offten no kernel core dump after a crash.
So, i coded down a sample LKM with a panic call (see below) and
set the ddb.onpanic kernel option to zero. Then i start the LKM.
The last message on the console was syncing disks... and nothing else happened. Fuck, reset, next try, everything works fine, next try, syncing disk.... nothing, next try and so on. I try to debug this, but
i give up. See the code below WITH_CALLOUT was not defined. Grummel, i decided now to move the panic call from the LKM entry to a callout callback (WITH_CALLOUT is defined). Start the LKM again. No kernel crash dump, but the system reboots every time. The output (/var/log/messges) is behind the sample LKM.

#include <sys/systm.h>
#include <sys/conf.h>
#include <sys/lkm.h>
#include <sys/errno.h>
#include <sys/param.h>

#define __STATIC__

MOD_MISC("tlspanic");

#ifdef WITH_CALLOUT

__STATIC__ void
tlspanic_panic(
	void	)
{
	panic("CALLOUT: 0x%x 0x%x 0x%x 0x%x 0x%x\n", 1, 2, 3, 4, 5);
}

__STATIC__ void
tlspanic_callout(
	void	*vp	)
{
	tlspanic_panic();
}

#endif

__STATIC__ int
tlspanic_handle_load(
	struct lkm_table	*lkmtp,
	int			 cmd	)
{
#ifdef WITH_CALLOUT
	static struct callout	c;
#endif
	if(lkmexists(lkmtp)) {
		return(EEXIST);
	}
#ifdef WITH_CALLOUT
	callout_init(&c);
	callout_reset(&c, 500, tlspanic_callout, NULL);
#else
	panic("BUGCHECK: 0x%x 0x%x 0x%x 0x%x 0x%x\n", 1, 2, 3, 4, 5);
#endif
	return(0);
}

__STATIC__ int
tlspanic_handle_unload(
	struct lkm_table	*lkmtp,
	int			 cmd	)
{
	return(0);
}

__STATIC__ int
tlspanic_handle_stat(
	struct lkm_table	*lkmtp,
	int			 cmd	)
{
	return(0);
}


int
tlspanic_lkmentry(
	struct lkm_table	*lkmtp,
	int			 cmd,
	int			 ver	)	
{
	DISPATCH(	lkmtp, cmd, ver,
			tlspanic_handle_load,
			tlspanic_handle_unload,
			tlspanic_handle_stat	);
}


Feb 16 15:00:23 iswitch2 /netbsd: panic: CALLOUT: 0x1 0x2 0x3 0x4 0x5
Feb 16 15:00:23 iswitch2 /netbsd: 
Feb 16 15:00:23 iswitch2 /netbsd: Begin traceback...
Feb 16 15:00:23 iswitch2 /netbsd: tlspanic_panic(0,c0896e50,c0afab60,0,c0896e7c) at tlspanic:tlspanic_panic+0x1a
Feb 16 15:00:23 iswitch2 /netbsd: softclock(0,c0896e70,c04f5569,c0896e7c,c012702d) at netbsd:softclock+0x259
Feb 16 15:00:23 iswitch2 /netbsd: softintr_dispatch(0,5f090010,30,13dd0010,10) at netbsd:softintr_dispatch+0x73
Feb 16 15:00:23 iswitch2 /netbsd: Xsoftclock() at netbsd:Xsoftclock+0x25
Feb 16 15:00:23 iswitch2 /netbsd: --- interrupt ---
Feb 16 15:00:23 iswitch2 /netbsd: cpu_switch(c07c97e0,0,c0896f48,c036f8ee,1) at netbsd:cpu_switch+0x9f
Feb 16 15:00:23 iswitch2 /netbsd: ltsleep(c07c9640,4,c06e83dc,0,0) at netbsd:ltsleep+0x323
Feb 16 15:00:23 iswitch2 /netbsd: uvm_scheduler(c07c9620,0,c07d619c,c06cd1f8,0) at netbsd:uvm_scheduler+0x74
Feb 16 15:00:23 iswitch2 /netbsd: main(0,0,0,0,0) at netbsd:main+0x683
Feb 16 15:00:23 iswitch2 /netbsd: End traceback...
Feb 16 15:00:23 iswitch2 /netbsd: syncing disks... 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 giving up
Feb 16 15:00:23 iswitch2 /netbsd: 
Feb 16 15:00:23 iswitch2 /netbsd: dumping to dev 4,1 offset 1511
Feb 16 15:00:23 iswitch2 /netbsd: dump i/o error
Feb 16 15:00:23 iswitch2 /netbsd: 
Feb 16 15:00:23 iswitch2 /netbsd: 
Feb 16 15:00:23 iswitch2 /netbsd: sd0(ahc1:0:6:0): polling command not done
Feb 16 15:00:23 iswitch2 /netbsd: panic: scsipi_execute_xs
Feb 16 15:00:23 iswitch2 /netbsd: Begin traceback...
Feb 16 15:00:23 iswitch2 /netbsd: scsipi_execute_xs(c0b8b8ac,c0896d3c,a,0,a) at netbsd:scsipi_execute_xs+0x1b9
Feb 16 15:00:23 iswitch2 /netbsd: scsi_scsipi_cmd(c0bd2400,0,c0896d3c,a,0) at netbsd:scsi_scsipi_cmd+0x2e
Feb 16 15:00:23 iswitch2 /netbsd: scsipi_command(c0bd2400,0,c0896d3c,a,0) at netbsd:scsipi_command+0x48
Feb 16 15:00:23 iswitch2 /netbsd: sd_flush(c0bd3600,3,5,0,100) at netbsd:sd_flush+0x6d
Feb 16 15:00:24 iswitch2 /netbsd: sd_shutdown(c0bd3600,0,0,100,100) at netbsd:sd_shutdown+0x23
Feb 16 15:00:24 iswitch2 /netbsd: doshutdownhooks(c504f160,0,c0896de4,c037c4d8,100) at netbsd:doshutdownhooks+0x2a
Feb 16 15:00:24 iswitch2 /netbsd: cpu_reboot(100,0,c0896e18,0,1) at netbsd:cpu_reboot+0x5e
Feb 16 15:00:24 iswitch2 /netbsd: panic(c504f160,1,2,3,4) at netbsd:panic+0x108
Feb 16 15:00:24 iswitch2 /netbsd: tlspanic_panic(0,c0896e50,c0afab60,0,c0896e7c) at tlspanic:tlspanic_panic+0x1a
Feb 16 15:00:24 iswitch2 /netbsd: softclock(0,c0896e70,c04f5569,c0896e7c,c012702d) at netbsd:softclock+0x259
Feb 16 15:00:24 iswitch2 /netbsd: softintr_dispatch(0,5f090010,30,13dd0010,10) at netbsd:softintr_dispatch+0x73
Feb 16 15:00:24 iswitch2 /netbsd: Xsoftclock() at netbsd:Xsoftclock+0x25
Feb 16 15:00:24 iswitch2 /netbsd: --- interrupt ---
Feb 16 15:00:24 iswitch2 /netbsd: cpu_switch(c07c97e0,0,c0896f48,c036f8ee,1) at netbsd:cpu_switch+0x9f
Feb 16 15:00:24 iswitch2 /netbsd: ltsleep(c07c9640,4,c06e83dc,0,0) at netbsd:ltsleep+0x323
Feb 16 15:00:24 iswitch2 /netbsd: uvm_scheduler(c07c9620,0,c07d619c,c06cd1f8,0) at netbsd:uvm_scheduler+0x74
Feb 16 15:00:24 iswitch2 /netbsd: main(0,0,0,0,0) at netbsd:main+0x683
Feb 16 15:00:24 iswitch2 /netbsd: End traceback...
Feb 16 15:00:24 iswitch2 /netbsd: 
Feb 16 15:00:24 iswitch2 /netbsd: dumping to dev 4,1 offset 1511
Feb 16 15:00:24 iswitch2 /netbsd: dump device not ready
Feb 16 15:00:24 iswitch2 /netbsd: 
Feb 16 15:00:24 iswitch2 /netbsd: 
Feb 16 15:00:24 iswitch2 /netbsd: rebooting...
Feb 16 15:00:24 iswitch2 /netbsd: NetBSD 2.0 (GENERIC) #0: Wed Dec  1 10:58:25 UTC 2004
Feb 16 15:00:24 iswitch2 /netbsd: 	builds@build:/big/builds/ab/netbsd-2-0-RELEASE/i386/200411300000Z-obj/big/builds/ab/netbsd-2-0-RELEASE/src/sys/arch/i386/compile/GENERIC
Feb 16 15:00:24 iswitch2 /netbsd: total memory = 127 MB
Feb 16 15:00:24 iswitch2 /netbsd: avail memory = 117 MB
Feb 16 15:00:24 iswitch2 /netbsd: BIOS32 rev. 0 found at 0xfd820
Feb 16 15:00:24 iswitch2 /netbsd: mainbus0 (root)
Feb 16 15:00:24 iswitch2 /netbsd: cpu0 at mainbus0: (uniprocessor)
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: Intel Mobile Pentium II (686-class), 333.29 MHz, id 0x66a
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: features 183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR>
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: features 183f9ff<PGE,MCA,CMOV,PAT,PSE36,MMX>
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: features 183f9ff<FXSR>
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: I-cache 16 KB 32B/line 4-way, D-cache 16 KB 32B/line 4-way
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: L2 cache 256 KB 32B/line 4-way
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
Feb 16 15:00:24 iswitch2 /netbsd: cpu0: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
Feb 16 15:00:25 iswitch2 /netbsd: cpu0: 16 page colors
Feb 16 15:00:25 iswitch2 /netbsd: pci0 at mainbus0 bus 0: configuration mode 1
Feb 16 15:00:25 iswitch2 /netbsd: pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
Feb 16 15:00:25 iswitch2 /netbsd: pchb0 at pci0 dev 0 function 0
Feb 16 15:00:25 iswitch2 /netbsd: pchb0: Intel 82443BX Host Bridge/Controller (rev. 0x03)
Feb 16 15:00:25 iswitch2 /netbsd: agp0 at pchb0: aperture at 0xf8000000, size 0x4000000
Feb 16 15:00:25 iswitch2 /netbsd: ppb0 at pci0 dev 1 function 0: Intel 82443BX AGP Interface (rev. 0x03)
Feb 16 15:00:25 iswitch2 /netbsd: pci1 at ppb0 bus 1
Feb 16 15:00:25 iswitch2 /netbsd: pci1: i/o space, memory space enabled
Feb 16 15:00:25 iswitch2 /netbsd: vga1 at pci1 dev 0 function 0: Intel i740 Graphics Accelerator (rev. 0x21)
Feb 16 15:00:25 iswitch2 /netbsd: wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
Feb 16 15:00:25 iswitch2 /netbsd: wsmux1: connecting to wsdisplay0
Feb 16 15:00:25 iswitch2 /netbsd: pcib0 at pci0 dev 7 function 0
Feb 16 15:00:25 iswitch2 /netbsd: pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
Feb 16 15:00:25 iswitch2 /netbsd: piixide0 at pci0 dev 7 function 1
Feb 16 15:00:25 iswitch2 /netbsd: piixide0: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
Feb 16 15:00:25 iswitch2 /netbsd: piixide0: device disabled (at device)
Feb 16 15:00:25 iswitch2 /netbsd: uhci0 at pci0 dev 7 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
Feb 16 15:00:26 iswitch2 /netbsd: uhci0: interrupting at irq 10
Feb 16 15:00:26 iswitch2 /netbsd: usb0 at uhci0: USB revision 1.0
Feb 16 15:00:26 iswitch2 /netbsd: uhub0 at usb0
Feb 16 15:00:26 iswitch2 /netbsd: uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
Feb 16 15:00:26 iswitch2 /netbsd: uhub0: 2 ports with 2 removable, self powered
Feb 16 15:00:26 iswitch2 /netbsd: Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 7 function 3 not configured
Feb 16 15:00:26 iswitch2 /netbsd: ppb1 at pci0 dev 17 function 0: Digital Equipment DECchip 21150 PCI-PCI Bridge (rev. 0x04)
Feb 16 15:00:26 iswitch2 /netbsd: pci2 at ppb1 bus 2
Feb 16 15:00:26 iswitch2 /netbsd: pci2: i/o space, memory space enabled, rd/line, wr/inv ok
Feb 16 15:00:26 iswitch2 /netbsd: fxp0 at pci0 dev 18 function 0: i82559 Ethernet, rev 8
Feb 16 15:00:26 iswitch2 /netbsd: fxp0: interrupting at irq 5
Feb 16 15:00:26 iswitch2 /netbsd: fxp0: Ethernet address 08:00:3e:2a:36:63
Feb 16 15:00:26 iswitch2 /netbsd: inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
Feb 16 15:00:27 iswitch2 /netbsd: inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
Feb 16 15:00:27 iswitch2 /netbsd: fxp1 at pci0 dev 19 function 0: i82559 Ethernet, rev 8
Feb 16 15:00:27 iswitch2 /netbsd: fxp1: interrupting at irq 10
Feb 16 15:00:27 iswitch2 /netbsd: fxp1: Ethernet address 08:00:3e:2a:36:62
Feb 16 15:00:27 iswitch2 /netbsd: inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
Feb 16 15:00:27 iswitch2 /netbsd: inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
Feb 16 15:00:27 iswitch2 /netbsd: ahc1 at pci0 dev 20 function 0: Adaptec aic7880 Ultra SCSI adapter
Feb 16 15:00:27 iswitch2 /netbsd: ahc1: interrupting at irq 10
Feb 16 15:00:27 iswitch2 /netbsd: ahc1: aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
Feb 16 15:00:27 iswitch2 /netbsd: scsibus0 at ahc1: 16 targets, 8 luns per target
Feb 16 15:00:27 iswitch2 /netbsd: isa0 at pcib0
Feb 16 15:00:27 iswitch2 /netbsd: com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
Feb 16 15:00:27 iswitch2 /netbsd: com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
Feb 16 15:00:27 iswitch2 /netbsd: pckbc0 at isa0 port 0x60-0x64
Feb 16 15:00:27 iswitch2 /netbsd: pckbd0 at pckbc0 (kbd slot)
Feb 16 15:00:27 iswitch2 /netbsd: pckbc0: using irq 1 for kbd slot
Feb 16 15:00:27 iswitch2 /netbsd: wskbd0 at pckbd0: console keyboard, using wsdisplay0
Feb 16 15:00:27 iswitch2 /netbsd: pms0 at pckbc0 (aux slot)
Feb 16 15:00:27 iswitch2 /netbsd: pckbc0: using irq 12 for aux slot
Feb 16 15:00:27 iswitch2 /netbsd: wsmouse0 at pms0 mux 0
Feb 16 15:00:27 iswitch2 /netbsd: pcppi0 at isa0 port 0x61
Feb 16 15:00:27 iswitch2 /netbsd: midi0 at pcppi0: PC speaker
Feb 16 15:00:27 iswitch2 /netbsd: sysbeep0 at pcppi0
Feb 16 15:00:27 iswitch2 /netbsd: isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
Feb 16 15:00:27 iswitch2 /netbsd: npx0 at isa0 port 0xf0-0xff: using exception 16
Feb 16 15:00:27 iswitch2 /netbsd: fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
Feb 16 15:00:28 iswitch2 /netbsd: isapnp0: no ISA Plug 'n Play devices found
Feb 16 15:00:28 iswitch2 /netbsd: fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
Feb 16 15:00:28 iswitch2 /netbsd: Kernelized RAIDframe activated
Feb 16 15:00:28 iswitch2 /netbsd: scsibus0: waiting 2 seconds for devices to settle...
Feb 16 15:00:28 iswitch2 /netbsd: sd0 at scsibus0 target 6 lun 0: <IBM, DDRS-34560W, S97B> disk fixed
Feb 16 15:00:28 iswitch2 /netbsd: sd0: 4357 MB, 8387 cyl, 5 head, 212 sec, 512 bytes/sect x 8925000 sectors
Feb 16 15:00:28 iswitch2 /netbsd: sd0: sync (172.00ns offset 8), 16-bit (11.626MB/s) transfers, tagged queueing
Feb 16 15:00:28 iswitch2 /netbsd: boot device: sd0
Feb 16 15:00:28 iswitch2 /netbsd: root on sd0a dumps on sd0b
Feb 16 15:00:28 iswitch2 /netbsd: root file system type: ffs
Feb 16 15:00:28 iswitch2 /netbsd: wsdisplay0: screen 1 added (80x25, vt100 emulation)
Feb 16 15:00:28 iswitch2 /netbsd: wsdisplay0: screen 2 added (80x25, vt100 emulation)
Feb 16 15:00:28 iswitch2 /netbsd: wsdisplay0: screen 3 added (80x25, vt100 emulation)
Feb 16 15:00:28 iswitch2 /netbsd: wsdisplay0: screen 4 added (80x25, vt100 emulation)
Feb 16 15:00:24 iswitch2 savecore: reboot after panic: panic: trap
Feb 16 15:00:24 iswitch2 savecore: no dump, not enough free space in /var/crash


>How-To-Repeat:

>Fix: