netbsd-bugs: kern/11811: wddump kernel dumping failure

Subject: kern/11811: wddump kernel dumping failure
To: None <gnats-bugs@gnats.netbsd.org>
From: John Hawkinson <jhawk@mit.edu>
List: netbsd-bugs
Date: 12/24/2000 21:45:28
>Number:         11811
>Category:       kern
>Synopsis:       wddump kernel dumping failure
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 24 21:45:00 PST 2000
>Closed-Date:
>Last-Modified:
>Originator:     John Hawkinson
>Release:        netbsd-current of 23 Dec 2000
>Organization:
MIT
>Environment:
	
System: NetBSD zorkmid.mit.edu 1.5O NetBSD 1.5O (ZORKMID-$Revision: 1.5 $) #67: Sat Dec 23 17:45:30 EST 2000 jhawk@zorkmid.mit.edu:/usr/local/netbsd-current/src/sys/arch/i386/compile/ZORKMID i386


>Description:

	I was single-stepping through some UBC code trying to figure out
why a process seemed to be hung (it was an executable under COMPAT_PECOFF,
but I don't think that was really related. It was repeatedly getting
stuck in biowait(), and it appeared that uvm_fault was repeatedly
ubc_fault()-ing and calling genfs_getpages(); nevertheless, this is
probably not too relevant). I accidentally single-stepped through a
trap and into apm 16-bit land, and so ddb died.

It then tried to dump core, but seemed to fail with:

dump panic: wddump: polled command has been queued
panic: wdc_exec_command: polled command not done

I'm really not sure I understand. Tracebacks follow.
	
	
>How-To-Repeat:
	
I ran /win98/wavelan/bin/Wsu10604.exe under COMPAT_PECOFF,
not expecting it to work, but just fooling around. My disk
light went solid and it sat there taking up loads of CPU
for no good reason, spinning around in uvm/ubc code.

I single-stepped at the wrong place, and the following
was left over in my message buffer:

uvm_fault(0xc0588e40, 0x5000, 0, 1) -> 1
fatal page fault in supervisor mode
trap type 6 code 0 eip c02f95ae cs 8 eflags 10046 cr2 5d6b cpl e000ffef
panic: trap
Begin traceback...
trap() at trap+0x1e5
--- trap (number 6) ---
db_read_bytes(5d6b,4,c6c42e0c,c0585a00,c6c42e48) at db_read_bytes+0x12
db_get_value(5d6b,4,0,0,c6c42f04) at db_get_value+0x18
db_stop_at_pc(c0585a00,c6c42e48) at db_stop_at_pc+0xee
db_trap(5,0,1,c6c42eb4,c07c5400) at db_trap+0x48
kdb_trap(5,0,c6c42eb4) at kdb_trap+0xc6
trap() at trap+0x168
--- trap (number 5) ---
param.c(b,c6c42f54,c6c42f54,c6c42f40,c03abfaa) at     0x5d6b
apmcall_debug(b,c6c42f54,281,c6c42f70,c03abfe4) at apmcall_debug+0x2d
apm_get_event(c6c42f54) at apm_get_event+0x12
apm_periodic_check(c07c5400,c07c5450,2,0,c07c5400) at apm_periodic_check+0x38
apm_thread(c07c5400) at apm_thread+0x20
End traceback...
syncing disks... 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 giving up

dumping to dev 0,1 offset 396196
dump panic: wddump: polled command has been queued
Begin traceback...
wddump(1,5115b8,c6c42b14,200,8081d) at wddump+0x1de
cpu_dump(100,c045791b,100,3,2) at cpu_dump+0x101
dumpsys(c6c42d80,c6c42d74,c01bc105,100,0) at dumpsys+0xed
cpu_reboot(100,0,c6c42db4,0,6) at cpu_reboot+0x63
panic(c045791b,e000ffef,4,5,bfbf9) at panic+0xcd
trap() at trap+0x1e5
--- trap (number 6) ---
db_read_bytes(5d6b,4,c6c42e0c,c0585a00,c6c42e48) at db_read_bytes+0x12
db_get_value(5d6b,4,0,0,c6c42f04) at db_get_value+0x18
db_stop_at_pc(c0585a00,c6c42e48) at db_stop_at_pc+0xee
db_trap(5,0,1,c6c42eb4,c07c5400) at db_trap+0x48
kdb_trap(5,0,c6c42eb4) at kdb_trap+0xc6
trap() at trap+0x168
--- trap (number 5) ---
param.c(b,c6c42f54,c6c42f54,c6c42f40,c03abfaa) at     0x5d6b
apmcall_debug(b,c6c42f54,281,c6c42f70,c03abfe4) at apmcall_debug+0x2d
apm_get_event(c6c42f54) at apm_get_event+0x12
apm_periodic_check(c07c5400,c07c5450,2,0,c07c5400) at apm_periodic_check+0x38
apm_thread(c07c5400) at apm_thread+0x20
End traceback...

dumping to dev 0,1 offset 396196
dump device not ready


panic: wdc_exec_command: polled command not done

Begin traceback...
wdc_exec_command(c07c5cf8,c6c42958) at wdc_exec_command+0xca
wd_flushcache(c07be000,10,c6c42994,c01b2c9d,c07be000) at wd_flushcache+0x4d
wd_shutdown(c07be000) at wd_shutdown+0xd
doshutdownhooks(c6c429c8,c6c429bc,c01bc105,104,0) at doshutdownhooks+0x25
cpu_reboot(104,0,c03159c4,c07be000,1) at cpu_reboot+0x68
panic(c045d460,2,c6c42b4c,c03157d0,1) at panic+0xcd
wddump(1,5115b8,c6c42b14,200,8081d) at wddump+0x1de
cpu_dump(100,c045791b,100,3,2) at cpu_dump+0x101
dumpsys(c6c42d80,c6c42d74,c01bc105,100,0) at dumpsys+0xed
cpu_reboot(100,0,c6c42db4,0,6) at cpu_reboot+0x63
panic(c045791b,e000ffef,4,5,bfbf9) at panic+0xcd
trap() at trap+0x1e5
--- trap (number 6) ---
db_read_bytes(5d6b,4,c6c42e0c,c0585a00,c6c42e48) at db_read_bytes+0x12
db_get_value(5d6b,4,0,0,c6c42f04) at db_get_value+0x18
db_stop_at_pc(c0585a00,c6c42e48) at db_stop_at_pc+0xee
db_trap(5,0,1,c6c42eb4,c07c5400) at db_trap+0x48
kdb_trap(5,0,c6c42eb4) at kdb_trap+0xc6
trap() at trap+0x168
--- trap (number 5) ---
param.c(b,c6c42f54,c6c42f54,c6c42f40,c03abfaa) at     0x5d6b
apmcall_debug(b,c6c42f54,281,c6c42f70,c03abfe4) at apmcall_debug+0x2d
apm_get_event(c6c42f54) at apm_get_event+0x12
apm_periodic_check(c07c5400,c07c5450,2,0,c07c5400) at apm_periodic_check+0x38
apm_thread(c07c5400) at apm_thread+0x20
End traceback...

dumping to dev 0,1 offset 396196
dump device not ready


rebooting...

>Fix:
	
	Is something wrong with wddump? Is it unreasonable to expect it to
work from a trap in apmcall_debug()?
>Release-Note:
>Audit-Trail:
>Unformatted: