Subject: kern/24410: Deadlock in sys_generic.c/kern_synch.c
To: None <gnats-bugs@gnats.netbsd.org>
From: Christian Biere <christianbiere@gmx.de>
List: netbsd-bugs
Date: 02/13/2004 08:24:35
>Number: 24410
>Category: kern
>Synopsis: Deadlock in sys_generic.c/kern_synch.c
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Feb 13 08:25:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator: Christian Biere
>Release: NetBSD 1.6ZJ
>Organization:
>Environment:
System: NetBSD Cyclonus 1.6ZJ NetBSD 1.6ZJ (STARSCREAM) #3: Wed Feb 11 11:27:20 CET 2004 root@Cyclonus:/usr/src/sys/arch/i386/compile/STARSCREAM i386
Architecture: i386
Machine: i386
$NetBSD: kern_sa.c,v 1.47 2004/01/02 18:52:17 cl Exp $
$NetBSD: kern_synch.c,v 1.140 2004/01/04 13:27:53 kleink Exp $
$NetBSD: sys_generic.c,v 1.80 2003/10/10 15:24:28 chs Exp $
>Description:
Currently my system locks up about once every 24 hours. The lockup is
mostly triggered by leaving the box idle and then, on coming back,
accessing big files, for example when burning a CD or verifying a
checksum.
db> bt
cpu_Debugger(c0320d80,c71f5424,c6c6ce5c,c01ab6e8,c02f51f4) at netbsd:cpu_Debugger+0x4
comintr(c09c9c00,c,c6c60010,c01a0030,c02f0010) at netbsd:comintr+0x5f9
Xintr_ioapic_edge4() at netbsd:Xintr_ioapic_edge4+0x92
--- interrupt ---
netbsd:ffs_genfsops:
db> ps
PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT
26984 7348 26984 1002 2 0x4002 1 bash ttyin
7348 4682 7348 1002 2 0x4100 1 aterm select
6559 15405 6559 1002 2 0x4002 1 openssl uvn_fp2
15405 9905 15405 1002 2 0x4002 1 bash wait
9905 4682 9905 1002 2 0x4100 1 aterm select
23919 8123 23919 1000 2 0x400a 1 mutt poll
6043 1648 6043 1002 2 0x4002 1 links select
8123 1 8123 1000 2 0x4002 1 bash wait
8554 1 8554 0 2 0x4002 1 bash ttyin
1648 6235 1648 1002 2 0x4002 1 bash wait
6235 4682 6235 1002 2 0x4101 1 aterm select
>How-To-Repeat:
Run X and BitTorrent.
Come back after 8-20 hours.
Run Mozilla.
Run md5 A_BIG_FILE.
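
For anyone without a suitable large file at hand, the last step can be approximated with a small script. This is only a sketch of the checksum workload, not a confirmed trigger for the lockup: the report used md5(1) on an existing file, whereas the helper names (`make_big_file`, `md5_of_file`), the 64 MiB size, and the zero-filled contents below are all assumptions chosen for illustration.

```python
import hashlib
import os
import tempfile


def make_big_file(path, size_mib=64):
    """Write size_mib MiB of zero bytes (size and contents are assumptions)."""
    block = bytes(1 << 20)  # one MiB of zeros
    with open(path, "wb") as f:
        for _ in range(size_mib):
            f.write(block)


def md5_of_file(path, chunk_size=1 << 20):
    """Stream the file through MD5 in fixed-size chunks and return the hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # read() returns b"" at end of file, which ends the iteration
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical stand-in for "md5 A_BIG_FILE" from the steps above.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "A_BIG_FILE")
        make_big_file(path)
        print(md5_of_file(path))
```

A file larger than physical memory is more likely to force real disk reads instead of being served from the cache, so `size_mib` may need to be raised considerably on a real test machine.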
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
29663 7441 7441 1002 2 0x4402 2 python2p2 *
7441 353 7441 1002 2 0x4002 1 sh wait
23977 1 23977 0 2 0x4002 1 getty ttyin
1771 1 1771 0 2 0x4002 1 getty ttyin
4697 359 359 1002 2 0x4400 3 MozillaFirebird- *
359 4682 359 1002 2 0x4000 1 sh wait
353 4879 353 1002 2 0x4002 1 bash wait
4879 4682 4879 1002 2 0x4101 1 aterm select
4682 4917 4917 1002 2 0x4008 1 blackbox select
460 4917 4917 1002 2 0x4000 1 gkrellm poll
290 4917 4917 1002 2 0x4008 1 bbkeys select
4917 4778 4917 1002 2 0x4000 1 sh wait
294 4778 294 1002 2 0x4000 1 XFree86 select
4778 1 4754 1002 2 0x4000 1 xinit wait
4462 1 4462 1015 2 0x100 1 syslogd poll
4603 1 4603 1007 2 0x100 1 ntpd pause
96 1 96 0 2 0x4002 1 getty ttyin
4561 1 4561 0 2 0 1 cron nanosle
236 1 236 0 2 0 1 mount_mfs mfsidl
4144 1 4144 0 2 0 1 mount_mfs mfsidl
11 1 11 0 2 0 1 mount_mfs mfsidl
9 0 0 0 2 0x20200 1 aiodoned aiodone
8 0 0 0 2 0x20200 1 ioflush syncer
7 0 0 0 2 0x20200 1 pagedaemon pgdaemo
6 0 0 0 2 0x20200 1 atapibus0 sccomp
5 0 0 0 2 0x20200 1 atabus1 atath
4 0 0 0 2 0x20200 1 atabus0 atath
3 0 0 0 2 0x20200 1 pms0 pmsrese
2 0 0 0 2 0x20200 1 sysmon smtaskq
1 0 1 0 2 0x4000 1 init wait
0 -1 0 0 2 0x20200 1 swapper schedul
db> cont
Stopped in pid 29663.7 (python2p2) at netbsd:cpu_Debugger+0x4: leave
db> bt
cpu_Debugger(0,3f9,302c5f8d,7fe,c09cc000) at netbsd:cpu_Debugger+0x4
comintr(c09c9c00,c,10,c7200030,c71f0010) at netbsd:comintr+0x5f9
Bad frame pointer: 0xc09c8780
db> sync
syncing disks...
simple_lock: lock held
lock: 0xc02f51f4, currently at: ../../../../kern/sys_generic.c:981
last locked: ../../../../kern/kern_synch.c:421
last unlocked: ../../../../kern/kern_sa.c:867
selwakeup(c03211ac,3fd,0,d,c72071d4) at netbsd:selwakeup+0xa1
logwakeup(c02ca7ca,5,0,0,c6c6cac0) at netbsd:logwakeup+0x9d
printf(c02ca7ca,0,c6c6cae4,c013cafa,100) at netbsd:printf+0x75
vfs_shutdown(d,30,c6c60010,c02b8560,d) at netbsd:vfs_shutdown+0x31
cpu_reboot(100,0,c6c6cbc4,c01712ff,30) at netbsd:cpu_reboot+0x18a
db_sync_cmd(30,0,168e14,c6c6cb2c,10) at netbsd:db_sync_cmd+0x24
db_command(c02fd510,c02b8560,c023dcf8,c022ff74,d) at netbsd:db_command+0xef
db_command_loop(c022ff74,73df,7,c720735d,0) at netbsd:db_command_loop+0x8c
db_trap(1,0,c0170ca0,c0304920,c022ff74) at netbsd:db_trap+0xdd
kdb_trap(1,0,c6c6cd80,1,1) at netbsd:kdb_trap+0x12f
trap() at netbsd:trap+0xda
--- trap (number 1) ---
cpu_Debugger(0,3f9,302c5f8d,7fe,c09cc000) at netbsd:cpu_Debugger+0x4
comintr(c09c9c00,c,10,c7200030,c71f0010) at netbsd:comintr+0x5f9
Bad frame pointer: 0xc09c8780
~~wdc_atapi_intr: unknown phase 0x1
done
unmounting /c (/dev/cgd0a)...
panic: ltsleep: l_stat 8 != LSONPROC
Stopped in pid 29663.7 (python2p2) at netbsd:cpu_Debugger+0x4: leave
db> sync
dumping to dev 0,1 offset 9095
dump panic: wddump: polled command has been queued
Stopped in pid 29663.7 (python2p2) at netbsd:cpu_Debugger+0x4: leave
db> sync
dumping to dev 0,1 offset 9095
dump device not ready
panic: wdc_exec_command: polled command not done
Stopped in pid 29663.7 (python2p2) at netbsd:cpu_Debugger+0x4: leave
db> sync
dumping to dev 0,1 offset 9095
dump device not ready
panic: kernel diagnostic assertion "_simple_lock_held((&sched_lock)) == 0"
failed: file "../../../../kern/kern_synch.c", line 679
Stopped in pid 29663.7 (python2p2) at netbsd:cpu_Debugger+0x4: leave
db> reboot
rebooting...