Subject: kern/36395: _fstrans_start panic while executing umount
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Bernd Ernesti <pr200703@veego.de>
List: netbsd-bugs
Date: 05/28/2007 19:55:00
>Number:         36395
>Category:       kern
>Synopsis:       _fstrans_start panic while executing umount
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May 28 19:55:00 +0000 2007
>Originator:     Bernd Ernesti
>Release:        NetBSD 4.99.20
>Organization:
	
>Environment:
System: NetBSD 4.99.20
Architecture: i386
Machine: i386
>Description:
	I got a panic while running sync immediately followed by umount -a:

_fstrans_start with held simple_lock 0xc07fa1e4 CPU 1 /src/sys/kern/vfs_syscalls.c:630
uvm_fault(0xd03b8934, 0, 1) -> 0xe   
kernel: supervisor trap page fault, code=0
Stopped in pid 18146.1 (umount) at      netbsd:db_read_bytes+0x30:      movl    0 (%esi),%eax
db{1}> bt      
db_read_bytes(1,4,d6a1e7f4,e18dba1c,a1e7f8) at netbsd:db_read_bytes+0x30 
db_get_value(1,4,0,72747366,5f736e61) at netbsd:db_get_value+0x27
db_stack_trace_print(d6a1e8e0,1,ffff,c07590b9,c03e1b20) at netbsd:db_stack_trace_print+0x527
simple_lock_only_held(0,c0681c71,0,c03e36da,0) at netbsd:simple_lock_only_held+0x104
_fstrans_start(c3cda000,1,1,0,0) at netbsd:_fstrans_start+0x24
ffs_fsync(d6a1e9e8,17665be,d6a1ea0c,c044fe00,d6a1e9f0) at netbsd:ffs_fsync+0x32
VOP_FSYNC(e18dba1c,ffffffff,5,0,0) at netbsd:VOP_FSYNC+0x49
vinvalbuf(e18dba1c,1,ffffffff,d0466240,0) at netbsd:vinvalbuf+0x18e
vclean(e18dba1c,1877330,1,c03e25b7,0) at netbsd:vclean+0x92
vgonel(e18dba1c,d0466240,108,c07fa200,5c5) at netbsd:vgonel+0x38
getcleanvnode(0,c07665be,214,d04d66c4,c42bf000) at netbsd:getcleanvnode+0xf6
getnewvnode(15,c42bf000,c35bab00,d6a1eb4c,c07f8974) at netbsd:getnewvnode+0xb0
vfs_allocate_syncvnode(c42bf000,c0766b03,276,d0466240,265) at netbsd:vfs_allocate_syncvnode+0x32
dounmount(c42bf000,0,d0466240,c42bf000,0) at netbsd:dounmount+0x3a5
sys_unmount(d0466240,d6a1ec48,d6a1ec68,804b008,804b000) at netbsd:sys_unmount+0x126
syscall_plain() at netbsd:syscall_plain+0x16a
--- syscall (number 22) ---
db{1}> ps/l
 PID         LID S     FLAGS       STRUCT LWP *            UAREA * WAIT
>18146     >   1 7 0x20000004         0xd0466240         0xd6a1ece0
 8159          1 3      0x84         0xd37820e0         0xd2582ce0 kqread
 27727         1 3      0x84         0xd3459600         0xd37f2ce0 pause
 10671         1 3      0x84         0xd3c1de00         0xd3cc7ce0 poll
 1082          1 3     0x284         0xd0550740         0xd0f2ece0 nfsiod
 740           1 3     0x284         0xd05508e0         0xd0daece0 nfsiod
 1081          1 3     0x284         0xd0550a80         0xd0c8fce0 nfsiod
 692           1 3     0x284         0xd0550dc0         0xd053ece0 nfsiod
 798           1 3      0x80         0xd04663e0         0xd052bce0 ttyin 
 733           1 3      0x80         0xd0466580         0xd0528ce0 ttyin 
 801           1 3      0x80         0xd0466720         0xd0525ce0 ttyin 
 573           1 3      0x80         0xd03b53c0         0xd0445ce0 ttyin 
 623           1 3      0x80         0xcf338040         0xd005cce0 ttyin 
 794           1 3      0x84         0xcf3381e0         0xd0058ce0 nanoslp
 759           1 3      0x84         0xd04668c0         0xd0522ce0 nanoslp
 747           1 3      0x84         0xd0466a60         0xd04e6ce0 poll
 766           1 3      0x84         0xd0466c00         0xd0448ce0 kqread
 741           1 3      0x84         0xd0466da0         0xd04e9ce0 kqread
 763           1 3      0x84         0xd03b5220         0xd040cce0 kqread
 539           1 3      0x84         0xd03b5a40         0xd00a2ce0 select  
 488           1 3      0x84         0xd03b58a0         0xd0406ce0 pause
 432           1 3      0x84         0xd03b5be0         0xd03fbce0 nfsd
 434           1 3      0x84         0xd03b5d80         0xd03f8ce0 nfsd
 435           1 3      0x84         0xd00cd060         0xd03f5ce0 nfsd
 422           1 3      0x84         0xd00cd200         0xd014fce0 nfsd
 376           1 3      0x84         0xd00cd3a0         0xd03f2ce0 poll
 416           1 3      0x84         0xd00cd540         0xd00afce0 select
 359           1 3      0x84         0xd00cd6e0         0xd00acce0 select
 301           1 3      0x84         0xd00cd880         0xd00a9ce0 poll
 246           1 2       0x4         0xd00cdd60         0xd005fce0
 63            1 3     0x204         0xd00cda20         0xd00a6ce0 physiod
 23            1 3     0x204         0xcf338380         0xd0054ce0 aiodoned
 22            1 3     0x204         0xcf338520         0xd0051ce0 syncer
 21            1 3     0x204         0xcf3386c0         0xd004ece0 pgdaemon
 20            1 3     0x204         0xcf338860         0xd004bce0 raidiow
 19            1 3     0x204         0xcf338a00         0xd0048ce0 rfwcond
 18            1 3     0x204         0xcf338ba0         0xd0045ce0 raidiow
 17            1 3     0x204         0xcf338d40         0xd0042ce0 rfwcond
 16            1 3     0x204         0xcf331020         0xd003ace0 sccomp
 15            1 3     0x204         0xcf3311c0         0xd0033ce0 crypto_wait
 14            1 3     0x204         0xcf331360         0xd0030ce0 cardslotev
 13            1 3     0x204         0xcf331500         0xd002dce0 atath
 12            1 3     0x204         0xcf3316a0         0xd002ace0 atath
 11            1 3     0x204         0xcf331840         0xd0027ce0 atath
 10            1 3     0x204         0xcf3319e0         0xd0024ce0 atath
 9             1 3     0x204         0xcf331b80         0xd0021ce0 usbevt
 8             1 3     0x204         0xcf331d20         0xd001ece0 usbevt
 7             1 3     0x204         0xcf321000         0xd001bce0 usbtsk
 6             1 3     0x204         0xcf3211a0         0xd0018ce0 usbtsk
 5             1 3     0x204         0xcf321340         0xd0015ce0 usbevt
 4             1 3     0x204         0xcf3214e0         0xd0012ce0 iicintr
 3             1 3     0x204         0xcf321680         0xd000fce0 apmev
 2             1 3     0x204         0xcf321820         0xd000cce0 smtaskq
 1             1 3      0x84         0xcf3219c0         0xd0009ce0 wait
 0             3 1 0x80000205         0xcf321b60         0xcf355ce0
               2 7 0xa0000205         0xcf321d00         0xcf29bce0
               1 3     0x204         0xc086b860         0xc093cce0 schedule

>How-To-Repeat:
	This is on an SMP system with an Athlon 64 X2 CPU:

cpu0: AMD Dual-Core Opteron or Athlon 64 X2 (686-class), 2411.10 MHz, id 0x20f32
cpu0: "AMD Athlon(tm) 64 X2 Dual Core Processor 4600+"
cpu0: AMD Power Management features: f<TTP,VID,FID,TS>
cpu0: AMD Cool`n'Quiet Technology 2400 MHz
cpu0: available frequencies (Mhz): 1000 2400
cpu1 at mainbus0 apid 1: (application processor)
cpu1: AMD Dual-Core Opteron or Athlon 64 X2 (686-class), 2411.01 MHz, id 0x20f32
cpu1: "AMD Athlon(tm) 64 X2 Dual Core Processor 4600+"
cpu1: AMD Power Management features: f<TTP,VID,FID,TS>

	Here is a small set of possibly relevant kernel options:

options DIAGNOSTIC
options LOCKDEBUG
options DEBUG
options MULTIPROCESSOR
options MPDEBUG
options MPVERBOSE

no pseudo-device fss
no pseudo-device veriexec

	Execute the following. The panic has happened only once since I found
	the reason why the second CPU wouldn't work:

	sync
	umount -a

	Panic

	I may have a crash dump of this panic. A dump was created after I tried
	the sync command a second time, because the first sync didn't work:

db{1}> sync
syncing disks... 
simple_lock: locking against myself
lock: 0xc07fa1e4, currently at: /src/sys/kern/vfs_syscalls.c:692
on CPU 1
last locked: /src/sys/kern/vfs_syscalls.c:630
last unlocked: /src/sys/kern/kern_lock.c:628
db_command_table(0,c366e000,c366e048,d6a1ece0,d6a1ec88) at netbsd:__qdivrem+0x26930
Stopped in pid 18146.1 (umount) at      netbsd:cpu_Debugger+0x4:        popl    %ebp
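
	For readers unfamiliar with the two lock diagnostics in this report
	("... with held simple_lock" and "locking against myself"), here is a
	minimal userland sketch in C of the kind of bookkeeping that could
	produce them. It is not the NetBSD kernel code: struct dbg_lock,
	dbg_lock_at(), dbg_unlock() and dbg_assert_no_locks() are hypothetical
	names, and the model is single-threaded with the CPU number passed in
	by hand.

```c
#include <assert.h>
#include <stdio.h>

#define NCPU 2

/* Hypothetical debug lock; NOT the kernel's struct simplelock. */
struct dbg_lock {
	int held;              /* 1 while the lock is taken */
	int holder_cpu;        /* simulated CPU that took it, -1 if free */
	const char *lock_file; /* where it was last locked */
	int lock_line;
};

#define DBG_LOCK_INIT { 0, -1, NULL, 0 }

/* Number of debug locks currently held by each simulated CPU. */
static int held_count[NCPU];

/*
 * Take the lock on behalf of `cpu`.
 * Returns 0 on success, -1 if `cpu` already holds it ("locking
 * against myself"), -2 if another CPU holds it (a real spin lock
 * would spin here instead of returning).
 */
static int
dbg_lock_at(struct dbg_lock *l, int cpu, const char *file, int line)
{
	if (l->held && l->holder_cpu == cpu) {
		fprintf(stderr, "lock %p: locking against myself at %s:%d "
		    "(last locked: %s:%d)\n", (void *)l, file, line,
		    l->lock_file, l->lock_line);
		return -1;
	}
	if (l->held)
		return -2;
	l->held = 1;
	l->holder_cpu = cpu;
	l->lock_file = file;
	l->lock_line = line;
	held_count[cpu]++;
	return 0;
}

/* Release a lock that `cpu` currently holds. */
static void
dbg_unlock(struct dbg_lock *l, int cpu)
{
	assert(l->held && l->holder_cpu == cpu);
	l->held = 0;
	l->holder_cpu = -1;
	held_count[cpu]--;
}

/*
 * Check made before an operation that may sleep: returns 0 if `cpu`
 * holds no debug locks, otherwise reports the caller and returns -1.
 */
static int
dbg_assert_no_locks(int cpu, const char *where)
{
	if (held_count[cpu] != 0) {
		fprintf(stderr, "%s with held lock on CPU %d\n", where, cpu);
		return -1;
	}
	return 0;
}
```

	In this model, once CPU 1 has taken a lock and not released it,
	dbg_assert_no_locks(1, "_fstrans_start") fails, and a second
	dbg_lock_at() on the same lock from CPU 1 reports locking against
	itself. That would match the two messages above if the lock taken at
	vfs_syscalls.c:630 was still held in both cases, which is my
	assumption and not something I have verified in the source.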

>Fix: