Subject: kern/11294: lockmgr: release of unlocked lock!
To: None <gnats-bugs@gnats.netbsd.org>
From: None <jarle@runit.no>
List: netbsd-bugs
Date: 10/23/2000 04:02:14
>Number:         11294
>Category:       kern
>Synopsis:       lockmgr: release of unlocked lock!
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 23 04:02:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Jarle Greipsland
>Release:        NetBSD-current 2000-10-21
>Organization:
RUNIT as
	
>Environment:
	
System: NetBSD honey.runit.no 1.5H NetBSD 1.5H (HONEY-$Revision: 1.130 $) #5: Sun Oct 22 15:33:58 CEST 2000     jarle@honey.runit.no:/usr/src/sys/arch/alpha/compile/HONEY alpha

>Description:
During heavy load on an LFS partition the system paniced with the messages:
lfs_fastvget: vnode VXLOCKed for ino 2105
panic: lockmgr: release of unlocked lock!
Stopped in bonnie at    cpu_Debugger+0x4:       ret     zero,(ra)
db>

I haven't seen this type of panic before.  Some time prior to the crash I
noticed that the lfs_cleanerd increased significantly in its resident set
size, from approximately 2-3MB to more than 30MB.  I don't know if this is
important or not, but I might as well mention it.

I extracted the following output from the kernel debugger:
db> trace
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0xfc
lockmgr() at lockmgr+0xccc
vfs_unbusy() at vfs_unbusy+0x24
lfs_flush() at lfs_flush+0xd8
lfs_check() at lfs_check+0x198
lfs_read() at lfs_read+0x1c4
vn_read() at vn_read+0x118
dofileread() at dofileread+0xd0
sys_read() at sys_read+0xa0
syscall() at syscall+0x1c8
XentSys() at XentSys+0x50
--- syscall (3, netbsd.sys_read) ---
--- user mode ---
db>           
db> ps        
 PID             PPID       PGRP        UID S   FLAGS          COMMAND    WAIT
 264              210        264      16073 3  0x5086              top  select
 263              262        262      16073 3  0x4006           bonnie  getblk
 262              210        262      16073 3    0x86             bash    wait
 261              260        260      16073 3  0x4006              tar biowait
 260              210        260      16073 3    0x86             bash    wait
>How-To-Repeat:
Run three parallel instances (in separate subdirectories) of:
o repeatedly 'tar xf' and 'rm -rf' of a 1.5GB directory tree
o repeatedly run 'bonnie -s 1024'
on an LFS-based file system.
	
>Fix:
	
unknown to me.
>Release-Note:
>Audit-Trail:
>Unformatted:
 >252              251        251      16073 7  0x4006           bonnie
  251              210        251      16073 3    0x86             bash    wait
  250              249        249      16073 3  0x4006              tar lfs_dir
  249              210        249      16073 3    0x86             bash    wait
  248              245        248      16073 3  0x4186           systat   ttyin
  245              244        245      16073 3  0x4086             bash    wait
  244              241        241      16073 3  0x4184            xterm  select
  241              240        241      16073 3  0x4084              csh   pause
  240              192        192          0 3   0x184             sshd  select
  238              237        237      16073 3  0x4006           bonnie  getblk
  237              210        237      16073 3    0x86             bash    wait
  236              235        235      16073 3  0x4006              tar ufs_has
  235              210        235      16073 3    0x86             bash    wait
  223              222        222          0 2     0x4     lfs_cleanerd
  222                1        222          0 3    0x84     lfs_cleanerd    wait
  216              210        216          0 4  0x5006              csh
  210              209        210      16073 3  0x4086             bash    wait
  209              206        206      16073 3  0x4184            xterm  select
  206              205        206      16073 3  0x4084              csh   pause
  205              192        192          0 3   0x184             sshd  select
  204                1        204          0 3  0x4086            getty   ttyin
  202                1        202          0 3    0x84             cron nanosle
  199                1        199          0 3    0x84            inetd   pause
  192                1        192          0 3    0x84             sshd  select
  169                1        169          0 3    0x84             ntpd   pause
  122                1        122          0 3    0x84        mount_mfs  mfsidl
  111                1        111          0 3    0x84          rpcbind  select
  98                 1         98          0 3    0x84          syslogd  select
  4                  0          0          0 3 0x20204          ioflush  syncer
  3                  0          0          0 3 0x20204           reaper  reaper
  2                  0          0          0 3 0x20204       pagedaemon daemon_
  1                  0          1          0 3  0x4084             init    wait
  0                 -1          0          0 3 0x20204          swapper schedul
 db> 
 db> trace/t 0t263
 trace: pid 263 at 0xfffffe0005d8fb08
 mi_switch() at mi_switch+0x1a8
 ltsleep() at ltsleep+0x2d0
 getblk() at getblk+0xf0
 cluster_read() at cluster_read+0x88
 lfs_read() at lfs_read+0x1f0
 vn_read() at vn_read+0x118
 dofileread() at dofileread+0xd0
 sys_read() at sys_read+0xa0
 syscall() at syscall+0x1c8
 XentSys() at XentSys+0x50
 --- syscall (3, netbsd.sys_read) ---
 --- user mode ---
 db> 
 db> trace/t 0t261
 trace: pid 261 at 0xfffffe0005d87a18
 mi_switch() at mi_switch+0x1a8
 ltsleep() at ltsleep+0x2d0
 biowait() at biowait+0x7c
 bread() at bread+0x10c
 lfs_vget() at lfs_vget+0x150
 ufs_lookup() at ufs_lookup+0xd48
 lookup() at lookup+0x40c
 namei() at namei+0x448
 sys_lutimes() at sys_lutimes+0x48
 syscall() at syscall+0x1c8
 XentSys() at XentSys+0x50
 --- syscall (276, netbsd.sys_lutimes) ---
 --- user mode ---
 db> 
 db> trace/t 0t250
 trace: pid 250 at 0xfffffe0005d77bc8
 mi_switch() at mi_switch+0x1a8
 ltsleep() at ltsleep+0x2d0
 lfs_set_dirop() at lfs_set_dirop+0xfc
 lfs_symlink() at lfs_symlink+0x24
 sys_symlink() at sys_symlink+0x210
 syscall() at syscall+0x1c8
 XentSys() at XentSys+0x50
 --- syscall (57, netbsd.sys_symlink) ---
 --- user mode ---
 db> 
 db> trace/t 0t238
 trace: pid 238 at 0xfffffe0005d5bb08
 mi_switch() at mi_switch+0x1a8
 ltsleep() at ltsleep+0x2d0
 getblk() at getblk+0xf0
 cluster_read() at cluster_read+0x88
 lfs_read() at lfs_read+0x1f0
 vn_read() at vn_read+0x118
 dofileread() at dofileread+0xd0
 sys_read() at sys_read+0xa0
 syscall() at syscall+0x1c8
 XentSys() at XentSys+0x50
 --- syscall (3, netbsd.sys_read) ---
 --- user mode ---
 db> 
 db> trace/t 0t236
 trace: pid 236 at 0xfffffe0005d53a28
 mi_switch() at mi_switch+0x1a8
 ltsleep() at ltsleep+0x2d0
 lockmgr() at lockmgr+0x978
 lfs_vget() at lfs_vget+0xdc
 ufs_lookup() at ufs_lookup+0xd48
 lookup() at lookup+0x40c
 namei() at namei+0x448
 sys_utimes() at sys_utimes+0x4c
 syscall() at syscall+0x1c8
 XentSys() at XentSys+0x50
 --- syscall (138, netbsd.sys_utimes) ---
 --- user mode ---
 db> 
 db> trace/t 0t223
 trace: pid 223 at 0xfffffe0005d47ca8
 mi_switch() at mi_switch+0x1a8
 ltsleep() at ltsleep+0x2d0
 lfs_seglock() at lfs_seglock+0x9c
 sys_lfs_markv() at sys_lfs_markv+0x15c
 syscall() at syscall+0x1c8
 XentSys() at XentSys+0x50
 --- syscall (185, netbsd.sys_lfs_markv) ---
 --- user mode ---
 db> 
 
 The system dmesg output is:
 [ using 275560 bytes of netbsd ELF symbol table ]
 Copyright (c) 1996, 1997, 1998, 1999, 2000
     The NetBSD Foundation, Inc.  All rights reserved.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California.  All rights reserved.
 
 NetBSD 1.5H (HONEY-$Revision: 1.130 $) #5: Sun Oct 22 15:33:58 CEST 2000
     jarle@honey.runit.no:/usr/src/sys/arch/alpha/compile/HONEY
 Digital AlphaPC 164 500 MHz, s/n 
 8192 byte page size, 1 processor.
 total memory = 128 MB
 (2472 KB reserved for PROM, 125 MB used by NetBSD)
 avail memory = 112 MB
 using 816 buffers containing 6528 KB of memory
 mainbus0 (root)
 cpu0 at mainbus0: ID 0 (primary), 21164A-2
 cpu0: Architecture extensions: 1<BWX>
 cia0 at mainbus0: DECchip 2117x Core Logic Chipset (ALCOR/ALCOR2), pass 3
 cia0: extended capabilities: 21<DWEN,BWEN>
 cia0: using BWX for PCI config access
 pci0 at cia0 bus 0
 pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 isp0 at pci0 dev 5 function 0
 isp0: interrupting at eb164 irq 2
 isp0: Ultra Mode Capable
 isp0: Board Revision 1040B, loaded F/W Revision 4.65.0
 scsibus0 at isp0: 16 targets, 8 luns per target
 ncr0 at pci0 dev 7 function 0: ncr 53c860 fast20 scsi
 ncr0: interrupting at eb164 irq 1
 scsibus1 at ncr0: 8 targets, 8 luns per target
 sio0 at pci0 dev 8 function 0: Intel 82378ZB System I/O (SIO) (rev. 0x43)
 tlp0 at pci0 dev 9 function 0: DECchip 21140 Ethernet, pass 1.2
 tlp0: interrupting at eb164 irq 3
 tlp0: DEC DE500-XA, Ethernet address 00:00:f8:30:94:81
 tlp0: 10baseT, 100baseTX, 100baseTX-FDX, 10baseT-FDX
 pciide0 at pci0 dev 11 function 0: CMD Technology PCI0646 (rev. 0x01)
 pciide0: bus-master DMA support present
 pciide0: primary channel wired to compatibility mode
 wd0 at pciide0 channel 0 drive 0: <IBM-DJNA-370910>
 wd0: drive supports 16-sector pio transfers, lba addressing
 wd0: 8693 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 17803440 sectors
 wd0: 32-bit data port
 wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4
 pciide0: primary channel interrupting at isa irq 14
 wd0(pciide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
 pciide0: secondary channel wired to compatibility mode
 pciide0: disabling secondary channel (no drives)
 isa0 at sio0
 com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
 com0: console
 com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
 lpt0 at isa0 port 0x3bc-0x3bf irq 7
 pcppi0 at isa0 port 0x61
 spkr0 at pcppi0
 isabeep0 at pcppi0
 fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
 fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
 mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
 scsibus0: waiting 2 seconds for devices to settle...
 sd0 at scsibus0 target 0 lun 0: <IBM, DDRS-39130W, S97B> SCSI2 0/direct fixed
 sd0: 8715 MB, 8387 cyl, 10 head, 212 sec, 512 bytes/sect x 17850000 sectors
 scsibus1: waiting 2 seconds for devices to settle...
 sd1 at scsibus1 target 0 lun 0: <DEC, RZ29B    (C) DEC, 0016> SCSI2 0/direct fix
 ed
 sd1(ncr0:0:0): 10.0 MB/s (100 ns, offset 8)
 sd1: 4091 MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
 sd2 at scsibus1 target 2 lun 0: <DEC, RZ29B    (C) DEC, 0009> SCSI2 0/direct fix
 ed
 sd2(ncr0:2:0): 10.0 MB/s (100 ns, offset 8)
 sd2: 4091 MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
 sd3 at scsibus1 target 4 lun 0: <DEC, RZ29B    (C) DEC, 0014> SCSI2 0/direct fix
 ed
 sd3(ncr0:4:0): 10.0 MB/s (100 ns, offset 8)
 sd3: 4091 MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
 IPsec: Initialized Security Association Processing.
 root on wd0a dumps on wd0b
 root file system type: ffs
 
 # df -l
 Filesystem       1K-blocks     Used     Avail Capacity  Mounted on
 /dev/wd0a            49711    33055     14170      69%    /
 /dev/wd0f           507750   142081    340281      29%    /var
 /dev/wd0d          1485749   282792   1128669      20%    /usr
 /dev/wd0g          6059635        3   5756650       0%    /local
 kernfs                   1        1         0     100%    /kern
 mfs:122              62383        6     59257       0%    /tmp
 /dev/sd0a          8644911  6186040   2026625      75%    /home
 /home/usr/src      8644911  6186040   2026625      75%    /usr/src
 /home/usr/xsrc     8644911  6186040   2026625      75%    /usr/xsrc
 /home/usr/pkg      8644911  6186040   2026625      75%    /usr/pkg
 /home/usr/pkgsrc   8644911  6186040   2026625      75%    /usr/pkgsrc
 /dev/ccd0a        11340160    15851  10190293       0%    /spare
 /spare/usr/obj    11340160    15851  10190293       0%    /usr/obj
 # cat /etc/ccd.conf
 # ccd           ileave  flags   component devices
 ccd0            113      none    /dev/sd1a /dev/sd2a /dev/sd3a