NetBSD-Bugs archive


kern/41417: WAPBL: hang on tstile



>Number:         41417
>Category:       kern
>Synopsis:       WAPBL: hang on tstile
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue May 12 21:10:00 +0000 2009
>Originator:     Manuel Bouyer
>Release:        NetBSD 5.0_STABLE
>Organization:
>Environment:
System: NetBSD ftp.lip6.fr 5.0_STABLE NetBSD 5.0_STABLE (FTP) #3: Tue May 12 21:09:27 CEST 2009 bouyer@roll:/dsk/l1/misc/bouyer/tmp/amd64/obj/dsk/l1/misc/bouyer/netbsd-5/src/sys/arch/amd64/compile/FTP amd64
Architecture: x86_64
Machine: amd64
>Description:
        This system is an ftp/http server with 12GB of RAM and a 3.5TB
        FFSv2 filesystem for the data it serves. I tried mounting this
        3.5TB FFSv2 filesystem with -o log, but then running
rsync -avH --delete --delete-excluded --delete-after --delay-updates --max-delete=5000 --force --stats --partial ...
        to sync local data from the master hangs in tstile state. Any process
        accessing this working directory (e.g. ls -l) also hangs on tstile.
        Mounting the same filesystem without -o log doesn't cause this.

        Here's what I was able to collect from ddb after typing
        'reboot' in a shell (the reboot also hung):

db{0}> ps /l
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
1862     1 3   1         4   ffff80007f91c020             reboot tstile
2768     1 3   1   9020004   ffff8000787727c0                 ls tstile
1567     1 3   0   9020004   ffff8000799abbe0              rsync tstile
1252     1 3   1   9020004   ffff80007dfca420              rsync tstile
1        1 3   0   8020084   ffff800056cdf400               init wait
0       50 3   0       204   ffff80007d043020            physiod physiod
              49 3   0       204   ffff80007c7513e0        vmem_rehash vmem_rehash
              48 3   1       204   ffff80007c7517c0           aiodoned aiodoned
              47 3   1       204   ffff80007c751ba0            ioflush tstile
              46 3   0       204   ffff80007c749040           pgdaemon pgdaemon
              45 3   0       204   ffff800056cde7c0            raidio2 raidiow
              44 3   0       204   ffff800056cde3e0              raid2 rfwcond
              43 3   0       204   ffff800056cdd420          cryptoret crypto_wait
              42 3   1       204   ffff800056cde000               ipmi ipmi_poll
              41 3   1       204   ffff80007c749420              ipmi0 ipmi0
              40 3   1       204   ffff800056cdeba0             sysmon smtaskq
              39 3   1       204   ffff800056cdfbc0         usbtask-dr usbtsk
              38 3   1       204   ffff800056cdf7e0         usbtask-hc usbtsk
              37 3   1       204   ffff800056cdf020               usb0 usbevt
              36 3   1       204   ffff80007c749800          atapibus1 sccomp
              34 3   1       204   ffff80007c749be0          atapibus0 sccomp
              32 3   1       204   ffff800056cdd040              unpgc unpgc
              23 3   1       204   ffff800056cdd800            atabus3 atath
              22 3   1       204   ffff800056cddbe0            atabus2 atath
              21 3   0       204   ffff800056cda020            atabus1 atath
              20 3   1       204   ffff800056cda400            atabus0 atath
              19 3   0       204   ffff800056cda7e0           scsibus1 sccomp
              18 3   0       204   ffff800056cdabc0           scsibus0 sccomp
              17 3   1       204   ffff800056cd8000            xcall/1 xcall
              16 1   1       204   ffff800056cd83e0          softser/1
              15 1   1       204   ffff800056cd87c0          softclk/1
              14 1   1       204   ffff800056cd8ba0          softbio/1
              13 1   1       204   ffff800056ccf040          softnet/1
           >  12 7   1       205   ffff800056ccf420             idle/1
              11 3   0       204   ffff800056ccf800           pmfevent pmfevent
              10 3   0       204   ffff800056ccfbe0           nfssilly nfssilly
               9 3   1       204   ffff800056ccc020            cachegc cachegc
               8 3   1       204   ffff800056ccc400              vrele vrele
               7 3   0       204   ffff800056ccc7e0            xcall/0 xcall
               6 1   0       204   ffff800056cccbc0          softser/0
               5 1   0       204   ffff800056cca000          softclk/0
               4 1   0       204   ffff800056cca3e0          softbio/0
               3 1   0       204   ffff800056cca7c0          softnet/0
           >   2 7   0       205   ffff800056ccaba0             idle/0
               1 3   0       204   ffffffff806c8d00            swapper schedule
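
With a long 'ps /l' listing it is easy to miss an LWP that is stuck on
tstile. The following is a minimal sketch, not part of the original
report, that picks those LWPs out of a saved copy of the listing and
prints the matching ddb trace commands. The function name tstile_lwps
and the SAMPLE excerpt are illustrative assumptions; the parsing only
assumes that a sleeping LWP's line ends with the struct lwp address,
the name and the wait channel, as in the output above.

```python
def tstile_lwps(ps_output):
    """Return (name, lwp_address) for every LWP whose wait channel is 'tstile'."""
    found = []
    for line in ps_output.splitlines():
        # Drop the '>' marker ddb prints in front of the LWP running on a CPU.
        fields = line.replace(">", " ").split()
        # A sleeping LWP's line ends with: <struct lwp address> <name> <wait channel>
        if len(fields) >= 3 and fields[-1] == "tstile":
            found.append((fields[-2], fields[-3]))
    return found


# Three lines copied from the listing above (hypothetical saved excerpt).
SAMPLE = """\
1862     1 3   1         4   ffff80007f91c020             reboot tstile
1        1 3   0   8020084   ffff800056cdf400               init wait
              47 3   0       204   ffff80007c751ba0            ioflush tstile
"""

for name, addr in tstile_lwps(SAMPLE):
    print(f"{name}: tr/a {addr}")
```

Matching on the last three fields rather than on fixed columns keeps the
sketch working for both process lines (which start with a PID) and the
kernel LWP lines of pid 0 (which do not).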


db{0}> tr/a ffff80007f91c020
trace: pid 1862 lid 1 at 0xffff800078e83830
sleepq_block() at netbsd:sleepq_block+0xec
turnstile_block() at netbsd:turnstile_block+0x2bb
rw_vector_enter() at netbsd:rw_vector_enter+0x1f9
vlockmgr() at netbsd:vlockmgr+0xf6
VOP_LOCK() at netbsd:VOP_LOCK+0x64
vclean() at netbsd:vclean+0x8c
vflush() at netbsd:vflush+0x1b4
ffs_flushfiles() at netbsd:ffs_flushfiles+0x57
ffs_unmount() at netbsd:ffs_unmount+0x57
VFS_UNMOUNT() at netbsd:VFS_UNMOUNT+0x2e
dounmount() at netbsd:dounmount+0x14b
vfs_unmountall() at netbsd:vfs_unmountall+0x55
cpu_reboot() at netbsd:cpu_reboot+0xc2
sys_reboot() at netbsd:sys_reboot+0x5f
syscall() at netbsd:syscall+0xb6
db{0}> tr/a ffff8000787727c0
trace: pid 2768 lid 1 at 0xffff8000788435e0
sleepq_block() at netbsd:sleepq_block+0xec
turnstile_block() at netbsd:turnstile_block+0x2bb
rw_vector_enter() at netbsd:rw_vector_enter+0x1f9
vlockmgr() at netbsd:vlockmgr+0xf6
VOP_LOCK() at netbsd:VOP_LOCK+0x64
vn_lock() at netbsd:vn_lock+0xd9
vget() at netbsd:vget+0x132
ufs_ihashget() at netbsd:ufs_ihashget+0x91
ffs_vget() at netbsd:ffs_vget+0xc1
ufs_lookup() at netbsd:ufs_lookup+0x7cc
VOP_LOOKUP() at netbsd:VOP_LOOKUP+0x80
lookup() at netbsd:lookup+0x34b
namei() at netbsd:namei+0x1a4
do_sys_stat() at netbsd:do_sys_stat+0x44
sys___lstat30() at netbsd:sys___lstat30+0x2a
syscall() at netbsd:syscall+0xb6
db{0}> tr/a ffff8000799abbe0
trace: pid 1567 lid 1 at 0xffff800079992710
sleepq_block() at netbsd:sleepq_block+0xec
turnstile_block() at netbsd:turnstile_block+0x2bb
rw_vector_enter() at netbsd:rw_vector_enter+0x1f9
vlockmgr() at netbsd:vlockmgr+0xf6
VOP_LOCK() at netbsd:VOP_LOCK+0x64
vn_lock() at netbsd:vn_lock+0xd9
wapbl_ufs_rename() at netbsd:wapbl_ufs_rename+0x5ab
ufs_rename() at netbsd:ufs_rename+0x39
VOP_RENAME() at netbsd:VOP_RENAME+0x75
do_sys_rename() at netbsd:do_sys_rename+0x57d
syscall() at netbsd:syscall+0xb6
db{0}> tr/a ffff80007dfca420
trace: pid 1252 lid 1 at 0xffff80007dfe3650
sleepq_block() at netbsd:sleepq_block+0xec
turnstile_block() at netbsd:turnstile_block+0x2bb
rw_vector_enter() at netbsd:rw_vector_enter+0x1f9
vlockmgr() at netbsd:vlockmgr+0xf6
VOP_LOCK() at netbsd:VOP_LOCK+0x64
vn_lock() at netbsd:vn_lock+0xd9
cache_lookup() at netbsd:cache_lookup+0x201
ufs_lookup() at netbsd:ufs_lookup+0xc6
VOP_LOOKUP() at netbsd:VOP_LOOKUP+0x80
lookup() at netbsd:lookup+0x34b
namei() at netbsd:namei+0x1a4
do_sys_stat() at netbsd:do_sys_stat+0x44
sys___lstat30() at netbsd:sys___lstat30+0x2a
syscall() at netbsd:syscall+0xb6
db{0}> tr/a ffff80007c751ba0
trace: pid 0 lid 47 at 0xffff80007cfbbb30
sleepq_block() at netbsd:sleepq_block+0xec
turnstile_block() at netbsd:turnstile_block+0x2bb
mutex_vector_enter() at netbsd:mutex_vector_enter+0x339
sched_sync() at netbsd:sched_sync+0x27

>How-To-Repeat:
        See above. This may be related to running rsync with the special
        options above. It's also possible that this is related to rsync doing
        partial file updates (instead of whole file transfers).
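
        For anyone trying to narrow this down without the full data set,
        here is a minimal sketch of the access pattern that
        --delay-updates produces: files are first written into a
        ".~tmp~" holding directory and only renamed into place at the
        end, while other LWPs call lstat() on the destination names (as
        the ls and rsync traces above do). This is an assumption about
        what matters, not a confirmed reproducer; it has not been shown
        to trigger the hang. The names delayed_update and stat_loop, the
        file count and the payload size are all hypothetical.

```python
import os
import tempfile
import threading

HOLDING = ".~tmp~"  # holding directory name used by rsync --delay-updates


def stat_loop(root, names, stop):
    """Keep calling lstat() on the destination names until told to stop."""
    while not stop.is_set():
        for name in names:
            try:
                os.lstat(os.path.join(root, name))
            except FileNotFoundError:
                pass  # not renamed into place yet


def delayed_update(root, names, payload=b"x" * 4096):
    """Write files into the holding directory, then rename them all into root."""
    holding = os.path.join(root, HOLDING)
    os.makedirs(holding, exist_ok=True)
    for name in names:
        with open(os.path.join(holding, name), "wb") as f:
            f.write(payload)

    # Run lstat() calls concurrently with the rename phase.
    stop = threading.Event()
    statter = threading.Thread(target=stat_loop, args=(root, names, stop))
    statter.start()
    try:
        for name in names:
            os.rename(os.path.join(holding, name), os.path.join(root, name))
    finally:
        stop.set()
        statter.join()

    os.rmdir(holding)
    return sorted(os.listdir(root))


if __name__ == "__main__":
    # Uses a scratch directory; pass dir=... to TemporaryDirectory to place it
    # on the filesystem mounted with -o log.
    with tempfile.TemporaryDirectory() as root:
        entries = delayed_update(root, [f"f{i:05d}" for i in range(1000)])
        print(len(entries), "entries in", root)
```

        The sketch only covers the rename and lstat paths seen in the
        traces; it does not imitate the --delete-after pass or partial
        file updates.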
>Fix:
        workaround: don't mount -o log ...


