NetBSD-Bugs archive


kern/45128: tmpfs/tstile lockups



>Number:         45128
>Category:       kern
>Synopsis:       tmpfs/tstile lockups
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 09 09:40:00 +0000 2011
>Originator:     Frank Kardel
>Release:        NetBSD 5.99.54 current 20110702190000
>Organization:
        
>Environment:
System: NetBSD pip.kardel.name 5.99.54 NetBSD 5.99.54 (PIPGEN) #0: Sat Jul 2 21:02:48 MEST 2011 kardel%pip.kardel.name@localhost:/fs/raid1a/src/NetBSD/cur/src/obj.amd64/sys/arch/amd64/compile/PIPGEN amd64
Architecture: x86_64
Machine: amd64
>Description:
        During bulk builds (pkgsrc/mk/bulk) that use tmpfs for the build
        directory with MAKE_JOBS=9, certain packages frequently deadlock,
        leaving their processes stuck in tstile. Kernels built with the
        LOCKDEBUG, DIAGNOSTIC and DEBUG options do not catch anything.

        Involved processes:

    0 10489  9870 32217 100  20   5428   560 tstile  DN+  ttyp0  0:00.02 mv gsm_decode.o src/gsm_decode.o
    0 10903 26669 32217 100  20   6576   716 tstile  DN+  ttyp0  0:00.00 /tmp/pkgsrc/comms/asterisk/work/.gcc/bin/gcc -O2 -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Wformat-security -pipe -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -g3 -Iinclude -I../include -D_REENTRANT -D_GNU_SOURCE -pthread -fomit-frame-pointer -fPIC -c -DNeedFunctionPrototypes=1 -funroll-loops -fPIC -DSASR -DNDEBUG -DWAV49 -I./inc src/table.c -L/tmp/pkgsrc/comms/asterisk/work/.buildlink/lib
    0 12916 26405 32217 100  20  14928  7020 tstile  DN+  ttyp0  0:00.04 /usr/libexec/cc1 -quiet -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Iinclude -I../include -I./inc -dD -D_REENTRANT -D_PTHREADS -D_REENTRANT -D_GNU_SOURCE -DNeedFunctionPrototypes=1 -DSASR -DNDEBUG -DWAV49 src/gsm_option.c -quiet -dumpbase gsm_option.c -mtune=nocona -auxbase gsm_option -g3 -O2 -Wformat-security -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -fomit-frame-pointer -fPIC -funroll-loops -fPIC -o -
    0 21005 21954 32217 100  20  13268  1536 tstile  DN+  ttyp0  0:00.00 /usr/libexec/cc1 -quiet -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Iinclude -I../include -I. -dD -D_REENTRANT -D_PTHREADS -D_REENTRANT -D_GNU_SOURCE median.c -quiet -dumpbase median.c -mtune=nocona -auxbase-strip median.o -g3 -O2 -Wformat-security -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wall -Wno-comment -Wno-error -fomit-frame-pointer -fPIC -fPIC -o -
    0 22505 29062 32217 100  20  14924  5064 tstile  DN+  ttyp0  0:00.04 /usr/libexec/cc1 -quiet -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Iinclude -I../include -I./inc -dD -D_REENTRANT -D_PTHREADS -D_REENTRANT -D_GNU_SOURCE -DNeedFunctionPrototypes=1 -DSASR -DNDEBUG -DWAV49 src/gsm_print.c -quiet -dumpbase gsm_print.c -mtune=nocona -auxbase gsm_print -g3 -O2 -Wformat-security -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -fomit-frame-pointer -fPIC -funroll-loops -fPIC -o -
    0 22878 24879 32217 100  20  21288 13292 tstile  DN+  ttyp0  0:00.16 /usr/libexec/cc1 -quiet -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Iinclude -I../include -I./inc -dD -D_REENTRANT -D_PTHREADS -D_REENTRANT -D_GNU_SOURCE -DNeedFunctionPrototypes=1 -DSASR -DNDEBUG -DWAV49 src/gsm_encode.c -quiet -dumpbase gsm_encode.c -mtune=nocona -auxbase gsm_encode -g3 -O2 -Wformat-security -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -fomit-frame-pointer -fPIC -funroll-loops -fPIC -o -
    0 24472 24104 32217 100  20  20232  4324 tstile  DN+  ttyp0  0:00.02 as -Qy -o short_term.o
    0 24737 24760 32217 100  20  13268  1536 tstile  DN+  ttyp0  0:00.00 /usr/libexec/cc1 -quiet -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Iinclude -I../include -I. -dD -D_REENTRANT -D_PTHREADS -D_REENTRANT -D_GNU_SOURCE mload.c -quiet -dumpbase mload.c -mtune=nocona -auxbase-strip mload.o -g3 -O2 -Wformat-security -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wall -Wno-comment -Wno-error -fomit-frame-pointer -fPIC -fPIC -o -
    0 25617 24104 32217 100  20  14924  4316 tstile  DN+  ttyp0  0:00.02 /usr/libexec/cc1 -quiet -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Iinclude -I../include -I./inc -dD -D_REENTRANT -D_PTHREADS -D_REENTRANT -D_GNU_SOURCE -DNeedFunctionPrototypes=1 -DSASR -DNDEBUG -DWAV49 src/short_term.c -quiet -dumpbase short_term.c -mtune=nocona -auxbase short_term -g3 -O2 -Wformat-security -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -fomit-frame-pointer -fPIC -funroll-loops -fPIC -o -
    0 27066  4534 32217 100  20  21224 12856 tstile  DN+  ttyp0  0:00.14 /usr/libexec/cc1 -quiet -I/tmp/pkgsrc/comms/asterisk/work/.buildlink/include -I/usr/include/krb5 -Iinclude -I../include -I./inc -dD -D_REENTRANT -D_PTHREADS -D_REENTRANT -D_GNU_SOURCE -DNeedFunctionPrototypes=1 -DSASR -DNDEBUG -DWAV49 src/gsm_implode.c -quiet -dumpbase gsm_implode.c -mtune=nocona -auxbase gsm_implode -g3 -O2 -Wformat-security -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -fomit-frame-pointer -fPIC -funroll-loops -fPIC -o -

        Stack traces of these processes, obtained via crash(8):

Crash version 5.99.54, image version 5.99.54.
Output from a running system is unreliable.

trace: pid 10489 lid 1 at 0xffff800076877840
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
tmpfs_rename() at ffffffff806e331d
VOP_RENAME() at ffffffff807a850b
do_sys_rename() at ffffffff807903ae
syscall() at ffffffff806b927c

trace: pid 10903 lid 1 at 0xffff80006bdbe830
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
namei_tryemulroot() at ffffffff80789c88
namei() at ffffffff8078b3b1
sys_access() at ffffffff807916ed
syscall() at ffffffff806b927c

trace: pid 12916 lid 1 at 0xffff8000767ad630
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
cache_lookup() at ffffffff80785e88
tmpfs_lookup() at ffffffff806e2048
VOP_LOOKUP() at ffffffff807a9060
lookup_once() at ffffffff80789573
namei_tryemulroot() at ffffffff80789d2f
namei() at ffffffff8078b3b1
do_sys_stat() at ffffffff80791569
sys___stat50() at ffffffff8079160d
syscall() at ffffffff806b927c

trace: pid 21005 lid 1 at 0xffff800076816520
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
getcwd_common() at ffffffff80787700
vn_isunder() at ffffffff80787a46
lookup_once() at ffffffff8078943c
namei_tryemulroot() at ffffffff80789d2f
namei() at ffffffff8078b3b1
do_sys_stat() at ffffffff80791569
sys___stat50() at ffffffff8079160d
syscall() at ffffffff806b927c

trace: pid 22505 lid 1 at 0xffff800076a3f5b0
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
cache_lookup() at ffffffff80785e88
tmpfs_lookup() at ffffffff806e2048
VOP_LOOKUP() at ffffffff807a9060
lookup_once() at ffffffff80789573
namei_tryemulroot() at ffffffff80789d2f
namei() at ffffffff8078b3b1
vn_open() at ffffffff807981ed
sys_open() at ffffffff80792cc5
syscall() at ffffffff806b927c

trace: pid 22878 lid 1 at 0xffff80006be5c630
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
cache_lookup() at ffffffff80785e88
tmpfs_lookup() at ffffffff806e2048
VOP_LOOKUP() at ffffffff807a9060
lookup_once() at ffffffff80789573
namei_tryemulroot() at ffffffff80789d2f
namei() at ffffffff8078b3b1
do_sys_stat() at ffffffff80791569
sys___stat50() at ffffffff8079160d
syscall() at ffffffff806b927c

trace: pid 24472 lid 1 at 0xffff8000769af780
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
namei_tryemulroot() at ffffffff80789c88
namei() at ffffffff8078b3b1
do_sys_stat() at ffffffff80791569
sys___stat50() at ffffffff8079160d
syscall() at ffffffff806b927c

trace: pid 24737 lid 1 at 0xffff8000765dd520
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
getcwd_common() at ffffffff80787700
vn_isunder() at ffffffff80787a46
lookup_once() at ffffffff8078943c
namei_tryemulroot() at ffffffff80789d2f
namei() at ffffffff8078b3b1
do_sys_stat() at ffffffff80791569
sys___stat50() at ffffffff8079160d
syscall() at ffffffff806b927c

trace: pid 25617 lid 1 at 0xffff800075cf8700
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
namei_tryemulroot() at ffffffff80789c88
namei() at ffffffff8078b3b1
vn_open() at ffffffff807981ed
sys_open() at ffffffff80792cc5
syscall() at ffffffff806b927c

trace: pid 27066 lid 1 at 0xffff8000765e1630
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
cache_lookup() at ffffffff80785e88
tmpfs_lookup() at ffffffff806e2048
VOP_LOOKUP() at ffffffff807a9060
lookup_once() at ffffffff80789573
namei_tryemulroot() at ffffffff80789d2f
namei() at ffffffff8078b3b1
do_sys_stat() at ffffffff80791569
sys___stat50() at ffffffff8079160d
syscall() at ffffffff806b927c
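
        All ten traces block in vn_lock(), so it can help to tally which
        function called it in each trace. The following is only a sketch:
        it assumes the crash output was saved as text in exactly the
        layout shown above, and parse_traces, lock_callers and SAMPLE are
        names made up for illustration (SAMPLE is copied from two of the
        traces above).

```python
import re
from collections import Counter

# Assumed line formats, taken from the crash output quoted in this report.
TRACE_HEADER = re.compile(r"^trace: pid (\d+) lid (\d+)")
FRAME = re.compile(r"^(\w+)\(\) at [0-9a-f]+")


def parse_traces(text):
    """Return {(pid, lid): [function names, innermost frame first]}."""
    traces = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = TRACE_HEADER.match(line)
        if header:
            key = (int(header.group(1)), int(header.group(2)))
            current = traces.setdefault(key, [])
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            current.append(frame.group(1))
    return traces


def lock_callers(traces, lock_fn="vn_lock"):
    """Count the direct callers of lock_fn, one per trace containing it."""
    counts = Counter()
    for frames in traces.values():
        if lock_fn in frames:
            idx = frames.index(lock_fn)
            if idx + 1 < len(frames):
                # Frames are innermost first, so the caller is the next entry.
                counts[frames[idx + 1]] += 1
    return counts


SAMPLE = """\
trace: pid 10489 lid 1 at 0xffff800076877840
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
tmpfs_rename() at ffffffff806e331d
VOP_RENAME() at ffffffff807a850b
do_sys_rename() at ffffffff807903ae
syscall() at ffffffff806b927c

trace: pid 12916 lid 1 at 0xffff8000767ad630
sleepq_block() at ffffffff804bfe36
turnstile_block() at ffffffff804cdfff
rw_enter() at ffffffff804bb184
genfs_lock() at ffffffff802ebb9e
VOP_LOCK() at ffffffff807a81d7
vn_lock() at ffffffff80797513
cache_lookup() at ffffffff80785e88
tmpfs_lookup() at ffffffff806e2048
VOP_LOOKUP() at ffffffff807a9060
lookup_once() at ffffffff80789573
namei_tryemulroot() at ffffffff80789d2f
namei() at ffffffff8078b3b1
do_sys_stat() at ffffffff80791569
sys___stat50() at ffffffff8079160d
syscall() at ffffffff806b927c
"""

if __name__ == "__main__":
    for caller, count in lock_callers(parse_traces(SAMPLE)).most_common():
        print(count, caller)
```

        Counted by hand over the ten traces above, the callers of
        vn_lock() are cache_lookup (4), namei_tryemulroot (3),
        getcwd_common (2) and tmpfs_rename (1).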

>How-To-Repeat:
        Set up a pkgsrc bulk build with MAKE_JOBS=9 and
        WORKOBJDIR=<tmpfs location> on an MP machine, then run the bulk
        build (asterisk is a good candidate; not many packages exhibit
        this behavior).
        Watch for processes deadlocking in tstile.
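
        Watching for the deadlock can be scripted. A short wait in tstile
        is normal lock contention, so the sketch below only reports
        processes that are in tstile in every one of several samples. It
        rests on assumptions not stated in this report: that the process
        listing comes from "ps -axlww", that the wait channel is the
        ninth column as in the listing under Description, and that three
        samples five seconds apart are enough. stuck_in and
        persistently_stuck are made-up names.

```python
import subprocess
import time

WCHAN_FIELD = 8  # zero-based; assumes the column order of the listing above


def stuck_in(ps_output, wchan="tstile"):
    """Return (pid, command) for each line whose wait channel equals wchan."""
    stuck = []
    for line in ps_output.splitlines():
        fields = line.split(None, 12)
        if len(fields) < 12 or not fields[1].isdigit():
            continue  # header line, wrapped continuation, or short line
        if fields[WCHAN_FIELD] == wchan:
            command = fields[12].strip() if len(fields) > 12 else ""
            stuck.append((int(fields[1]), command))
    return stuck


def persistently_stuck(samples, wchan="tstile"):
    """Return the PIDs that sit in wchan in every sample."""
    pid_sets = [{pid for pid, _ in stuck_in(s, wchan)} for s in samples]
    return set.intersection(*pid_sets) if pid_sets else set()


if __name__ == "__main__":
    samples = []
    for _ in range(3):  # assumed sample count
        # Assumed invocation; adjust the flags if the columns differ.
        result = subprocess.run(["ps", "-axlww"], capture_output=True,
                                text=True, check=True)
        samples.append(result.stdout)
        time.sleep(5)  # assumed interval
    for pid in sorted(persistently_stuck(samples)):
        print("pid", pid, "has been in tstile for every sample")
```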
>Fix:
        (I'll try without tmpfs)


