NetBSD-Bugs archive


kern/54174: Unkillable process got stuck in uao_put() when running textproc/the_silver_searcher on tmpfs



>Number:         54174
>Category:       kern
>Synopsis:       Unkillable process got stuck in uao_put() when running textproc/the_silver_searcher on tmpfs
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue May 07 18:55:00 +0000 2019
>Originator:     Leonardo Taccari
>Release:        NetBSD 8.99.38
>Organization:
Università Politecnica delle Marche
>Environment:
System: NetBSD abacus 8.99.38 NetBSD 8.99.38 (GENERIC) #1: Mon May 6 14:13:15 CEST 2019 leot@abacus:/usr/obj/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

        When searching source code with ag (part of
        textproc/the_silver_searcher) on a tmpfs, the ag process gets
        stuck in uao_put() and becomes unkillable.

        The only reliable way I have found to reproduce this is to
        use the libvirt-5.2.0 distfile (available at
        <https://libvirt.org/sources/libvirt-5.2.0.tar.xz>) and, if
        ag does not get stuck the first time, to rerun it several times
        (`repeat 100 ag std' seems to be enough to trigger it).
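
        `repeat' is a csh/tcsh (and zsh) builtin, so the command above
        does not work as-is in a plain POSIX sh.  A minimal sketch of an
        equivalent loop follows; the function name repeat_cmd is
        hypothetical and not part of any tool mentioned in this report.

```shell
# POSIX sh stand-in for the csh/zsh builtin `repeat N command ...`.
# repeat_cmd is a hypothetical helper name used only for illustration.
repeat_cmd() {
    count=$1          # number of iterations
    shift             # the remaining arguments are the command to run
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@"          # run the command with its arguments unchanged
        i=$((i + 1))
    done
}

# Usage inside the extracted libvirt-5.2.0 tree (assumes ag is installed):
#   repeat_cmd 100 ag std
```

        Like the builtin, the loop runs the command unconditionally on
        every iteration, so once ag hangs the loop itself stops making
        progress.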


>How-To-Repeat:

        After fetching the libvirt-5.2.0.tar.xz distfile, e.g. into /tmp:

        % fgrep /tmp /etc/fstab
        tmpfs           /tmp    tmpfs   rw,-m=1777,-s=16G
        % cd /tmp
        % bsdtar xJf libvirt-5.2.0.tar.xz
        % cd libvirt-5.2.0/
        % repeat 100 ag std
        [...]
        tests/qemuxml2xmloutdata/hostdev-vfio-zpci-autogenerate.xml
        18:    <hostdev mode='subsystem' type='pci' managed='no'>
        26:    </hostdev>

        tests/qemuxml2xmloutdata/hostdev-vfio-zpci.xml
        18:    <hostdev mode='subsystem' type='pci' managed='no'>
        26:    </hostdev>

        tests/qemustatusxml2xmldata/migration-out-nbd-out.xml
        260:  <chardevStdioLogd/>

        tests/qemustatusxml2xmldata/migration-out-nbd-in.xml
        260:  <chardevStdioLogd/>
        [...ag is now stuck, pressing ^T...]
        [ 473.1116407] load: 2.55  cmd: ag 803 [tstile tstile uao_put tstile tstile tstile tstile parked] 0.78u 0.67s 0% 4152k


        Corresponding crash(8) `ps` output:

        # crash
        Crash version 8.99.38, image version 8.99.38.
        Output from a running system is unreliable.
        crash> ps | grep 803
        803      8 3   1         0   ffffe27ec5eea260                 ag tstile
        803      7 3   5         0   ffffe27ec542c180                 ag tstile
        803      6 3   6         0   ffffe27ec55d95c0                 ag uao_put
        803      5 3   7         0   ffffe27ec4efcb80                 ag tstile
        803      4 3   0         0   ffffe27ec1598640                 ag tstile
        803      3 3   1         0   ffffe27ec5cfc080                 ag tstile
        803      2 3   0         0   ffffe27ec5122040                 ag tstile
        803      1 3   1        80   ffffe27ed9fac940                 ag parked

        ...and the corresponding backtrace of each LWP:

        # echo ps | crash | awk '$1 == 803 { print "bt/a " $6 }' | crash
        Crash version 8.99.38, image version 8.99.38.
        Output from a running system is unreliable.
        trace: pid 803 lid 8 at 0xffffbe01cc148c20
        sleepq_block() at sleepq_block+0xb8
        turnstile_block() at turnstile_block+0x4f8
        rw_vector_enter() at rw_vector_enter+0x213
        uvm_fault_internal() at uvm_fault_internal+0x118
        trap() at trap+0x358
        --- trap (number 6) ---
        409248:
        trace: pid 803 lid 7 at 0xffffbe01cc25ae20
        sleepq_block() at sleepq_block+0xb8
        turnstile_block() at turnstile_block+0x4f8
        rw_vector_enter() at rw_vector_enter+0x213
        vm_map_lock() at vm_map_lock+0x66
        sys_munmap() at sys_munmap+0x58
        syscall() at syscall+0x188
        --- syscall (number 73) ---
        71707bf9a26a:
        trace: pid 803 lid 6 at 0xffffbe01cc0e4b00
        sleepq_block() at sleepq_block+0xb8
        mtsleep() at mtsleep+0x149
        uao_put() at uao_put+0x268
        VOP_PUTPAGES() at VOP_PUTPAGES+0x53
        uvm_fault_internal() at uvm_fault_internal+0x1506
        trap() at trap+0x358
        --- trap (number 6) ---
        408e94:
        trace: pid 803 lid 5 at 0xffffbe01cc31cc20
        sleepq_block() at sleepq_block+0xb8
        turnstile_block() at turnstile_block+0x4f8
        rw_vector_enter() at rw_vector_enter+0x213
        uvm_fault_internal() at uvm_fault_internal+0x1838
        trap() at trap+0x358
        --- trap (number 6) ---
        408e94:
        trace: pid 803 lid 4 at 0xffffbe01cc29bc20
        sleepq_block() at sleepq_block+0xb8
        turnstile_block() at turnstile_block+0x4f8
        rw_vector_enter() at rw_vector_enter+0x213
        uvm_fault_internal() at uvm_fault_internal+0x1838
        trap() at trap+0x358
        --- trap (number 6) ---
        409248:
        trace: pid 803 lid 3 at 0xffffbe01cc1e2e20
        sleepq_block() at sleepq_block+0xb8
        turnstile_block() at turnstile_block+0x4f8
        rw_vector_enter() at rw_vector_enter+0x213
        vm_map_lock() at vm_map_lock+0x66
        sys_munmap() at sys_munmap+0x58
        syscall() at syscall+0x188
        --- syscall (number 73) ---
        71707bf9a26a:
        trace: pid 803 lid 2 at 0xffffbe01cc1c0c40
        sleepq_block() at sleepq_block+0xb8
        turnstile_block() at turnstile_block+0x4f8
        rw_vector_enter() at rw_vector_enter+0x213
        vm_map_lock() at vm_map_lock+0x66
        uvm_map_prepare() at uvm_map_prepare+0xb8
        uvm_map() at uvm_map+0x53
        uvm_mmap.part.0() at uvm_mmap.part.0+0x247
        sys_mmap() at sys_mmap+0x28b
        syscall() at syscall+0x188
        --- syscall (number 197) ---
        71707bf9c97a:
        trace: pid 803 lid 1 at 0xffffbe01cbccaed0
        sleepq_block() at sleepq_block+0xb8
        lwp_park() at lwp_park+0x117
        sys____lwp_park60() at sys____lwp_park60+0x5a
        syscall() at syscall+0x188
        --- syscall (number 478) ---
        71707beb3d7a:
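
        The pipeline used above to collect the traces works by turning
        each crash(8) `ps` line for the pid into a `bt/a <address>'
        command, which is then fed to a second crash instance.  The
        sketch below isolates that awk step and runs it on two of the
        `ps` lines shown above; the function name lwp_trace_cmds is
        hypothetical, and the pid is passed as a parameter rather than
        hard-coded.

```shell
# Turn crash(8) `ps` lines for one pid into "bt/a <address>" commands.
# lwp_trace_cmds is a hypothetical helper name; the awk program is the
# one from the report with the pid parameterized.
lwp_trace_cmds() {
    # Field 1 is the pid, field 6 is the LWP address printed by `ps`.
    awk -v pid="$1" '$1 == pid { print "bt/a " $6 }'
}

# Sample input: two of the `ps` lines from the report.
sample='803      6 3   6         0   ffffe27ec55d95c0                 ag uao_put
803      1 3   1        80   ffffe27ed9fac940                 ag parked'

printf '%s\n' "$sample" | lwp_trace_cmds 803
# prints:
#   bt/a ffffe27ec55d95c0
#   bt/a ffffe27ed9fac940
```

        On a live system the same helper would sit between the two
        crash invocations, e.g.
        `echo ps | crash | lwp_trace_cmds 803 | crash'.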

>Fix:

        No idea, sorry!


