NetBSD-Bugs archive


kern/40389: page busy vs. glock deadlock



>Number:         40389
>Category:       kern
>Synopsis:       page busy vs. glock deadlock
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jan 13 17:25:00 +0000 2009
>Originator:     Antti Kantee
>Release:        
>Organization:
>Environment:
>Description:
Simon reported a problem where he is writing to a file (the log of a
build -j4) while repeatedly running tail -5000 on it.  He says that this
causes a hang sooner or later.

Upon examination, it seems that writing to the log goes through
ufs_balloc_range(), which marks the new pages it wants to extend the
file into as busy (PG_BUSY) and then takes the genfs node lock (glock)
to do the actual block allocation.  Meanwhile, tail comes in through
mmap + genfs_getpages() and, while holding the genfs node lock, calls
uvn_findpages(), which sleeps waiting for the busy page.
==> deadlock between PG_BUSY and glock
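
The interleaving above can be modeled as a small wait-for graph.  The
following is only a user-space sketch, not kernel code: the names
(struct model, acquire, deadlocked, replay_report) are hypothetical, and
the two resources are reduced to single-owner tokens, which ignores the
reader/writer semantics of the real genfs node lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: two actors, two single-owner resources. */
enum actor { NOBODY, WRITER, READER };
enum resource { NONE, PAGE_BUSY, GLOCK };

struct model {
    enum actor owner[3];     /* owner[r]: who currently holds resource r */
    enum resource wants[3];  /* wants[a]: resource actor a is blocked on */
};

/* Try to take resource r for actor a; record a wait if it is taken. */
static bool acquire(struct model *m, enum actor a, enum resource r)
{
    if (m->owner[r] == NOBODY || m->owner[r] == a) {
        m->owner[r] = a;
        m->wants[a] = NONE;
        return true;
    }
    m->wants[a] = r;
    return false;
}

/* Follow the wait-for chain from actor a; a cycle back to a is a deadlock. */
static bool deadlocked(const struct model *m, enum actor a)
{
    enum actor cur = a;
    for (int hops = 0; hops < 3; hops++) {
        enum resource r = m->wants[cur];
        if (r == NONE)
            return false;
        cur = m->owner[r];
        if (cur == a)
            return true;
    }
    return false;
}

/* Replay the order of events described in the report. */
static bool replay_report(void)
{
    struct model m = {0};

    /* write path: ufs_balloc_range() busies the page first */
    bool w_page = acquire(&m, WRITER, PAGE_BUSY);
    /* fault path: genfs_getpages() runs with the node lock held */
    bool r_lock = acquire(&m, READER, GLOCK);
    /* writer now wants the node lock: blocks (seen as "tstile") */
    bool w_lock = acquire(&m, WRITER, GLOCK);
    /* reader now wants the busy page: blocks (seen as "uvn_fp2") */
    bool r_page = acquire(&m, READER, PAGE_BUSY);

    return w_page && r_lock && !w_lock && !r_page &&
        deadlocked(&m, WRITER) && deadlocked(&m, READER);
}
```

In this model replay_report() returns true: each actor holds the
resource the other one is waiting for, which matches the two backtraces
below.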

This can happen because the file is already large enough to contain
partially valid data in the "new" page that ufs_balloc_range() is trying
to extend the file into, i.e. the old file size is not a multiple of the
page size.
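
The size condition can be checked with simple arithmetic.  The helpers
below are a hypothetical illustration, not code from the tree, and the
4096-byte page size used in the example is an assumption; the real value
is machine-dependent.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the page that contains the first byte past the old EOF. */
static uint64_t first_new_page(uint64_t old_size, uint64_t page_size)
{
    return old_size / page_size;
}

/*
 * Number of bytes of old, valid file data in that page.  Non-zero means
 * the page being busied for the extension is one that a reader may
 * also fault on to get at existing data.
 */
static uint64_t old_bytes_in_first_new_page(uint64_t old_size,
    uint64_t page_size)
{
    return old_size % page_size;
}
```

For example, with an assumed page size of 4096 and an old file size of
10000, the first "new" page is page 2 (bytes 8192-12287) and it already
holds 1808 bytes of valid data.  With an old size of exactly 8192, the
same page holds no old data.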

from simon:
I can reproduce a case where nbmake has its output redirected to a
file, and gets blocked in "tstile" with this backtrace (10 finger
cut'n'paste):

        sleepq_block
        turnstile_block
        rw_vector_enter
        genfs_node_wrlock
        ufs_balloc_range
        ffs_write
        VOP_WRITE
        vn_write
        dofilewrite
        sys_write
        syscall

and tail is blocked in "uvn_fp2" with this backtrace:

        sleepq_block
        mtsleep
        uvn_findpage
        uvn_findpages
        genfs_getpages
        VOP_GETPAGES
        uvn_get
        uvm_fault_internal
        trap

>How-To-Repeat:
build -j4 and constantly do tail -5000 in a loop
(and maybe be Simon)
>Fix:


