NetBSD-Bugs archive


kern/39420: stopped processes can hold locks



>Number:         39420
>Category:       kern
>Synopsis:       stopped processes can hold locks
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Aug 27 20:45:01 +0000 2008
>Originator:     David A. Holland <dholland%eecs.harvard.edu@localhost>
>Release:        NetBSD 4.99.54 (20080229)
>Organization:
>Environment:
System: NetBSD tanaqui 4.99.54 NetBSD 4.99.54 (TANAQUI) #21: Fri Feb 29 12:31:31 EST 2008 dholland@tanaqui:/usr/src/sys/arch/i386/compile/TANAQUI i386
Architecture: i386
Machine: i386
>Description:

Today we had a network hiccup that caused an NFS server to become
unreachable. After the network cleared, my browser remained stuck, and
I found that both it and a leftover sync process were stuck in tstile:

32170  3994   602 2109  85  0 62176 96908 tstile   Dl   ttyp2  188:29.95 /usr/p
  163 18096 26111  513  85  0    20   476 tstile   D    ttyE2    0:00.01 sync 

Digging around, I discovered a second leftover sync process, this one
stopped by job control. (I think what happened was that I'd
absentmindedly typed sync, it hung, and I hit the usual ^Z^Z^C^C kind
of thing to try to get rid of it; then when the network came back, the
^Z got processed and it stopped.) Resuming this sync process cleared
the other two processes out of tstile.

This means that those processes must have been waiting on some lock or
other that was released when the stopped sync was continued and ran to
completion.

This in turn means that the stopped process was holding a lock.

This should not be allowed; stopped processes tend to remain stopped
for long periods of time, and it doesn't do for the system to grind to
a halt waiting for them.
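
For illustration only, here is a rough userland analogy of the symptom.
It is not the kernel path involved, which was not identified. It assumes
a POSIX system with flock(2); an advisory file lock stands in for
whatever kernel lock the stopped sync held, and the function name
stopped_holder_demo is hypothetical. One process takes the lock and
stops; a second one cannot get the lock until the first is continued.

```python
import fcntl
import os
import signal
import tempfile


def stopped_holder_demo():
    """Return (blocked_while_stopped, acquired_after_continue)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)

    holder = os.fork()
    if holder == 0:
        # Child: take the lock, then stop as if ^Z had just been processed.
        held = open(path, "w")
        fcntl.flock(held, fcntl.LOCK_EX)
        os.kill(os.getpid(), signal.SIGSTOP)
        # Reached only after SIGCONT; exiting closes the file and drops the lock.
        os._exit(0)

    # Parent: wait until the child is reported stopped (it already holds the lock).
    os.waitpid(holder, os.WUNTRACED)

    waiter = open(path, "w")
    try:
        # Non-blocking attempt; a blocking one would hang for as long as
        # the holder stays stopped.
        fcntl.flock(waiter, fcntl.LOCK_EX | fcntl.LOCK_NB)
        blocked = False
    except OSError:
        blocked = True

    # Continue the holder; it runs to completion and releases the lock.
    os.kill(holder, signal.SIGCONT)
    os.waitpid(holder, 0)

    try:
        fcntl.flock(waiter, fcntl.LOCK_EX | fcntl.LOCK_NB)
        acquired = True
    except OSError:
        acquired = False

    waiter.close()
    os.unlink(path)
    return blocked, acquired
```

Calling stopped_holder_demo() is expected to report that the lock was
unavailable while the holder was stopped and available once it had been
continued, which mirrors how resuming the stopped sync cleared the other
two processes out of tstile.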

>How-To-Repeat:

Not likely.

>Fix:

I've hit this before (it used to be easy to trigger it with vnode
locks) and dug around some, and I think the way processes stop may
need to be completely rewritten...


