Subject: Re: Some more 'D' evidence.
To: Anders Magnusson <ragge@ludd.luth.se>
From: David Gilbert <dgilbert@jaywon.pci.on.ca>
List: port-sparc
Date: 04/08/1996 19:18:43
>>>>> "Anders" == Anders Magnusson <ragge@ludd.luth.se> writes:

>>  On Sun, 7 Apr 1996 10:45:34 -0400 (EDT) David Gilbert
>> <dgilbert@jaywon.pci.on.ca> wrote:
>> 
>> > I have found that if I get a process stuck in 'D', then I can
>> > produce more such processes by attempting to read the directory in
>> > which that process is attempting to read files.  I have been able to
>> > duplicate this where I had compiles running which I could definitely
>> > track down.

Anders> We have had a couple of very similar problems. On a rather
Anders> heavily loaded mail server (> 8000 messages/day) here running
Anders> NetBSD/sparc, sendmail sometimes hangs waiting on lockf and is
Anders> unable to send some mail. (This is the outgoing mail queue.) It
Anders> is on an NFS-exported mail partition, and the problem appeared
Anders> when NFSv3 was imported into the tree. Our solution was to
Anders> compile a sendmail that uses dot-locking instead of lockf.

	I have seen the 'D' hanging problem since before 1.1.  There
was a time in 1.0a when I didn't see it, but there isn't a definite
line that I can pin down, nor can I guarantee that it wasn't always a
problem, since it's such a difficult bug to trigger.  It seems to be
most easily triggered under high load.

	In my case, that load is an overnight build of the world combined
with the normal load associated with a reasonably large news feed.

Dave.

-- 
----------------------------------------------------------------------------
|David Gilbert, PCI, Richmond Hill, Ontario.  | Two things can only be     |
|Mail:      dgilbert@jaywon.pci.on.ca         |  equal if and only if they |
|http://www.pci.on.ca/~dgilbert               |   are precisely opposite.  |
---------------------------------------------------------GLO----------------