Subject: Re: memory (re-)allocation woes
To: None <tech-kern@NetBSD.org>
From: theo borm <theo4490@borm.org>
List: tech-kern
Date: 11/30/2004 11:17:05
Matthew Orgass wrote:

>On 2004-11-28 theo4490@borm.org wrote:
>
>>I have run my little test on a variety of hardware (i.e. different
>>diskless cluster nodes with /identical/ hardware and four other
>>machines with different hardware, alas all i386), and have managed to
>>crash all diskless nodes (somehow swap over NFS seems to be quite
>>sensitive to long delays in the pagedaemon), and two of the four other
>>machines. One of these I used some time ago to build all of pkgsrc
>>*). To me it seems as if the problem is not hardware related, and that
>>it is only a matter of finding the correct parameters to reboot them too
>>:-(
>
>  There does seem to be a problem.  I just remembered kern/9308.  I was
>not able to reproduce that problem on my current (fast) i386 machine, but
>I did get a hang on my Clio 1050 (hpcmips) running -current (possibly a
>month old).  To get this I ran "grep foo /dev/zero & && grep foo /dev/zero
>&" then kept pushing up and return until it hung (maybe 10-15 times).  I
>also don't use swap now, and I think I did at the time of the PR.  When
>breaking into ddb, I now see a variety of uvm related functions, but
>apparently no progress is made (I waited about five minutes).  It may be
>that NFS swap would increase the chance of the race condition being hit
>(however, I don't have any idea what is really going on).  It is also
>possible that your problem may be NFS related.  I guess either way it
>could potentially cause reboots.  Can you break into ddb several times
>while it is hung to see what functions are using up the CPU?
>
Well, I've been running some tests on my diskless cluster nodes, and have
in the process managed to crash my NFS server (ctrl-alt-esc didn't even
work!). I'll be setting up a few nodes with their own NFS server for testing
purposes, to limit interruptions to the normal workflow and to limit the
number of variables I have to deal with. This may take some time, so please
bear with me.

As a minor aside: the GENERIC kernel shipped with 1.6.2 /was/ compiled
from the sources and (native?) tools that shipped with it, wasn't it? I'm
having some trouble compiling a functionally equivalent GENERIC kernel...

with kind regards,
Theo.

>Matthew Orgass
>darkstar@city-net.com
>  
>