Subject: re: sleep forever bug - not fixed :-(
To: john heasley <heas@shrubbery.net>
From: matthew green <mrg@eterna.com.au>
List: port-sparc64
Date: 11/07/2004 11:39:21
   Sat, Nov 06, 2004 at 05:12:58PM +1100, matthew green:
   >    if the processor were stuck in a trap handler, say data miss (fill, spill,
   >    ...), that'd be at a higher priority than an interrupt like hard/soft clock,
   >    so it'd never be serviced.  right?
   > 
   > 
   > yeah. but then all/most other processing would stop as well?
   > i'm still quite fine with most of my interactive shells and
   > NFS is working fine still....
   
   hrm, I misunderstood.  so, it is only the "selected process" that "sleeps
   forever".
   
   I thought that some folks had reported that their only recovery path
   was power-cycling, meaning that the machine had become totally
   unresponsive.  so, I must be muddling 2 problems.
   
   I guess there are 3 mortally wounding problems.  sleeps forever, pmap seg 0
   (PR 24126), and this hanging.


right.  "sleep forever" just means that anything that depends on
softclock() stops working.  that means sleep(), select()/poll(),
TCP timeouts (so connections are OK as long as nothing gets lost),
and apparently something in the NFS server eventually stopped
working too :-)  it also means timeouts don't happen in the
kernel, so e.g. your NAT tables
will grow and grow and grow...
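
to make the dependency concrete, here is a rough user-space sketch in C.
it is not the real kernel code: sim_hardclock(), sim_softclock(),
struct timer_state and sleeper_wakes() are all made-up names for
illustration.  the only assumption is the general shape described above:
the hard clock tick keeps advancing time and requests the soft clock, and
the soft clock is what actually fires expired timeouts.  if the soft
clock never gets to run, the tick counter keeps moving but nothing that
is waiting on a timeout is ever woken.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLOUTS 8

/* One pending timeout: call fn(arg) once ticks reaches `expires`. */
struct callout {
    int active;
    unsigned long expires;
    void (*fn)(void *);
    void *arg;
};

/* Hypothetical stand-in for the kernel's timer state. */
struct timer_state {
    unsigned long ticks;      /* advanced on every simulated hard clock tick */
    int softclock_pending;    /* soft clock requested but not yet run */
    struct callout table[MAX_CALLOUTS];
};

/* Register a timeout `delay` ticks from now. */
static void callout_set(struct timer_state *ts, unsigned long delay,
                        void (*fn)(void *), void *arg)
{
    for (size_t i = 0; i < MAX_CALLOUTS; i++) {
        if (!ts->table[i].active) {
            ts->table[i].active = 1;
            ts->table[i].expires = ts->ticks + delay;
            ts->table[i].fn = fn;
            ts->table[i].arg = arg;
            return;
        }
    }
    assert(!"callout table full");
}

/* Hard clock: only bumps the tick counter and requests the soft clock. */
static void sim_hardclock(struct timer_state *ts)
{
    ts->ticks++;
    ts->softclock_pending = 1;
}

/* Soft clock: the only place where expired timeouts are actually fired. */
static void sim_softclock(struct timer_state *ts)
{
    ts->softclock_pending = 0;
    for (size_t i = 0; i < MAX_CALLOUTS; i++) {
        struct callout *c = &ts->table[i];
        if (c->active && c->expires <= ts->ticks) {
            c->active = 0;
            c->fn(c->arg);
        }
    }
}

/* Timeout handler: marks the simulated sleeping process as woken. */
static void mark_woken(void *arg)
{
    *(int *)arg = 1;
}

/*
 * Simulate a process sleeping with a 5-tick timeout for `nticks` hard
 * clock ticks.  If softclock_delivered is 0, the soft clock request is
 * never serviced.  Returns 1 if the sleeper was woken, 0 otherwise.
 */
int sleeper_wakes(int softclock_delivered, unsigned long nticks)
{
    struct timer_state ts = {0};
    int woken = 0;

    callout_set(&ts, 5, mark_woken, &woken);
    for (unsigned long i = 0; i < nticks; i++) {
        sim_hardclock(&ts);
        if (softclock_delivered && ts.softclock_pending)
            sim_softclock(&ts);
    }
    return woken;
}
```

calling sleeper_wakes(1, 10) models the healthy case, where the sleeper
is woken once its timeout expires.  sleeper_wakes(0, 1000) models the
"sleep forever" case: the ticks keep counting, but the timeout is never
fired no matter how long you wait.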