Subject: Re: easy ways to crash your NetBSD system
To: Brett Lymn <blymn@awadi.com.au>
From: Jukka Marin <jmarin@teeri.jmp.fi>
List: current-users
Date: 04/08/1996 11:26:51
> >Wouldn't it be a good idea to kill the most recently started, largest
> >user processes in case the system runs out of swap?  This would protect
> >the system processes like init and inetd.
> >
> 
> Ummm only if you can arrange for the bad sector on the disk to be
> mapped to that last process run ;-)  Things don't work like that - if
> the swapper managed to be convinced the data hit the disk but when it
> goes to get it back it finds it cannot what do you do?  return null?
> return an error that makes the process die?  OK, so if you do the last
> then what happens?  do you have a mechanism for locking that disk
> block out of the swap pool?  otherwise the block will get used
> somewhere else causing more problems - sort of a roving/random process
> killer.

Uh, are you talking about failing disks or other _hardware_ problems?
I can well understand that with malfunctioning hardware, the system is
likely to crash without even panicking (it doesn't have 16-bit CRCs for every
disk block used for swap, does it?), but how about the situations of
running out of swap etc.?  Surely the system knows when it can no longer
put new pages on disk - it could start killing processes then.  And if we
had a safe margin so that the user processes could never use up _all_ swap,
we'd have a little room for the system processes to live even when swap is
almost full (a bit like our filesystem can become "110% full" only when
root is writing to it).
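
To make the idea concrete, here is a rough sketch in C of the policy I have
in mind.  It is not taken from the NetBSD sources: struct proc_info,
user_may_allocate() and pick_victim() are names I made up for illustration,
and a real kernel would have to get this information from its own process
and swap accounting.  The sketch assumes a fixed reserve of swap pages that
user processes may not touch, and picks the largest user process as the
victim, preferring the most recently started one on a tie.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical process record for illustration; not a NetBSD structure. */
struct proc_info {
    int    pid;
    bool   is_system;   /* init, inetd and friends: never chosen as victim */
    size_t swap_pages;  /* swap pages currently held by the process */
    long   start_time;  /* seconds since boot; larger means started later */
};

/*
 * Assumed reserve policy: a user process may take `want` pages only if
 * at least `reserve_pages` free pages remain afterwards, much like the
 * filesystem space that only root may write into.
 */
bool user_may_allocate(size_t free_pages, size_t reserve_pages, size_t want)
{
    return free_pages >= reserve_pages &&
           free_pages - reserve_pages >= want;
}

/*
 * Pick the user process holding the most swap; on a tie, prefer the one
 * started most recently.  Returns NULL if there is no user process.
 */
const struct proc_info *pick_victim(const struct proc_info *procs, size_t n)
{
    const struct proc_info *victim = NULL;

    for (size_t i = 0; i < n; i++) {
        const struct proc_info *p = &procs[i];

        if (p->is_system)
            continue;   /* protect system processes */
        if (victim == NULL ||
            p->swap_pages > victim->swap_pages ||
            (p->swap_pages == victim->swap_pages &&
             p->start_time > victim->start_time))
            victim = p;
    }
    return victim;
}
```

Whether "largest first" or "newest first" should win is a judgment call; the
sketch ranks by size and uses start time only to break ties.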

> This has become very focussed on just handling bad swap - the start of
> this thread was someone claiming that panic'ing was taking the easy
> way out.

Is bad swap == running out of swap?  Maybe I haven't got a clue.. :-/

> I still insist that there are situations where it is better
> to have the machine go toes up in no uncertain manner than try to cope
> with something that is hopelessly broken only to limp on into more
> damage.  This is what a panic is for.... if in danger or in doubt, run
> in circles, scream and shout :-)

I agree, but maybe some panics could be avoided if the system tried a bit
harder to protect itself?

  -jm