Subject: Re: kernel stack overflow
To: None <port-sparc@NetBSD.ORG>
From: Chris Torek <torek@BSDI.COM>
List: port-sparc
Date: 02/01/1997 12:02:36
>> A double error halt ("Watchdog reset!" on machines with older ROMs) is
>> what happens when the machine takes a trap while traps are disabled. ...
>> (If the kernel stack is invalid, this is probably what will happen ...

>Since this problem needs to be handled for user window traps where the
>destination address is not mapped in or swapped out, the sparc v8- port
>should be perfectly able to deal with this condition with a small tweak to
>the window trap handlers.

This is not as easy as you might think.

When you get a window overflow and need to save the window to the
stack, you must:

	0. save CPU state
	1. decide WHICH stack (user or kernel <=> user or kernel window)
	2. maybe-probe
	3. save
	4. restore CPU state and return
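
A rough C model of these steps, purely as illustration: the real
handler is SPARC assembly, and every name here (ovf_ctx,
window_overflow, slot_mapped, uarea_slot) is made up for the sketch.
The failed-probe fallback of parking the window in the u-area follows
the description quoted further down, and the kernel path skips the
probe, as described next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW_REGS 16          /* %l0-%l7 and %i0-%i7 */

struct window { uint32_t regs[WINDOW_REGS]; };

enum stack_kind { STACK_USER, STACK_KERNEL };
enum ovf_result { OVF_SAVED_TO_STACK, OVF_SAVED_TO_UAREA };

/* Hypothetical stand-in for the state the trap handler looks at. */
struct ovf_ctx {
    bool window_is_kernel;      /* does the window belong to the kernel? */
    struct window *stack_slot;  /* where that window's %sp points */
    bool slot_mapped;           /* stand-in for the probe result */
    struct window uarea_slot;   /* fallback save area in the u-area */
};

/* Step 1: decide which stack the window goes to. */
static enum stack_kind which_stack(const struct ovf_ctx *c)
{
    return c->window_is_kernel ? STACK_KERNEL : STACK_USER;
}

/* Step 2: the probe; here just a flag, not a real MMU probe. */
static bool probe(const struct ovf_ctx *c)
{
    return c->slot_mapped;
}

enum ovf_result window_overflow(struct ovf_ctx *c, const struct window *w)
{
    assert(c != NULL && w != NULL);

    /* Step 0 (save CPU state) has no counterpart in this model. */
    enum stack_kind kind = which_stack(c);

    /* Step 2 is done for user windows only. */
    if (kind == STACK_USER && !probe(c)) {
        c->uarea_slot = *w;     /* park it; a fault would be raised later */
        return OVF_SAVED_TO_UAREA;
    }

    /* Step 3: save. For the kernel stack this is unchecked. */
    *c->stack_slot = *w;

    /* Step 4 (restore CPU state and return) is likewise not modelled. */
    return OVF_SAVED_TO_STACK;
}
```

Note that with window_is_kernel set, slot_mapped is never consulted:
a bad kernel %sp goes straight to the store in step 3.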

Currently, the system omits step 2 for the kernel stack.  One reason
for this is that the `probe' sequence can (i.e., `does' :-) ) depend
on PTE bits that are not necessarily set the same by the PROM as
by the kernel.  The probe is also expensive in terms of CPU time
(though it could, of course, be done only #ifdef whatever).  So,
to add kernel checks, you would have to start them only when the
VM system has gotten far enough along in the bootstrap process.
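
One way to picture "start them only when the VM system has gotten far
enough along" is a flag that the bootstrap code flips once the kernel
owns the page tables. This is a sketch under that assumption only;
kstack_probe_enabled, vm_bootstrap_done, kernel_sp_ok and the probe
callback are hypothetical names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* False while the PROM's PTE settings may still be in effect. */
static bool kstack_probe_enabled = false;

/* Hypothetical probe: returns true if addr is mapped and writable. */
typedef bool (*probe_fn)(uintptr_t addr);

/* Called once the kernel has set the PTE bits the probe relies on. */
void vm_bootstrap_done(void)
{
    kstack_probe_enabled = true;
}

/* Check a kernel stack pointer, but only once the probe can be trusted. */
bool kernel_sp_ok(uintptr_t sp, probe_fn probe)
{
    assert(probe != NULL);
    if (!kstack_probe_enabled)
        return true;            /* too early: behave as if unchecked */
    return probe(sp);
}

/* Demo stand-in: a probe that reports every address as unmapped. */
static bool probe_never_mapped(uintptr_t addr)
{
    (void)addr;
    return false;
}
```

Before vm_bootstrap_done() runs, kernel_sp_ok() passes everything,
which is the current behaviour; afterwards it reports whatever the
probe says.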

However, this is not quite the right thing anyway:

>I believe the current handler checks whether the location the stack
>pointer points to is mapped in before attempting to save the registers.
>If not, then it saves the window in the u-structure and issues a data
>fault.  All we need to do to handle a kernel stack redzone is if the
>window fault was to a kernel address and the CPU was in kernel mode,
>switch to an emergency backup kernel stack, then panic.

For red-zone checking, you would want to do what I already do in
my kernel: have a per-process software red-zone pointer.  If the
kernel stack pointer becomes too small (kernel stacks grow down),
do the switch and panic.  If you do this *before* the stack pointer
would actually be invalid, you catch most cases.  (The ones you
still miss are those where someone allocates a `really big stack
frame' and slams entirely *past* the `struct user' at the bottom.)
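
The comparison itself is tiny. A sketch in C, with made-up names
(struct kstack, kstack_init, redzone_check) and a made-up layout in
which the red zone sits some slack above the `struct user'; where the
check is actually run from, and the sizes involved, are assumptions
and not taken from any real kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-process kernel stack description. */
struct kstack {
    uintptr_t base;     /* lowest address; `struct user' sits here */
    uintptr_t redzone;  /* software red-zone pointer */
    uintptr_t top;      /* initial stack pointer; stack grows down */
};

enum rz_action { RZ_OK, RZ_SWITCH_AND_PANIC };

/* Place the red zone `slack' bytes above the end of struct user. */
void kstack_init(struct kstack *ks, uintptr_t base, uintptr_t size,
                 uintptr_t user_size, uintptr_t slack)
{
    assert(user_size + slack < size);
    ks->base = base;
    ks->redzone = base + user_size + slack;
    ks->top = base + size;
}

/* Stack pointer below the red zone: switch stacks, then panic. */
enum rz_action redzone_check(const struct kstack *ks, uintptr_t sp)
{
    return sp < ks->redzone ? RZ_SWITCH_AND_PANIC : RZ_OK;
}
```

Because the red zone lies above the struct user, a stack pointer
inside the slack still points at valid memory when the check trips,
which is what makes the switch-and-panic path usable.  It does
nothing for the large-frame case noted above.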

(Sparc V9 is all different.)

Chris