Subject: port-mips/16154: any user can hang the machine by masking SIGSEGV and faulting
To: gnats-bugs@gnats.netbsd.org
From: manu@netbsd.org
List: netbsd-bugs
Date: 04/01/2002 05:29:41
>Number:         16154
>Category:       port-mips
>Synopsis:       any user can hang the machine by masking SIGSEGV and faulting
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-mips-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Apr 01 05:30:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     Emmanuel Dreyfus
>Release:        NetBSD-current
>Organization:
The NetBSD Project
>Environment:
NetBSD plume 1.5ZC NetBSD 1.5ZC (IRIX3) #69: Mon Apr  1 13:28:17 CEST 2002     manu@plume:/cvs/src/sys/arch/sgimips/compile/IRIX3 sgimips
>Description:
When a program masks SIGSEGV (the test case below sets it to SIG_IGN)
and then takes a page fault, NetBSD/mips hangs.
It is possible to drop into ddb and send a kill -9 to the offending
process; this restores the machine to a fully functional state.
>How-To-Repeat:
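#include <stdio.h>
#include <signal.h>

int main (void) {
        /* 0xc is not mapped in the process, so any access faults */
        char *p = (char *)0xc;

        /* Ignore SIGSEGV so that the fault cannot kill the process */
        signal(SIGSEGV, SIG_IGN);

        printf("let's go\n");
        *p = *p + 1;    /* page fault: the machine hangs here */
        printf("still alive?\n");
        return 0;
}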
>Fix:
I have not yet fully tracked down the problem. However, here is the
information I have gathered:

The normal behavior would be to loop on the fault: on the *p access, we
get a page fault. There is no valid mapping at the requested address,
hence we attempt to send a SIGSEGV. The signal is masked, so it is not
delivered; we return to userland and restart the offending instruction.
We fault again, and we loop here forever.

This should not hang the machine since on return to userland, we can
schedule another process to run. The problem on mips ports is that 
the offending process is *always* re-scheduled to run.
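
For testing on other ports without losing the console, the fault can be
run in a child process with a watchdog in the parent. This is only a
minimal sketch, assuming a POSIX system: the helper name
run_fault_child and the 100 ms polling interval are hypothetical and not
part of the test case above. The final SIGKILL mirrors the kill -9 sent
from ddb.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the original report.
 * Forks a child that ignores SIGSEGV and touches an unmapped address,
 * then waits up to timeout_sec seconds before sending SIGKILL.
 * Returns the signal that terminated the child, 0 if it exited
 * normally, or -1 on error.
 */
int run_fault_child(int timeout_sec)
{
        int status;
        pid_t pid = fork();

        if (pid == -1)
                return -1;

        if (pid == 0) {
                /* volatile so the compiler keeps the faulting access */
                volatile char *p = (volatile char *)0xc;

                signal(SIGSEGV, SIG_IGN);
                *p = *p + 1;
                _exit(0);       /* reached only if 0xc is mapped */
        }

        /* Poll every 100 ms so a looping child cannot block the parent */
        for (int i = 0; i < timeout_sec * 10; i++) {
                struct timespec ts = { 0, 100000000L };
                pid_t r = waitpid(pid, &status, WNOHANG);

                if (r == pid)
                        return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
                if (r == -1)
                        return -1;
                nanosleep(&ts, NULL);
        }

        /* Child is still faulting: same recovery as kill -9 from ddb */
        kill(pid, SIGKILL);
        if (waitpid(pid, &status, 0) == -1)
                return -1;
        return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a kernel where the faulting process loops but other processes still
get scheduled, the parent reaches the kill() and the call returns
SIGKILL. Some kernels terminate the child with SIGSEGV despite SIG_IGN,
in which case SIGSEGV is returned. On a kernel affected by this PR, the
parent is presumably never scheduled again, so the call would not
return.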

More information: on the page fault, mips3_UserGenException is invoked.
From there, we have approximately this code path (determined by bloating
my kernel with printf's):
mips3_UserGenException
  trap
    uvm_fault
      uvmfault_lookup
    trapsignal
      psignal1
  ast
    preempt
      mi_switch
        ...

mi_switch always selects the offending process to run again, thus hanging
the machine. 
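
The difference between the expected and the observed behavior can be
illustrated with a toy model. This is not NetBSD scheduler code: the
names simulate, ROUND_ROBIN, ALWAYS_SAME and NPROC are hypothetical, and
the model only counts which process gets each time slice after a
preempt. Process 0 stands for the faulting process.

```c
#define NPROC 3

/* Toy policies (hypothetical): how the next process is chosen. */
enum policy { ROUND_ROBIN, ALWAYS_SAME };

/*
 * Simulate `slices` preemptions starting with process 0 (the faulting
 * one) and count how many time slices each process receives.
 */
void simulate(enum policy pol, int slices, int runs[NPROC])
{
        int cur = 0;

        for (int i = 0; i < NPROC; i++)
                runs[i] = 0;

        for (int i = 0; i < slices; i++) {
                runs[cur]++;
                if (pol == ROUND_ROBIN)
                        cur = (cur + 1) % NPROC;  /* requeue at the tail */
                /* ALWAYS_SAME: cur is selected again, others starve */
        }
}
```

With ROUND_ROBIN, the faulting process only consumes its share of the
slices and the machine stays usable. ALWAYS_SAME corresponds to what is
observed here: the other processes never run.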
>Release-Note:
>Audit-Trail:
>Unformatted: