Port-i386 archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

[patches 1/2] ddb recursive panic fix



Hi all,

Here is the first round of patches to address the ddb recursive
panic().  (For details about the recursive panic problem, see
<http://mail-index.netbsd.org/port-i386/2014/01/09/msg003212.html>.)
These patches should apply cleanly to NetBSD-current.

Summary of the patches:

  1. i386: fix a comment (cpu_switch() -> cpu_switchto())
  2. ddb: add missing newline to printf() message
  3. i386: remove vestige from old call to printk()
  4. sleep 10 seconds before rebooting due to panic()
  5. i386: stop ddb backtrace at Xsoftintr()
  6. i386: use strcmp("Xsoftintr") instead of strncmp("Xsoft", 5)
  7. i386: comment about missing stackframe member initialization

Patch #5 is the important one; the rest are cleanups for minor issues
I encountered while troubleshooting the problem.

Below are the patches with the commit messages I used in our Git
repository.  I basically pasted the output of 'git log -p --reverse'
into this email.  Ignore the dates; they're out of order because I
used 'git rebase' extensively to make these commits more presentable.

I'd appreciate any feedback.

Thanks,
Richard

======================================================================

commit 8803240b5c16eb7cc2bd9e852a8650c49069c112
Author: Richard Hansen <rhansen%bbn.com@localhost>
Date:   Thu Oct 3 10:00:18 2013 +0000

    i386: fix a comment (cpu_switch() -> cpu_switchto())

diff --git a/src/sys/arch/i386/i386/locore.S b/src/sys/arch/i386/i386/locore.S
index 4bc496f..fcc9d08 100644
--- a/src/sys/arch/i386/i386/locore.S
+++ b/src/sys/arch/i386/i386/locore.S
@@ -881,8 +881,9 @@ END(lgdt_finish)
  *
  * This is a trampoline function pushed onto the stack of a newly created
  * process in order to do some additional setup.  The trampoline is entered by
- * cpu_switch()ing to the process, so we abuse the callee-saved registers used
- * by cpu_switch() to store the information about the stub to call.
+ * cpu_switchto()ing to the process, so we abuse the callee-saved
+ * registers used by cpu_switchto() to store the information about the
+ * stub to call.
  * NOTE: This function does not have a normal calling sequence!
  */
 NENTRY(lwp_trampoline)

commit cb405a9c41c6da0ca5e5c4d7e2ed63a3e16a311e
Author: Richard Hansen <rhansen%bbn.com@localhost>
Date:   Wed Oct 2 21:27:34 2013 +0000

    ddb: add missing newline to printf() message

diff --git a/src/sys/ddb/db_panic.c b/src/sys/ddb/db_panic.c
index 365a69e..4647e67 100644
--- a/src/sys/ddb/db_panic.c
+++ b/src/sys/ddb/db_panic.c
@@ -57,7 +57,7 @@ void db_panic(void)
                            cpu_index(curcpu()));
                        intrace = 0;
                } else
-                       printf("Faulted in mid-traceback; aborting...");
+                       printf("Faulted in mid-traceback; aborting...\n");
                if (db_onpanic == 2)
                        Debugger();
        }

commit 2dbba0f8626c7a858b718b4000de22b0230af0ba
Author: Richard Hansen <rhansen%bbn.com@localhost>
Date:   Wed Oct 2 22:28:55 2013 +0000

    i386: remove vestige from old call to printk()

    This line was introduced with a call to printk() in CVS revision
    1.22.4.3 and should have been removed when the call to printk() was
    removed in CVS revision 1.22.4.6.

diff --git a/src/sys/arch/i386/i386/spl.S b/src/sys/arch/i386/i386/spl.S
index 2f00d2c..c6f5b9f 100644
--- a/src/sys/arch/i386/i386/spl.S
+++ b/src/sys/arch/i386/i386/spl.S
@@ -226,7 +226,6 @@ IDTVEC(spllower)
        ret
 #if defined(DEBUG)
 .Lspllower_panic:
-       addl $8, %esp
        pushl   $1f
        call    _C_LABEL(panic)
 1:     .asciz  "SPLLOWER: INTERRUPT ENABLED"

commit 45399f7e6330389e7364bcbc3d10969234350170
Author: Richard Hansen <rhansen%bbn.com@localhost>
Date:   Wed Oct 2 21:18:14 2013 +0000

    sleep 10 seconds before rebooting due to panic()

    To give an opportunity to screencap a panic(), or pause a VM to attach
    a debugger.

diff --git a/src/sys/kern/subr_prf.c b/src/sys/kern/subr_prf.c
index 4d2839b..de2ea5b 100644
--- a/src/sys/kern/subr_prf.c
+++ b/src/sys/kern/subr_prf.c
@@ -281,6 +281,19 @@ vpanic(const char *fmt, va_list ap)
 #ifdef DDB
        db_panic();
 #endif
+       printf("rebooting in");
+       for (int i = 10; i >= 0; --i) {
+               printf(" %u", i);
+               /*
+                * sleep 10ms 100 times instead of 1s once to give a
+                * VM hypervisor an opportunity to redraw part of the
+                * screen during each call
+                */
+               for (int j=0; i && j < 100; ++j) {
+                       DELAY(10000);
+               }
+       }
+       printf("\n");
        cpu_reboot(bootopt, NULL);
 }


commit 12784bf2870ca2bf67486ea95fbfa861d511ddba
Author: Richard Hansen <rhansen%bbn.com@localhost>
Date:   Thu Oct 3 07:45:08 2013 +0000

    i386: stop ddb backtrace at Xsoftintr()

    Stop unwinding frames when db_stack_trace_print() encouters
    Xsoftintr().  This avoids a recursive panic() due to an invalid
    pointer dereference when a software interrupt panic()s.

    Here's what happens without this change:

    When db_stack_trace_print() runs during a panic() and db_nextframe()
    encounters the Xsoftintr() frame, db_nextframe() does the following at
    db_machdep.c:292:

      1. checks to see if there's a Xsoftintr() symbol (there is)
      2. checks to see if the frame corresponds to an interrupt (the
         symbol name begins with "Xsoft" so it does)

    If both of the above are true (they are), db_nextframe() at
    db_machdep.c:303 tries to get a pointer to a struct intrframe.
    According to the comment at line 300, the second argument passed to
    Xsoftintr() is a pointer to a struct intrframe.  However, the comment
    and the corresponding code are not correct -- Xsoftintr() doesn't take
    any arguments[1].  Attempting to fetch the second argument only yields
    stack garbage, not a struct intrframe.  This causes db_machdep.c:307
    to dereference a bad pointer, triggering the recursive panic().

    [1] Xsoftintr() is called by Xspllower() which is called by splx()
        a.k.a. spllower().  Neither Xspllower() nor Xsoftintr() set up a
        standard frame when called (they don't do 'pushl %ebp; movl %esp,
        %ebp'), so Xsoftintr()'s %ebp is the same as splx()'s %ebp.  This
        makes splx()'s arguments look like Xsoftintr()'s arguments, and
        splx() does not take any arguments.

    You can reproduce the recursive panic by reverting this change and
    adding a call to panic() inside ipintr().  The backtrace will look
    like the following (the line numbers you see might differ from these
    line numbers -- this backtrace was generated from a slightly modified
    version of the NetBSD 6.1 kernel):

        #0  vpanic (fmt=0xc0ba995b "trap", ap=0xdaa51730) at 
/usr/src/sys/kern/subr_prf.c:211
        #1  0xc0790529 in panic (fmt=0xc0ba995b "trap") at 
/usr/src/sys/kern/subr_prf.c:205
        #2  0xc07decbc in trap (frame=0xdaa517c0) at 
/usr/src/sys/arch/i386/i386/trap.c:396
        #3  0xc010cf48 in ?? () at /usr/src/sys/arch/i386/i386/vector.S:983
        #4  0xc02857f0 in db_get_value (addr=56, size=4, is_signed=false) at 
/usr/src/sys/ddb/db_access.c:72
        #5  0xc028a09a in db_nextframe (nextframe=0xdaa51b40, 
retaddr=0xdaa51b3c, arg0=0xdaa51b38, ip=0xdaa51b34, argp=0xdaa51d88, is_trap=0, 
pr=0xc07901b5 <printf>) at /usr/src/sys/arch/i386/i386/db_machdep.c:308
        #6  0xc028be2b in db_stack_trace_print (addr=<optimized out>, 
have_addr=true, count=65533, modif=0xc0bb44bf "", pr=0xc07901b5 <printf>) at 
/usr/src/sys/arch/x86/x86/db_trace.c:275
        #7  0xc07903cb in vpanic (fmt=0xc0b6ba76 "testing", ap=0xdaa51d4c) at 
/usr/src/sys/kern/subr_prf.c:296
        #8  0xc0790529 in panic (fmt=0xc0b6ba76 "testing") at 
/usr/src/sys/kern/subr_prf.c:205
        #9  0xc04e3d4f in ipintr () at /usr/src/sys/netinet/ip_input.c:369
        #10 0xc054ac0d in softint_execute (s=<optimized out>, si=<optimized 
out>, l=<optimized out>) at /usr/src/sys/kern/kern_softint.c:543
        #11 softint_dispatch (pinned=0xc4085560, s=4) at 
/usr/src/sys/kern/kern_softint.c:825
        #12 0xc0100fdb in ?? () at /usr/src/sys/arch/i386/i386/spl.S:390
        #13 0xc07d2e11 in tcp_usrreq (so=0xc40b0534, req=4, m=0x0, 
nam=0xc317ba00, control=0x0, l=0xc4085560) at 
/usr/src/sys/netinet/tcp_usrreq.c:615
        #14 0xc04bb300 in tcp_usrreq_wrapper (a=0xc40b0534, b=4, c=0x0, 
d=0xc317ba00, e=0x0, f=0xc4085560) at /usr/src/sys/netinet/in_proto.c:164
        #15 0xc0839006 in soconnect (so=0xc40b0534, nam=0xc317ba00, 
l=0xc4085560) at /usr/src/sys/kern/uipc_socket.c:821
        #16 0xc083c4ce in do_sys_connect (l=0xc4085560, fd=4, nam=0xc317ba00) 
at /usr/src/sys/kern/uipc_syscalls.c:371
        #17 0xc083dbeb in sys_connect (l=0xc4085560, uap=0xdbc27d00, 
retval=0xdbc27d28) at /usr/src/sys/kern/uipc_syscalls.c:350
        #18 0xc07b1b4a in sy_call (rval=0xdbc27d28, uap=0xdbc27d00, 
l=0xc4085560, sy=0xc0c2f018) at /usr/src/sys/sys/syscallvar.h:61
        #19 syscall (frame=0xdbc27d48) at 
/usr/src/sys/arch/x86/x86/syscall.c:179
        #20 0xc010056d in ?? () at /usr/src/sys/arch/i386/i386/locore.S:1160
        Backtrace stopped: previous frame inner to this frame (corrupt stack?)

diff --git a/src/sys/arch/i386/i386/db_machdep.c 
b/src/sys/arch/i386/i386/db_machdep.c
index 1296d47..5b469be 100644
--- a/src/sys/arch/i386/i386/db_machdep.c
+++ b/src/sys/arch/i386/i386/db_machdep.c
@@ -147,10 +147,12 @@ db_frame_info(long *frame, db_addr_t callpc, const char 
**namep, db_expr_t *offp
                            !strncmp(name, "Xstray", 6) ||
                            !strncmp(name, "Xhold", 5) ||
                            !strncmp(name, "Xrecurse", 8) ||
-                           !strcmp(name, "Xdoreti") ||
-                           !strncmp(name, "Xsoft", 5)) {
+                           !strcmp(name, "Xdoreti")) {
                                *is_trap = INTERRUPT;
                                narg = 0;
+                       } else if (!strncmp(name, "Xsoft", 5)) {
+                               *is_trap = SOFTINTR;
+                               narg = 0;
                        } else if (!strncmp(name, "Xtss_", 5)) {
                                *is_trap = INTERRUPT_TSS;
                                narg = 0;
@@ -172,10 +174,12 @@ db_frame_info(long *frame, db_addr_t callpc, const char 
**namep, db_expr_t *offp
                            !strncmp(name, "_Xstray", 7) ||
                            !strncmp(name, "_Xhold", 6) ||
                            !strncmp(name, "_Xrecurse", 9) ||
-                           !strcmp(name, "_Xdoreti") ||
-                           !strncmp(name, "_Xsoft", 6)) {
+                           !strcmp(name, "_Xdoreti")) {
                                *is_trap = INTERRUPT;
                                narg = 0;
+                       } else if (!strncmp(name, "_Xsoft", 6)) {
+                               *is_trap = SOFTINTR;
+                               narg = 0;
                        } else if (!strncmp(name, "_Xtss_", 6)) {
                                *is_trap = INTERRUPT_TSS;
                                narg = 0;
@@ -252,6 +256,7 @@ db_nextframe(
            case TRAP:
            case SYSCALL:
            case INTERRUPT:
+           case SOFTINTR:
            default:
                /* The only argument to trap() or syscall() is the trapframe. */
                switch (is_trap) {
@@ -272,6 +277,11 @@ db_nextframe(
                         */
                        db_read_bytes((db_addr_t)argp, sizeof(tf), (char *)&tf);
                        break;
+               case SOFTINTR:
+                       (*pr)("--- softint ---\n");
+                       tf.tf_eip = 0;
+                       tf.tf_ebp = 0;
+                       break;
                }
                *ip = (db_addr_t)tf.tf_eip;
                fp = (struct i386_frame *)tf.tf_ebp;
diff --git a/src/sys/arch/x86/include/db_machdep.h 
b/src/sys/arch/x86/include/db_machdep.h
index a81947b..d302fc1 100644
--- a/src/sys/arch/x86/include/db_machdep.h
+++ b/src/sys/arch/x86/include/db_machdep.h
@@ -11,6 +11,7 @@
 #define INTERRUPT      3
 #define INTERRUPT_TSS  4
 #define TRAP_TSS       5
+#define SOFTINTR       6

 #define MAXNARG                16


commit d6cebe5db568231341e4be9acc7af7357d5bac6d
Author: Richard Hansen <rhansen%bbn.com@localhost>
Date:   Thu Oct 3 07:45:08 2013 +0000

    i386: use strcmp("Xsoftintr") instead of strncmp("Xsoft", 5)

    Change the strncmp(name, "Xsoft", 5) call to strcmp(name, "Xsoftintr")
    for the following reasons:
      * readability:  There are no other symbols beginning with "Xsoft",
        so the strncmp() is a bit confusing.
      * safety:  It is less surprising to assume that a new function
        beginning with "Xsoft" is an ordinary function.  There is already
        an expectation that a new software interrupt handler function will
        require special care, so requiring a change to the condition
        statement when adding a new softint handler does not seem overly
        burdensome.

diff --git a/src/sys/arch/i386/i386/db_machdep.c 
b/src/sys/arch/i386/i386/db_machdep.c
index 5b469be..230ad32 100644
--- a/src/sys/arch/i386/i386/db_machdep.c
+++ b/src/sys/arch/i386/i386/db_machdep.c
@@ -150,7 +150,7 @@ db_frame_info(long *frame, db_addr_t callpc, const char 
**namep, db_expr_t *offp
                            !strcmp(name, "Xdoreti")) {
                                *is_trap = INTERRUPT;
                                narg = 0;
-                       } else if (!strncmp(name, "Xsoft", 5)) {
+                       } else if (!strcmp(name, "Xsoftintr")) {
                                *is_trap = SOFTINTR;
                                narg = 0;
                        } else if (!strncmp(name, "Xtss_", 5)) {
@@ -177,7 +177,7 @@ db_frame_info(long *frame, db_addr_t callpc, const char 
**namep, db_expr_t *offp
                            !strcmp(name, "_Xdoreti")) {
                                *is_trap = INTERRUPT;
                                narg = 0;
-                       } else if (!strncmp(name, "_Xsoft", 6)) {
+                       } else if (!strcmp(name, "_Xsoftintr")) {
                                *is_trap = SOFTINTR;
                                narg = 0;
                        } else if (!strncmp(name, "_Xtss_", 6)) {

commit 7e6dfe8371f9021ceb147010e6549f4aa5de3e6a
Author: Richard Hansen <rhansen%bbn.com@localhost>
Date:   Mon Dec 2 19:21:05 2013 -0500

    i386: comment about missing stackframe member initialization

    I haven't studied the code, but I'm concerned that not initializing
    sf->sf_edi could potentially leak a few bytes of information to a new
    userspace process.

diff --git a/src/sys/arch/x86/x86/vm_machdep.c 
b/src/sys/arch/x86/x86/vm_machdep.c
index 3b8dbf4..90e29de 100644
--- a/src/sys/arch/x86/x86/vm_machdep.c
+++ b/src/sys/arch/x86/x86/vm_machdep.c
@@ -228,6 +228,11 @@ cpu_lwp_fork(struct lwp *l1, struct lwp *l2, void *stack, 
size_t stacksize,
        pcb2->pcb_rsp = (uint64_t)sf;
        pcb2->pcb_rbp = (uint64_t)l2;
 #else
+       /*
+        * XXX Is there a reason sf->sf_edi isn't initialized here?
+        * Could this leak potentially sensitive information to new
+        * userspace processes?
+        */
        sf->sf_esi = (int)func;
        sf->sf_ebx = (int)arg;
        sf->sf_eip = (int)lwp_trampoline;



Home | Main Index | Thread Index | Old Index