Re: mostly working SMP again

To: matthew green <mrg%eterna.com.au@localhost>
Subject: Re: mostly working SMP again
From: BERTRAND Joel <joel.bertrand%systella.fr@localhost>
Date: Wed, 13 Jan 2010 13:12:37 +0100

matthew green wrote:

    BERTRAND Joel wrote:
    >  matthew green wrote:
    >>  yeah - there seems to be some problem(s) remaining, and hypersparc
    >>  seems to fail much worse.
    >>
    >>  we're still investigating. thanks for testing!
    >>
    >
    >  I'm doing some tests with dual SM71 to see if this bug is Hypersparc
    >  specific.

        It's not Hypersparc specific. With two SM71-1, the kernel hangs and
    drops into the internal debugger:

    CPU1: data fault: pc=0xf000ab3c addr=0x80

this is probably the savefpstate() crash i've been looking into.
can you see what addr 0xf000ab3c is in your kernel?


Here is gdb's output:

(gdb) disassemble 0xf000ab00 0xf000b000
Dump of assembler code from 0xf000ab00 to 0xf000b000:
0xf000ab00 <Lkcopy_done+4>:     retl
0xf000ab04 <Lkcopy_done+8>:     clr  %o0
0xf000ab08 <Lkcopy_done+12>:    stb  %o4, [ %o1 ]
0xf000ab0c <Lkcopy_done+16>:    st  %g1, [ %o5 + 0xc ]
0xf000ab10 <Lkcopy_done+20>:    retl
0xf000ab14 <Lkcopy_done+24>:    clr  %o0
0xf000ab18 <Lkcerr+0>:  retl
0xf000ab1c <Lkcerr+4>:  st  %g1, [ %o5 + 0xc ]
0xf000ab20 <savefpstate+0>:     rd  %psr, %o1
0xf000ab24 <savefpstate+4>:     sethi  %hi(0x1000), %o2
0xf000ab28 <savefpstate+8>:     or  %o1, %o2, %o1
0xf000ab2c <savefpstate+12>:    mov  %o1, %psr
0xf000ab30 <savefpstate+16>:    sethi  %hi(0x2000), %o5
0xf000ab34 <savefpstate+20>:    mov  %g0, %o3
0xf000ab38 <savefpstate+24>:    nop
0xf000ab3c <special_fp_store+0>:        st  %fsr, [ %o0 + 0x80 ]
0xf000ab40 <special_fp_store+4>:        ld  [ %o0 + 0x80 ], %o4
0xf000ab44 <special_fp_store+8>:        btst  %o5, %o4
0xf000ab48 <special_fp_store+12>:       bne  0xf000ab94 <Lfp_storeq>
0xf000ab4c <special_fp_store+16>:       std  %f0, [ %o0 ]
0xf000ab50 <Lfp_finish+0>:      st  %o3, [ %o0 + 0x84 ]
0xf000ab54 <Lfp_finish+4>:      std  %f2, [ %o0 + 8 ]
0xf000ab58 <Lfp_finish+8>:      std  %f4, [ %o0 + 0x10 ]
0xf000ab5c <Lfp_finish+12>:     std  %f6, [ %o0 + 0x18 ]
0xf000ab60 <Lfp_finish+16>:     std  %f8, [ %o0 + 0x20 ]
0xf000ab64 <Lfp_finish+20>:     std  %f10, [ %o0 + 0x28 ]
0xf000ab68 <Lfp_finish+24>:     std  %f12, [ %o0 + 0x30 ]
0xf000ab6c <Lfp_finish+28>:     std  %f14, [ %o0 + 0x38 ]
0xf000ab70 <Lfp_finish+32>:     std  %f16, [ %o0 + 0x40 ]
0xf000ab74 <Lfp_finish+36>:     std  %f18, [ %o0 + 0x48 ]
0xf000ab78 <Lfp_finish+40>:     std  %f20, [ %o0 + 0x50 ]
0xf000ab7c <Lfp_finish+44>:     std  %f22, [ %o0 + 0x58 ]
0xf000ab80 <Lfp_finish+48>:     std  %f24, [ %o0 + 0x60 ]
0xf000ab84 <Lfp_finish+52>:     std  %f26, [ %o0 + 0x68 ]
0xf000ab88 <Lfp_finish+56>:     std  %f28, [ %o0 + 0x70 ]
0xf000ab8c <Lfp_finish+60>:     retl
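
For anyone following the disassembly: the two constants loaded near the top of savefpstate look like PSR and FSR bit masks. The C sketch below restates those instructions, assuming that 0x1000 is the PSR EF (enable FPU) bit and 0x2000 is the FSR QNE (queue not empty) bit as defined by SPARC V8. The macro and function names are made up for illustration and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed SPARC V8 bit positions; names are illustrative only. */
#define SKETCH_PSR_EF  0x00001000u  /* PSR bit 12: FPU enabled */
#define SKETCH_FSR_QNE 0x00002000u  /* FSR bit 13: FP queue not empty */

/* Mirrors "rd %psr, %o1; or %o1, %o2, %o1; mov %o1, %psr":
 * returns the PSR value with the FPU enable bit set. */
uint32_t sketch_psr_enable_fpu(uint32_t psr)
{
    return psr | SKETCH_PSR_EF;
}

/* Mirrors "btst %o5, %o4; bne Lfp_storeq":
 * true when the saved %fsr says the FP queue still holds entries. */
bool sketch_fsr_queue_not_empty(uint32_t fsr)
{
    return (fsr & SKETCH_FSR_QNE) != 0;
}
```

In these terms, the store at special_fp_store is the first FPU instruction executed after the enable bit has been written back to %psr.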

The code at special_fp_store comes from sparc/locore.s, line 5874:

special_fp_store:
    st  %fsr, [%o0 + FS_FSR]    ! f->fs_fsr = getfsr();
    /*
     * Even if the preceding instruction did not trap, the queue
     * is not necessarily empty: this state save might be happening
     * because user code tried to store %fsr and took the FPU
     * from `exception pending' mode to `exception' mode.
     * So we still have to check the blasted QNE bit.
     * With any luck it will usually not be set.
     */
    ld  [%o0 + FS_FSR], %o4 ! if (f->fs_fsr & QNE)
    btst    %o5, %o4
    bnz Lfp_storeq      !   goto storeq;
     std    %f0, [%o0 + FS_REGS + (4*0)]    ! f->fs_f0 = etc;

The comment says 'Even if the preceding instruction did not trap', so I suppose that the trap is not caught in the MP kernel.
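
One more observation, offered as a guess rather than a diagnosis: the faulting instruction stores to [%o0 + 0x80] and the reported address is 0x80, which would be consistent with %o0 being 0, i.e. savefpstate() being handed a null pointer. The C sketch below does that arithmetic. The struct is a hypothetical layout inferred only from the offsets visible in the disassembly (registers at 0x00, %fsr at 0x80, one more word at 0x84); the field and function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout reconstructed from the disassembly offsets. */
struct fpstate_sketch {
    uint32_t fs_regs[32];  /* %f0..%f31, 32 x 4 bytes = 0x80 */
    uint32_t fs_fsr;       /* offset 0x80, written by "st %fsr" */
    uint32_t fs_word84;    /* offset 0x84, written from %o3 */
};

/* Given the data fault address of "st %fsr, [%o0 + 0x80]",
 * recover the value %o0 must have held. */
uintptr_t sketch_base_from_fault(uintptr_t fault_addr)
{
    const size_t off = offsetof(struct fpstate_sketch, fs_fsr);

    assert(fault_addr >= off);  /* otherwise the subtraction would wrap */
    return fault_addr - off;
}
```

With the value from the debugger, sketch_base_from_fault(0x80) gives 0, so the base pointer would have been null if addr is indeed the faulting data address.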


        Regards,

        JKB
