Subject: Problems building SML/NJ for i386 & sparc
To: None <port-sparc@NetBSD.ORG, port-i386@NetBSD.ORG>
From: William O Ferry <WOFerry+@CMU.EDU>
List: port-sparc
Date: 01/28/1998 14:31:44
    For a class I'm taking this semester I want to install SML/NJ
version 110 on my NetBSD/i386 and NetBSD/sparc boxes.  The build routine
(which is rather evil IMHO) supports the "x86-netbsd" target, and built
everything just fine there.  It doesn't support sparc-netbsd, though it
does support other sparc systems (SunOS, Solaris, and NeXTstep, it
seems).  I'm running into problems with the machine-dependent part of
the code, namely what I believe is the overflow exception handler.  FWIW
my sparc is a Classic, so it's a microSPARC processor.

    My first question concerns a comment they have in the
x86-netbsd part of their code.  They say the following:

/* NetBSD (including versions 1.0 and 1.1) generates SIGBUS rather
   than SIGFPE for overflows.  The real fix is a trivial change to
   kernel sources, which has already been reported (NetBSD internal
   problem identification "port-i386/1833"). 

   If you want to fix this on your NetBSD system.  Edit machdep.c in
   directory /sys/arch/i386/i386, and find the line

        setgate(&idt[  4], &IDTVEC(ofl),     0, SDT_SYS386TGT, SEL_KPL);

   Change SEL_KPL to SEL_UPL.  With SEL_KPL, the int overflow trap is
   not accessible at user level, and a protection fault occurs instead
   (thus the seg fault).  SEL_UPL will allow user processes to generate
   this trap.

   For the change to take effect, recompile your kernel, install it
   and reboot. */

I see that this PR was just recently marked closed, and was reported to
be fixed a while ago.  Either way, this would only affect the i386,
correct?  If somebody can tell me around when this was actually fixed I
could add the correct version checks in the code.  The SML interpreter
is of the kind that uses itself to do most of the building, and it built
just fine on the i386 with the code testing both SIGFPE and SIGBUS. 
Here's the actual defines it used:

#    define SIG_FAULT1        SIGFPE
#    define SIG_FAULT2        SIGBUS
#    define INT_DIVZERO(s, c)    0
#    define INT_OVFLW(s, c)    (((s) == SIGFPE) || ((s) == SIGBUS))

    Since it built okay I'd imagine it will run okay.  But if there are
corrections that should be made to the defines above, again I'll take
note of them.  I figured I might as well make a NetBSD-style port of
this while I'm installing it.
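
A minimal sketch of what the x86-netbsd defines above amount to in
practice.  The four macros are copied from the quoted configuration;
the enum and the classify_fault() helper are hypothetical names made up
for illustration and are not part of SML/NJ.

```c
#include <signal.h>

/* Defines quoted from the x86-netbsd configuration. */
#define SIG_FAULT1        SIGFPE
#define SIG_FAULT2        SIGBUS
#define INT_DIVZERO(s, c) 0
#define INT_OVFLW(s, c)   (((s) == SIGFPE) || ((s) == SIGBUS))

/* Hypothetical result type, for illustration only. */
enum fault_kind { FAULT_UNKNOWN, FAULT_DIVZERO, FAULT_OVERFLOW };

/* Hypothetical helper: decide what a delivered signal means,
   using nothing but the macros above. */
enum fault_kind classify_fault(int sig, int code)
{
    (void)code;  /* the x86-netbsd macros ignore the signal code */

    if (INT_DIVZERO(sig, code))
        return FAULT_DIVZERO;
    if (INT_OVFLW(sig, code))
        return FAULT_OVERFLOW;
    return FAULT_UNKNOWN;
}
```

Read this way, both SIGFPE and SIGBUS are reported as overflow and
nothing is ever reported as a divide by zero, so a version check would
presumably only decide whether the SIGBUS half is still needed.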

    The real problems I'm having right now deal with the SIGFPE handling
on the sparc, as well as some sys/* header defines.  From what I can
make out of the machine/* header files and such, this sounds about right
for sparc-netbsd (these also happen to be exactly the defines for SunOS;
Solaris's defines are quite different):

#    define SIG_FAULT1        SIGFPE
#    define INT_DIVZERO(s, c)    (((s) == SIGFPE) && ((c) == FPE_INTDIV_TRAP))
#    define INT_OVFLW(s, c)    (((s) == SIGFPE) && ((c) == FPE_INTOVF_TRAP))

#    define SIG_GetCode(info, scp)    (info)
#    define SIG_GetPC(scp)    ((scp)->sc_pc)
#    define SIG_SetPC(scp, addr)    {            \
    (scp)->sc_pc = (long)(addr);                \
    (scp)->sc_npc = (scp)->sc_pc + 4;            \
    }

    Hopefully from the identifiers used it's semi-clear what each one is
supposed to do, and what it's doing.  Does anything look wrong with
this?  When it gets to the overflow condition the kdump seems to show a
SIGFPE, yet the program then has a bus error and dumps core.  If I use
the defines from x86-netbsd instead, it doesn't bus error but rather
causes SML to complain about an unknown request and it dies cleanly. 
Either way, the program prints "T_INTOF" right before it dies; this
message seems to be coming from the kernel.  Are there problems
with this signal?
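
To make the intent of the sparc defines above concrete, here is a small
sketch that exercises them against a stand-in context structure.  The
macros are the ones quoted above.  Everything else is an assumption for
illustration: struct mock_sigcontext only mimics the two fields the
macros touch, the helper names are hypothetical, and the fallback trap
code values are placeholders to be replaced by whatever the target's
headers define.

```c
#include <signal.h>

/* Placeholder trap codes, used only when the system headers do not
   provide them.  The values are assumptions, not checked against
   NetBSD/sparc. */
#ifndef FPE_INTOVF_TRAP
#define FPE_INTOVF_TRAP 0x01
#endif
#ifndef FPE_INTDIV_TRAP
#define FPE_INTDIV_TRAP 0x14
#endif

/* Defines quoted from the proposed sparc-netbsd configuration. */
#define SIG_FAULT1          SIGFPE
#define INT_DIVZERO(s, c)   (((s) == SIGFPE) && ((c) == FPE_INTDIV_TRAP))
#define INT_OVFLW(s, c)     (((s) == SIGFPE) && ((c) == FPE_INTOVF_TRAP))
#define SIG_SetPC(scp, addr) {                  \
    (scp)->sc_pc = (long)(addr);                \
    (scp)->sc_npc = (scp)->sc_pc + 4;           \
    }

/* Stand-in for the real struct sigcontext: only the two fields
   that SIG_SetPC writes. */
struct mock_sigcontext {
    long sc_pc;   /* program counter */
    long sc_npc;  /* next program counter */
};

enum sparc_fault { SPARC_FAULT_UNKNOWN, SPARC_FAULT_DIVZERO,
                   SPARC_FAULT_OVERFLOW };

/* Hypothetical helper: classify a signal by number and code. */
enum sparc_fault classify_sparc_fault(int sig, int code)
{
    if (INT_DIVZERO(sig, code))
        return SPARC_FAULT_DIVZERO;
    if (INT_OVFLW(sig, code))
        return SPARC_FAULT_OVERFLOW;
    return SPARC_FAULT_UNKNOWN;
}

/* Hypothetical helper: make the interrupted code resume at addr. */
void resume_at(struct mock_sigcontext *scp, void *addr)
{
    SIG_SetPC(scp, addr);
}
```

Unlike the x86-netbsd version, the classification here depends on the
code argument, so it only works if the handler really receives one of
the two expected codes.  Any other code with SIGFPE falls through to
the unknown case.  SIG_SetPC writes both fields because the SPARC
context carries a next-PC alongside the PC.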

    There were also several defines the code expected to find elsewhere,
namely WINDOWSIZE, stack alignment routines, and the values for ST_DIV0
and ST_INT_OVERFLOW.  I got the first two from Solaris's trap.h, and set
WINDOWSIZE to (16*4).  The last two were defined in Solaris as 0x02 and
0x07, but appeared to be either 0x82 and 0x87 or 0x02 and 0x07.  I tried
both pairs, and neither seemed to help.
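
One possible reading of the two pairs of values, offered as an
assumption rather than something checked against the NetBSD headers: on
SPARC V8 a software trap instruction produces a hardware trap type of
0x80 plus the software trap number, so 0x02/0x07 and 0x82/0x87 could be
the same two traps written with and without that offset.  The sketch
below shows the arithmetic.  SPARC_SOFT_TRAP_BASE and the function name
are hypothetical, and WINDOWSIZE is the value mentioned above (16
registers of 4 bytes each).

```c
/* Software trap numbers as given in the Solaris header. */
#define ST_DIV0          0x02
#define ST_INT_OVERFLOW  0x07

/* Register window save area: 16 registers of 4 bytes each. */
#define WINDOWSIZE       (16 * 4)

/* Hypothetical name for the offset the hardware adds to a
   software trap number to form the trap type. */
#define SPARC_SOFT_TRAP_BASE 0x80

/* Hypothetical helper: trap type seen by the kernel for a given
   software trap number (only the low 7 bits are used). */
unsigned trap_type_for_soft_trap(unsigned st)
{
    return SPARC_SOFT_TRAP_BASE + (st & 0x7f);
}
```

If this reading is right, which form is needed depends on whether the
code compares against the trap instruction's operand or against the
trap type reported by the kernel.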

    I know very little about the internal signals of the sparc or
NetBSD/sparc, and was hoping somebody who knows more might be able to
steer me in the correct direction on this.  I'd greatly appreciate any
suggestions.  I can probably provide an account on my sparc, or
certainly at least the package as I have it so far, if it would help.

    Thanks in advance.

                                                          Will Ferry

-----------------------------------------------------------------------
 William O Ferry  <woferry@CMU.EDU> | finger: woferry@Warp.RES.CMU.EDU
 http://light.res.cmu.edu/~woferry/ | talk:   finger for online status
-----------------------------------------------------------------------