Port-sparc64 archive


Re: [RESEND] User-level window trap when booting NetBSD kernel under QEMU SPARC64



On 20/05/14 00:38, Eduardo Horvath wrote:

Yes, that's correct. Unfortunately I don't have a NetBSD build environment so
I've been doing most of the work by disassembling the kernel via QEMU's
gdbstub and comparing against the source in a web browser. So based upon what
you're saying it appears we have a stack like this (indented to show window
saves/restores):

   cpu_initialize() {
     prom_set_trap_table() {
       openfirmware() {
           /* OpenBIOS C code */
           of_client_interface() {
              enter_forth() {
                set_trap_table() {
                   SUNW,set-trap-table
                }
              }
           }
       }
       /* fill_0_normal trap occurs here */
     }

     /* Switch to kernel mode */
     wrpr %g0, WSTATE_KERN, %wstate
   }

The assumption has to be that in order for this to work without errors then no
window/fault traps can occur between calling SUNW,set-trap-table in OpenBIOS
and getting back to cpu_initialize() to set the correct value for %wstate
which is quite a few window levels. AIUI data faults can't happen because the
ASI is set to 0x82 (no fault) which is why it is the fill_0_normal window
fault which is triggering this.
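To make the role of %wstate concrete, here is a small Python model of how a SPARC V9 CPU picks the fill/spill trap vector. It is only a sketch of the architectural rule as I understand it (trap type = base + 4 * n, with n taken from WSTATE.NORMAL when OTHERWIN is 0 and from WSTATE.OTHER otherwise), not code from NetBSD or OpenBIOS. The function names are made up for illustration, and the value 6 is simply the "trap vector 6" mentioned in this thread.

```python
# Toy model of SPARC V9 window trap vector selection (illustrative only).
SPILL_NORMAL_BASE = 0x080  # spill_0_normal
SPILL_OTHER_BASE = 0x0A0   # spill_0_other
FILL_NORMAL_BASE = 0x0C0   # fill_0_normal
FILL_OTHER_BASE = 0x0E0    # fill_0_other


def window_trap_type(wstate, otherwin=0, fill=True):
    """Return the trap type (TT) raised by a window fill or spill."""
    normal = wstate & 0x7         # WSTATE.NORMAL, bits 2:0
    other = (wstate >> 3) & 0x7   # WSTATE.OTHER, bits 5:3
    if otherwin == 0:
        base = FILL_NORMAL_BASE if fill else SPILL_NORMAL_BASE
        n = normal
    else:
        base = FILL_OTHER_BASE if fill else SPILL_OTHER_BASE
        n = other
    # Each fill/spill handler occupies four consecutive trap table entries.
    return base + 4 * n


def handler_offset(tt):
    """Byte offset of a TL=0 trap handler from %tba (32 bytes per entry)."""
    return tt << 5


# With WSTATE.NORMAL still 0 (the state the kernel inherits in this scenario),
# a fill lands in the kernel's fill_0_normal slot.
print(hex(window_trap_type(0)))
# With WSTATE.NORMAL = 6 the same fill would use fill_6_normal instead.
print(hex(window_trap_type(6)))
```

The point of the sketch is that the vector is chosen purely from %wstate and OTHERWIN at the moment of the trap, so once %tba points at the kernel's table, whatever %wstate happens to hold decides which kernel handler runs.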

I'm starting to wonder if setting %wstate to use trap vector 6 should happen
*before* calling prom_set_trap_table()? At the point SUNW,set-trap-table is
called then the kernel is effectively saying "I am taking responsibility for
handling all traps from now on", and so if the kernel cannot handle traps
after this point for any reason, then it is not honouring its contract to
manage the trap table.

The problem comes down to the fact that setting %wstate and calling
SUNW,set-trap-table really need to be a single atomic operation, which is
not really possible.

I'm not sure I understand why this needs to be atomic. Surely the CPU is only running a single thread at this point, and that thread must be in privileged/supervisor mode?

Regardless of this, now I understand this further I need to look into the
OpenBIOS CIF interface in order to see if I can preserve the entire window
state across CIF calls which I suspect might be what Sun's OBP does. Otherwise
it would not be possible to run many versions of NetBSD (and OpenBSD which
suffers from the same problem) under emulation :/

Yes, OBP preserves the window state on entry and restores it on exit.  It
is also written in Forth, so it doesn't use any register windows internally.
It issues one `save' instruction on entry to get a set of registers to use
for the Forth engine and one `restore' when the CIF returns.

Indeed, OBP jumps into Forth at quite a low level compared to OpenBIOS which has quite a bit of C infrastructure bolted on for various bits and pieces. When you say "preserves window state", do you mean just the current window or the complete set of NWINDOWS?

If OpenBIOS uses register windows, things will get really complicated.
You enter the CIF handler with the PROM's register window settings, but
return with the kernel's settings.

Looking at locore.s:

1:

         /* set trap table */
#ifdef SUN4V
         cmp     %l6, CPU_SUN4V
         bne,pt  %icc, 6f
          nop
         /* sun4v */
         set     _C_LABEL(trapbase_sun4v), %o0
         GET_MMFSA %o1
         call    _C_LABEL(prom_set_trap_table_sun4v)     ! Now we should be running 100% from our handlers
          nop

         ba      7f
          nop
6:
#endif
         /* sun4u */
         set     _C_LABEL(trapbase), %l1
         call    _C_LABEL(prom_set_trap_table_sun4u)     ! Now we should be running 100% from our handlers
          mov    %l1, %o0
7:
         wrpr    %l1, 0, %tba                    ! Make sure the PROM didn't foul up.

         /*
          * Switch to the kernel mode and run away.
          */
         wrpr    %g0, WSTATE_KERN, %wstate


So the kernel calls prom_set_trap_table_sun4u() and then immediately sets
the %tba address itself.  You could try moving that call to after the
register window state has been set up so you call
prom_set_trap_table_sun4u() with the kernel trap table and the kernel
window state.  I don't know what either OpenBIOS or OBP would do in that
situation.

Having thought about it more, I think that the arguably "correct" solution as to how to transition from PROM to kernel is to move the setting of %wstate to WSTATE_KERN beforehand like this:

          /* sun4u */
          wrpr    %g0, WSTATE_KERN, %wstate

          set     _C_LABEL(trapbase), %l1
          call    _C_LABEL(prom_set_trap_table_sun4u)     ! Now we should be running 100% from our handlers
           mov    %l1, %o0
7:
          wrpr    %l1, 0, %tba                    ! Make sure the PROM didn't foul up.


Otherwise you're depending on an unreliable assumption: that the entire execution path for calling prom_set_trap_table_sun4u() won't take a window fault between these two points, for any given PROM and any particular NWINDOWS. And to repeat my question from earlier: given how early we are in the boot process, how could the CPU be in anything but privileged/supervisor mode at this point?

Both OBP and OpenBIOS will handle window traps at all levels, which thinking about it realistically is the only way that a PROM could function since there is nothing a PROM can do to detect %wstate before the window trap is taken. You can confirm this yourself from looking at the source code below:

OpenBoot:
http://code.coreboot.org/p/openboot/source/tree/1/obp/arch/sun4u/traptable.fth

OpenBIOS:
http://code.coreboot.org/p/openbios/source/tree/HEAD/trunk/openbios-devel/arch/sparc64/vectors.S

Otherwise, you can just skip the prom_set_trap_table_sun4u() call.  I
added the SUNW,set-trap-table call to make sure OBP was aware the trap
table changed in case it needed to do something funny.

Interesting. You can actually check the OpenBoot source to see what OBP gets up to (and sadly it's nothing particularly exciting) here: http://code.coreboot.org/p/openboot/source/tree/HEAD/trunk/obp/arch/sun4v/forthint.fth. The main thing it does is clear any pending soft interrupts and stop the L14 timer.

Amusingly the particular bug I'm seeing is actually caused by the introduction of prom_set_trap_table_sun4u(), since beforehand %tba was set first immediately followed by %wstate in the same window, so there was absolutely no chance that a window fault could appear at this point! I think that with the introduction of prom_set_trap_table_sun4u() you managed to get lucky that this didn't affect real hardware ;)

Option 2 is to rewrite the OpenBIOS to have SUNW,set-trap-table just
execute the load of the %trapbase register and not do any other fancy stuff.

I've tried moving the setting of the %tba register to the outermost CIF interface routine and I still get the window fault, so it looks as if the only option remaining is...

Option 3 is to save and restore the entire window state on CIF entry and
exit.  This will be a pain in the neck because you not only need to save
all the register window control registers, you also need to cycle through
and save or restore the contents of any dirty register windows.  Or I
suppose you could always save the contents of all the register windows.
This will slow down CIF entry and exit.
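As a rough illustration of what "save and restore the entire window state" means, the following Python model snapshots the window control registers and every window's contents around a CIF call. It is a sketch under stated assumptions, not the OpenBIOS implementation: the real prologue/epilogue would be assembly in the CIF entry path, NWINDOWS = 8 is an arbitrary choice, and WindowState, cif_call and clobbering_firmware are hypothetical names.

```python
import copy
from dataclasses import dataclass, field

NWINDOWS = 8  # assumption; the real value is implementation-dependent


@dataclass
class WindowState:
    """Window control registers plus the contents of every window."""
    cwp: int = 0
    cansave: int = NWINDOWS - 2
    canrestore: int = 0
    otherwin: int = 0
    cleanwin: int = NWINDOWS - 1
    wstate: int = 0
    # 16 registers per window: 8 locals and 8 ins (outs alias the
    # neighbouring window's ins, so they are not stored separately).
    windows: list = field(
        default_factory=lambda: [[0] * 16 for _ in range(NWINDOWS)])


def cif_call(cpu, firmware):
    """Run firmware(cpu), then restore the caller's complete window state."""
    saved = copy.deepcopy(cpu)       # prologue: snapshot everything
    result = firmware(cpu)           # firmware may use windows freely
    for name, value in vars(saved).items():
        setattr(cpu, name, value)    # epilogue: put everything back
    return result


def clobbering_firmware(cpu):
    """Hypothetical firmware routine that disturbs the window state."""
    cpu.cwp = 0
    cpu.canrestore = 0
    cpu.windows[1][5] = 0x1234
    return 0


kernel = WindowState(cwp=3, cansave=4, canrestore=2)
before = copy.deepcopy(kernel)
cif_call(kernel, clobbering_firmware)
print(kernel == before)
```

Copying every window unconditionally is the simpler of the two variants described above; tracking only the dirty windows would copy less at the cost of more bookkeeping.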

With a bit of work, I now have a solution based upon this approach that works with both NetBSD and OpenBSD on all my test images. Sadly as you correctly point out, the prologue and epilogue of the CIF entry into OpenBIOS are now a lot heavier but I personally believe there is greater value in having a slightly slower CIF implementation and being able to run more virtualised OSs for testing under emulation in this instance.


Kind regards,

Mark.


