Port-sparc64 archive


[RESEND] User-level window trap when booting NetBSD kernel under QEMU SPARC64



Hi all,

I'm one of the QEMU SPARC/OpenBIOS maintainers and I've been spending my time over the past few weeks (and possibly longer!) working on patches so that NetBSD kernels will boot under QEMU SPARC64.

I've made some good progress recently; however, I'm still experiencing a user trap during boot which I don't understand. I've had some previous correspondence with Martin on this, but it requires a deep understanding of how the SPARC64 memory management code works, so I was hoping that you'd be able to provide some help with this.

So far I have a set of patches for OpenBIOS which get my 6.1.2 ISO image to boot to this point:


build@kentang:~/rel-qemu-git/bin$ ./qemu-system-sparc64 -cdrom /home/build/src/qemu/image/sparc64/NetBSD-6.1.2-sparc64.iso -bios /home/build/src/openbios/openbios-git/openbios-devel/obj-sparc64/openbios-builtin.elf.nostrip -boot d -nographic
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on May 12 2014 21:33
  Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image

Loading FCode image...
Loaded 7478 bytes
entry point is 0x4000
NetBSD IEEE 1275 Multi-FS Bootblock
Version $NetBSD: bootblk.fth,v 1.13 2010/06/24 00:54:12 eeh Exp $
..
Jumping to entry point 0000000000100000 for type 0000000000000001...
switching to new context: entry point 0x100000 stack 0x00000000ffe8aa09
>> NetBSD/sparc64 OpenFirmware Boot, Revision 1.16
=0x8870a0
Loading netbsd: 8071888+553056+339856 [601032+393301]=0x9cd528
Unimplemented service set-symbol-lookup ([2] -- [0])

Unexpected client interface exception: -1
1 tt=30 tstate=4482000605 tpc=0x14984f4 tnpc=0x14984f8
2 tt=30 tstate=4411001503 tpc=0x1001804 tnpc=0x1001808
3 tt=c0 tstate=4482001604 tpc=0x10094f4 tnpc=0x135fbc8
Stopped in pid 0.1 (system) at  1008528:        nop
db{0}>


The problem is that I'm getting a data_access_exception on the first window fill trap executed after the kernel takes over the trap table with SUNW,set-trap-table.
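As a cross-check of the three trap entries printed above, here is a minimal Python sketch that decodes their tstate values. It assumes the standard SPARC V9 TSTATE layout (CCR in bits 39:32, ASI in bits 31:24, PSTATE in bits 19:8, CWP in bits 4:0); the function and table names are mine and purely illustrative, and the first column of each log line is treated simply as an entry number.

```python
# PSTATE flag bits as named in SPARC V9 (only the low six bits are listed here).
PSTATE_FLAGS = {0x01: "AG", 0x02: "IE", 0x04: "PRIV", 0x08: "AM", 0x10: "PEF", 0x20: "RED"}

# Only the trap types that appear in the log above.
TRAP_NAMES = {0x30: "data_access_exception", 0xC0: "fill_0_normal"}


def decode_tstate(tstate):
    """Split a TSTATE value into its CCR, ASI, PSTATE and CWP fields."""
    pstate = (tstate >> 8) & 0xFFF
    return {
        "ccr": (tstate >> 32) & 0xFF,
        "asi": (tstate >> 24) & 0xFF,
        "pstate": pstate,
        "flags": [name for bit, name in PSTATE_FLAGS.items() if pstate & bit],
        "cwp": tstate & 0x1F,
    }


# (entry number, tt, tstate) copied from the three log lines.
LOG = [(1, 0x30, 0x4482000605), (2, 0x30, 0x4411001503), (3, 0xC0, 0x4482001604)]

for entry, tt, tstate in LOG:
    d = decode_tstate(tstate)
    print(f"{entry} {TRAP_NAMES[tt]}: asi={d['asi']:#x} "
          f"pstate={d['pstate']:#x} {d['flags']} cwp={d['cwp']}")
```

Entry 2 decodes to ASI 0x11 with PSTATE 0x15 (AG, PRIV, PEF) and CWP 3, which agrees with the register dump taken inside the fill handler further down.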

Here is the gdb session showing the openfirmware() function after the NetBSD kernel has called SUNW,set-trap-table:


(gdb) disas 0x1009478, 0x10094f8
Dump of assembler code from 0x1009478 to 0x10094f8:
   0x0000000001009478:  sethi  %hi(0x1800000), %o4
   0x000000000100947c:  btst  1, %sp
   0x0000000001009480:  be  %icc, 0x10094f8
   0x0000000001009484:  ldx  [ %o4 ], %o4
   0x0000000001009488:  save  %sp, -176, %sp
   0x000000000100948c:  rdpr  %pil, %i2
   0x0000000001009490:  mov  0xf, %i3
   0x0000000001009494:  cmp  %i3, %i2
   0x0000000001009498:  movle  %icc, %i2, %i3
   0x000000000100949c:  wrpr  %g0, %i3, %pil
   0x00000000010094a0:  mov  %i0, %o0
   0x00000000010094a4:  mov  %g1, %l1
   0x00000000010094a8:  mov  %g2, %l2
   0x00000000010094ac:  mov  %g3, %l3
   0x00000000010094b0:  mov  %g4, %l4
   0x00000000010094b4:  mov  %g5, %l5
   0x00000000010094b8:  mov  %g6, %l6
   0x00000000010094bc:  mov  %g7, %l7
   0x00000000010094c0:  rdpr  %pstate, %l0
   0x00000000010094c4:  call  %i4
   0x00000000010094c8:  wrpr  6, %pstate
=> 0x00000000010094cc:  wrpr  %l0, %pstate
   0x00000000010094d0:  mov  %l1, %g1
   0x00000000010094d4:  mov  %l2, %g2
   0x00000000010094d8:  mov  %l3, %g3
   0x00000000010094dc:  mov  %l4, %g4
   0x00000000010094e0:  mov  %l5, %g5
   0x00000000010094e4:  mov  %l6, %g6
   0x00000000010094e8:  mov  %l7, %g7
   0x00000000010094ec:  wrpr  %i2, 0, %pil
   0x00000000010094f0:  ret
   0x00000000010094f4:  restore  %o0, %g0, %o0
End of assembler dump.
(gdb) info regi
g0             0x0      0
g1             0x1      1
g2             0x7e50000        132448256
g3             0x18d1c00        26024960
g4             0x1ae8000        28213248
g5             0x1000   4096
g6             0x0      0
g7             0x0      0
o0             0x0      0
o1             0x1      1
o2             0xfffffffffffffff8       -8
o3             0xffffffff00000000       -4294967296
o4             0x1c14230        29442608
o5             0x1000000        16777216
sp             0x1c054a1        0x1c054a1
o7             0x10094c4        16815300
l0             0x16     22
l1             0x1      1
l2             0x7e50000        132448256
l3             0x18d1c00        26024960
l4             0x1ae8000        28213248
l5             0x1000   4096
l6             0x0      0
l7             0x0      0
i0             0x1c05e00        29384192
i1             0x7e50000        132448256
i2             0xd      13
i3             0xf      15
i4             0xffd0fe60       4291886688
i5             0x18d1800        26023936
fp             0x1c05551        0x1c05551
i7             0x135fbc0        20315072
pc             0x10094cc        0x10094cc
npc            0x10094d0        0x10094d0
state          0x4482000604     294238815748
fsr            0x0      [ ]
fprs           0x4      [ FEF ]
y              0x0      0
cwp            0x4      4
pstate         0x6      [ IE PRIV ]
asi            0x82     130
ccr            0x44     68
(gdb)


The MMU TLB entries look like this:


QEMU 2.0.50 monitor - type 'help' for more information
(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
[00] VA: ffe00000, PA: 7f00000, 512k, priv, RW, locked, ctx 0 local
[01] VA: ffe80000, PA: 7f80000, 512k, priv, RW, locked, ctx 0 local
[02] VA: ffd00000, PA: 1fff0000000, 512k, priv, RO, locked, ctx 0 local
[03] VA: ffd80000, PA: 1fff0080000, 512k, priv, RO, locked, ctx 0 local
[04] VA: ffc80000, PA: 7e80000, 512k, priv, RW, locked, ctx 0 local
[05] VA: 4000, PA: 4000,   8k, priv, RW, unlocked, ctx 0 local
[06] VA: 6000, PA: 6000,   8k, priv, RW, unlocked, ctx 0 local
[07] VA: 8000, PA: 8000,   8k, priv, RW, unlocked, ctx 0 local
[08] VA: a000, PA: a000,   8k, priv, RW, unlocked, ctx 0 local
[09] VA: c000, PA: c000,   8k, priv, RW, unlocked, ctx 0 local
[10] VA: e000, PA: e000,   8k, priv, RW, unlocked, ctx 0 local
[11] VA: 10000, PA: 10000,   8k, priv, RW, unlocked, ctx 0 local
[12] VA: 12000, PA: 12000,   8k, priv, RW, unlocked, ctx 0 local
[13] VA: 14000, PA: 14000,   8k, priv, RW, unlocked, ctx 0 local
[14] VA: 16000, PA: 16000,   8k, priv, RW, unlocked, ctx 0 local
[15] VA: 18000, PA: 18000,   8k, priv, RW, unlocked, ctx 0 local
[16] VA: 1a000, PA: 1a000,   8k, priv, RW, unlocked, ctx 0 local
[17] VA: 100000, PA: 100000,   8k, priv, RW, unlocked, ctx 0 local
[18] VA: 102000, PA: 102000,   8k, priv, RW, unlocked, ctx 0 local
[19] VA: 104000, PA: 104000,   8k, priv, RW, unlocked, ctx 0 local
[20] VA: 106000, PA: 106000,   8k, priv, RW, unlocked, ctx 0 local
[21] VA: 108000, PA: 108000,   8k, priv, RW, unlocked, ctx 0 local
[22] VA: 10a000, PA: 10a000,   8k, priv, RW, unlocked, ctx 0 local
[23] VA: 10c000, PA: 10c000,   8k, priv, RW, unlocked, ctx 0 local
[24] VA: 10e000, PA: 10e000,   8k, priv, RW, unlocked, ctx 0 local
[25] VA: 110000, PA: 110000,   8k, priv, RW, unlocked, ctx 0 local
[26] VA: 112000, PA: 112000,   8k, priv, RW, unlocked, ctx 0 local
[27] VA: 114000, PA: 114000,   8k, priv, RW, unlocked, ctx 0 local
[28] VA: ffc7e000, PA: 7e7e000,   8k, priv, RW, unlocked, ctx 0 local
[29] VA: ffc7a000, PA: 7e7a000,   8k, priv, RW, unlocked, ctx 0 local
[30] VA: ffc7c000, PA: 7e7c000,   8k, priv, RW, unlocked, ctx 0 local
[31] VA: ffc78000, PA: 7e78000,   8k, priv, RW, unlocked, ctx 0 local
[32] VA: ffc76000, PA: 7e76000,   8k, priv, RW, unlocked, ctx 0 local
[33] VA: ffc72000, PA: 7e72000,   8k, priv, RW, unlocked, ctx 0 local
[34] VA: ffc70000, PA: 7e70000,   8k, priv, RW, unlocked, ctx 0 local
[35] VA: ffc6e000, PA: 7e6e000,   8k, priv, RW, unlocked, ctx 0 local
[36] VA: ffc64000, PA: 7e64000,   8k, priv, RW, unlocked, ctx 0 local
[37] VA: ffc66000, PA: 7e66000,   8k, priv, RW, unlocked, ctx 0 local
[38] VA: ffc68000, PA: 7e68000,   8k, priv, RW, unlocked, ctx 0 local
[39] VA: ffc6a000, PA: 7e6a000,   8k, priv, RW, unlocked, ctx 0 local
[40] VA: ffc6c000, PA: 7e6c000,   8k, priv, RW, unlocked, ctx 0 local
[41] VA: ffc62000, PA: 7e62000,   8k, priv, RW, unlocked, ctx 0 local
[42] VA: 1000000, PA: 7800000,   4M, priv, RO, locked, ctx 0 local
[43] VA: 1400000, PA: 7400000,   4M, priv, RO, locked, ctx 0 local
[44] VA: 1800000, PA: 7000000,   4M, priv, RW, locked, ctx 0 local
[45] VA: ffc60000, PA: 7e60000,   8k, priv, RW, unlocked, ctx 0 local
[46] VA: 7ffc000, PA: 7e5c000,   8k, priv, RW, unlocked, ctx 0 local
[47] VA: 7ffe000, PA: 7e5e000,   8k, priv, RW, unlocked, ctx 0 local
[48] VA: 7ffa000, PA: 7e5a000,   8k, priv, RW, unlocked, ctx 0 local
[49] VA: 1c0c000, PA: 7e40000,   8k, priv, RW, unlocked, ctx 0 local
[50] VA: 1c0e000, PA: 7e42000,   8k, priv, RW, unlocked, ctx 0 local
[51] VA: 1c10000, PA: 7e44000,   8k, priv, RW, unlocked, ctx 0 local
[52] VA: 1c12000, PA: 7e46000,   8k, priv, RW, unlocked, ctx 0 local
[53] VA: 1c14000, PA: 7e48000,   8k, priv, RW, unlocked, ctx 0 local
[54] VA: 1c16000, PA: 7e4a000,   8k, priv, RW, unlocked, ctx 0 local
[55] VA: 1c18000, PA: 7e4c000,   8k, priv, RW, unlocked, ctx 0 local
[56] VA: 1c1a000, PA: 7e4e000,   8k, priv, RW, unlocked, ctx 0 local
[57] VA: e0010000, PA: 7e40000,  64k, priv, RW, locked, ctx 0 local
[58] VA: 1c04000, PA: 14000,   8k, priv, RW, unlocked, ctx 0 local
IMMU dump
[00] VA: ffd00000, PA: 1fff0000000, 512k, priv, locked, ctx 0 local
[01] VA: ffc80000, PA: 7e80000, 512k, priv, locked, ctx 0 local
[02] VA: 100000, PA: 100000,   8k, priv, unlocked, ctx 0 local
[03] VA: 102000, PA: 102000,   8k, priv, unlocked, ctx 0 local
[04] VA: 10a000, PA: 10a000,   8k, priv, unlocked, ctx 0 local
[05] VA: 10c000, PA: 10c000,   8k, priv, unlocked, ctx 0 local
[06] VA: 110000, PA: 110000,   8k, priv, unlocked, ctx 0 local
[07] VA: 104000, PA: 104000,   8k, priv, unlocked, ctx 0 local
[08] VA: 108000, PA: 108000,   8k, priv, unlocked, ctx 0 local
[09] VA: 10e000, PA: 10e000,   8k, priv, unlocked, ctx 0 local
[10] VA: 106000, PA: 106000,   8k, priv, unlocked, ctx 0 local
[11] VA: 1000000, PA: 7800000,   4M, priv, locked, ctx 0 local
[12] VA: 1400000, PA: 7400000,   4M, priv, locked, ctx 0 local
(qemu)


As soon as I hit the restore at 0x10094f4 in gdb, I get a fill_0_normal trap which vectors to 0x1001800:


(gdb) disas 0x1001800, 0x100184c
Dump of assembler code from 0x1001800 to 0x100184c:
=> 0x0000000001001800:  wr  %g0, 0x11, %asi
   0x0000000001001804:  ldxa  [ %sp + 0x7ff ] %asi, %l0
   0x0000000001001808:  ldxa  [ %sp + 0x807 ] %asi, %l1
   0x000000000100180c:  ldxa  [ %sp + 0x80f ] %asi, %l2
   0x0000000001001810:  ldxa  [ %sp + 0x817 ] %asi, %l3
   0x0000000001001814:  ldxa  [ %sp + 0x81f ] %asi, %l4
   0x0000000001001818:  ldxa  [ %sp + 0x827 ] %asi, %l5
   0x000000000100181c:  ldxa  [ %sp + 0x82f ] %asi, %l6
   0x0000000001001820:  ldxa  [ %sp + 0x837 ] %asi, %l7
   0x0000000001001824:  ldxa  [ %sp + 0x83f ] %asi, %i0
   0x0000000001001828:  ldxa  [ %sp + 0x847 ] %asi, %i1
   0x000000000100182c:  ldxa  [ %sp + 0x84f ] %asi, %i2
   0x0000000001001830:  ldxa  [ %sp + 0x857 ] %asi, %i3
   0x0000000001001834:  ldxa  [ %sp + 0x85f ] %asi, %i4
   0x0000000001001838:  ldxa  [ %sp + 0x867 ] %asi, %i5
   0x000000000100183c:  ldxa  [ %sp + 0x86f ] %asi, %fp
   0x0000000001001840:  ldxa  [ %sp + 0x877 ] %asi, %i7
   0x0000000001001844:  restored
   0x0000000001001848:  retry
End of assembler dump.
(gdb) info regi
g0             0x0      0
g1             0x1f61ec8c2      8424179906
g2             0x1f60f8682      8423179906
g3             0xffe11df8       4292943352
g4             0x0      0
g5             0x0      0
g6             0x0      0
g7             0x0      0
o0             0x1c05e00        29384192
o1             0x7e50000        132448256
o2             0xd      13
o3             0xf      15
o4             0xffd0fe60       4291886688
o5             0x18d1800        26023936
sp             0x1c05551        0x1c05551
o7             0x135fbc0        20315072
l0             0xffffffffffe30c38       -1897416
l1             0xffe8ac38       4293438520
l2             0x17500f0        24445168
l3             0x1746c78        24407160
l4             0x1816400        25256960
l5             0x18c0800        25954304
l6             0x18c0800        25954304
l7             0x19cd570        27055472
i0             0xa      10
i1             0xffe8b0f0       4293439728
i2             0x20     32
i3             0xffd0fe60       4291886688
i4             0x17502f8        24445688
i5             0x0      0
fp             0xffe85219       0xffe85219
i7             0xffd0a988       4291864968
pc             0x1001800        0x1001800
npc            0x1001804        0x1001804
state          0x4482001503     294238819587
fsr            0x0      [ ]
fprs           0x4      [ FEF ]
y              0x0      0
cwp            0x3      3
pstate         0x15     [ AG PRIV PEF ]
asi            0x82     130
ccr            0x44     68
(gdb)


From here you can see that %sp is 0x1c05551, so the first access is at %sp + 0x7ff (the stack bias) = 0x1c05d50, which falls in a page that is mapped just before the call to SUNW,set-trap-table (DMMU entry [58] above). However, because the access is made using ASI 0x11, which is a user ASI, the fill_0_normal handler invokes a further data_access_exception trap, which takes roughly the following path:


-> trap 0x30, data_access_exception (0x1004600)
  -> winfault: 0x00000000010081cc
     http://nxr.netbsd.org/xref/src/sys/arch/sparc64/sparc64/locore.s#1721

     #1737: we did previously take a datafault, so go to winfixfill

  -> winfixfill: 0x000000000100822c
     http://nxr.netbsd.org/xref/src/sys/arch/sparc64/sparc64/locore.s#1756

     #1770: we are in PRIV mode, so carry on

     #1819: not at trap level 3, so invoke software trap 1 (0x101)

     Trap 0x101 invokes the panic/debugger
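
The two handler addresses in the path above can be reproduced from the SPARC V9 trap vector rule (TBA<63:15>, then a bit that is set when TL > 0 before the trap, then the trap type shifted left by 5). This is only a sketch: the trap table base of 0x1000000 is inferred from the 0x1001800 fill vector rather than read from %tba, and the function name is hypothetical.

```python
TBA = 0x1000000  # assumed trap table base, inferred from the 0x1001800 fill vector


def trap_vector(tba, tt, tl_before_trap):
    """Vector address = TBA<63:15> | (TL > 0) << 14 | TT << 5."""
    base = tba & ~0x7FFF
    tl_bit = 1 if tl_before_trap > 0 else 0
    return base | (tl_bit << 14) | (tt << 5)


print(hex(trap_vector(TBA, 0xC0, 0)))  # fill_0_normal taken from TL=0 -> 0x1001800
print(hex(trap_vector(TBA, 0x30, 1)))  # data_access_exception inside the fill handler -> 0x1004600
```

Both results match the addresses seen in gdb, so the vectoring itself appears to behave as expected.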


This shows that trap 0x101 is being invoked deliberately, because a kernel mapping is being accessed through a user ASI while the processor is in privileged mode (PSTATE.PRIV == 1).
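
To make the failing access concrete, here is a small Python sketch of the address arithmetic and a deliberately simplified permission check. The DMMU rows are copied from the `info tlb` dump above; the rule in `as_if_user_faults` (an as-if-user ASI touching a page marked priv raises data_access_exception) is my simplified model of the MMU behaviour, not code taken from QEMU or NetBSD, and all names are illustrative.

```python
STACK_BIAS = 0x7FF  # 64-bit SPARC V9 stack bias used by the fill handler

# (va, pa, size, priv) for a few DMMU entries from the `info tlb` dump.
DMMU = [
    (0x1000000, 0x7800000, 4 << 20, True),  # [42]
    (0x1400000, 0x7400000, 4 << 20, True),  # [43]
    (0x1800000, 0x7000000, 4 << 20, True),  # [44]
    (0x1C04000, 0x0014000, 8 << 10, True),  # [58]
]

AS_IF_USER_ASIS = {0x10, 0x11}  # as-if-user primary / secondary (little-endian variants omitted)


def lookup(va):
    """Return (pa, priv) for a virtual address, or None if no listed entry covers it."""
    for base, pa, size, priv in DMMU:
        if base <= va < base + size:
            return pa + (va - base), priv
    return None


def as_if_user_faults(asi, priv_page):
    """Simplified model: user-style access to a privileged page faults."""
    return asi in AS_IF_USER_ASIS and priv_page


sp = 0x1C05551
first_load = sp + STACK_BIAS   # first ldxa in the fill handler
last_load = sp + 0x877 + 7     # last byte read by the final ldxa

pa, priv = lookup(first_load)
print(hex(first_load), hex(pa), priv)   # 0x1c05d50 0x15d50 True
print(lookup(last_load) is not None)    # whole register window lies within entry [58]
print(as_if_user_faults(0x11, priv))    # True under the simplified rule
```

Under this model the page is present, so the fault comes from the privilege check rather than from a missing translation.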

AFAICT the basic logic looks correct, so I am wondering if anyone can comment on what should happen on real hardware. My current thinking is that the initial fill_0_normal trap is incorrect and that a supervisor fill trap should be used instead, but I can't quite understand how this is supposed to happen.

If anyone has any ideas as to why this is happening and/or what the intended behaviour is then I would be very interested to try and understand the memory management algorithms. And of course, when it all works then you get the warm feeling of being able to add a SPARC64 machine to your buildfarm!

If you've made it this far, then thank you for your time, and I look forward to hearing from you further.


ATB,

Mark.

