Port-sparc64 archive
Re: [RESEND] User-level window trap when booting NetBSD kernel under QEMU SPARC64
On Mon, 19 May 2014, Mark Cave-Ayland wrote:
> Hi all,
>
> I'm one of the QEMU SPARC/OpenBIOS maintainers and I've been spending my time
> over the past few weeks (and possibly longer!) working on patches so that
> NetBSD kernels will boot under QEMU SPARC64.
>
> I've made some good progress recently, however I'm still experiencing a user
> trap during boot which I don't understand. I've had some previous
> correspondence with Martin on this, but it requires a deep-level understanding
> as to how the SPARC64 memory management code works so I was hoping that you'd
> be able to provide some help with this.
>
> So far I have a set of patches for OpenBIOS which get my 6.1.2 ISO image to
> boot to this point:
>
>
> build@kentang:~/rel-qemu-git/bin$ ./qemu-system-sparc64 -cdrom
> /home/build/src/qemu/image/sparc64/NetBSD-6.1.2-sparc64.iso -bios
> /home/build/src/openbios/openbios-git/openbios-devel/obj-sparc64/openbios-builtin.elf.nostrip
> -boot d -nographic
> OpenBIOS for Sparc64
> Configuration device id QEMU version 1 machine id 0
> kernel cmdline
> CPUs: 1 x SUNW,UltraSPARC-IIi
> UUID: 00000000-0000-0000-0000-000000000000
> Welcome to OpenBIOS v1.1 built on May 12 2014 21:33
> Type 'help' for detailed information
> Trying cdrom:f...
> Not a bootable ELF image
> Not a bootable a.out image
>
> Loading FCode image...
> Loaded 7478 bytes
> entry point is 0x4000
> NetBSD IEEE 1275 Multi-FS Bootblock
> Version $NetBSD: bootblk.fth,v 1.13 2010/06/24 00:54:12 eeh Exp $
> ..
> Jumping to entry point 0000000000100000 for type 0000000000000001...
> switching to new context: entry point 0x100000 stack 0x00000000ffe8aa09
> >> NetBSD/sparc64 OpenFirmware Boot, Revision 1.16
> =0x8870a0
> Loading netbsd: 8071888+553056+339856 [601032+393301]=0x9cd528
> Unimplemented service set-symbol-lookup ([2] -- [0])
>
> Unexpected client interface exception: -1
> 1 tt=30 tstate=4482000605 tpc=0x14984f4 tnpc=0x14984f8
> 2 tt=30 tstate=4411001503 tpc=0x1001804 tnpc=0x1001808
> 3 tt=c0 tstate=4482001604 tpc=0x10094f4 tnpc=0x135fbc8
> Stopped in pid 0.1 (system) at 1008528: nop
> db{0}>
>
>
> The problem is that I'm getting a data_access_exception on the first window
> fill trap executed after the kernel takes over the trap table with
> SUNW,set-trap-table.
>
> Here is the gdb session showing the openfirmware() function after the NetBSD
> kernel has called SUNW,set-trap-table:
>
>
> (gdb) disas 0x1009478, 0x10094f8
> Dump of assembler code from 0x1009478 to 0x10094f8:
> 0x0000000001009478: sethi %hi(0x1800000), %o4
> 0x000000000100947c: btst 1, %sp
> 0x0000000001009480: be %icc, 0x10094f8
> 0x0000000001009484: ldx [ %o4 ], %o4
> 0x0000000001009488: save %sp, -176, %sp
> 0x000000000100948c: rdpr %pil, %i2
> 0x0000000001009490: mov 0xf, %i3
> 0x0000000001009494: cmp %i3, %i2
> 0x0000000001009498: movle %icc, %i2, %i3
> 0x000000000100949c: wrpr %g0, %i3, %pil
> 0x00000000010094a0: mov %i0, %o0
> 0x00000000010094a4: mov %g1, %l1
> 0x00000000010094a8: mov %g2, %l2
> 0x00000000010094ac: mov %g3, %l3
> 0x00000000010094b0: mov %g4, %l4
> 0x00000000010094b4: mov %g5, %l5
> 0x00000000010094b8: mov %g6, %l6
> 0x00000000010094bc: mov %g7, %l7
> 0x00000000010094c0: rdpr %pstate, %l0
> 0x00000000010094c4: call %i4
> 0x00000000010094c8: wrpr 6, %pstate
> => 0x00000000010094cc: wrpr %l0, %pstate
> 0x00000000010094d0: mov %l1, %g1
> 0x00000000010094d4: mov %l2, %g2
> 0x00000000010094d8: mov %l3, %g3
> 0x00000000010094dc: mov %l4, %g4
> 0x00000000010094e0: mov %l5, %g5
> 0x00000000010094e4: mov %l6, %g6
> 0x00000000010094e8: mov %l7, %g7
> 0x00000000010094ec: wrpr %i2, 0, %pil
> 0x00000000010094f0: ret
> 0x00000000010094f4: restore %o0, %g0, %o0
> End of assembler dump.
I'm not sure what we're looking at here. Is this kernel code or OpenBIOS
code? I assume the machine is OK at this point?
> (gdb) info regi
> g0 0x0 0
> g1 0x1 1
> g2 0x7e50000 132448256
> g3 0x18d1c00 26024960
> g4 0x1ae8000 28213248
> g5 0x1000 4096
> g6 0x0 0
> g7 0x0 0
> o0 0x0 0
> o1 0x1 1
> o2 0xfffffffffffffff8 -8
> o3 0xffffffff00000000 -4294967296
> o4 0x1c14230 29442608
> o5 0x1000000 16777216
> sp 0x1c054a1 0x1c054a1
> o7 0x10094c4 16815300
> l0 0x16 22
> l1 0x1 1
> l2 0x7e50000 132448256
> l3 0x18d1c00 26024960
> l4 0x1ae8000 28213248
> l5 0x1000 4096
> l6 0x0 0
> l7 0x0 0
> i0 0x1c05e00 29384192
> i1 0x7e50000 132448256
> i2 0xd 13
> i3 0xf 15
> i4 0xffd0fe60 4291886688
> i5 0x18d1800 26023936
> fp 0x1c05551 0x1c05551
> i7 0x135fbc0 20315072
> pc 0x10094cc 0x10094cc
> npc 0x10094d0 0x10094d0
> state 0x4482000604 294238815748
> fsr 0x0 [ ]
> fprs 0x4 [ FEF ]
> y 0x0 0
> cwp 0x4 4
> pstate 0x6 [ IE PRIV ]
> asi 0x82 130
> ccr 0x44 68
> (gdb)
>
>
> The MMU TLB entries look like this:
>
>
> QEMU 2.0.50 monitor - type 'help' for more information
> (qemu) info tlb
> MMU contexts: Primary: 0, Secondary: 0
> DMMU dump
> [00] VA: ffe00000, PA: 7f00000, 512k, priv, RW, locked, ctx 0 local
> [01] VA: ffe80000, PA: 7f80000, 512k, priv, RW, locked, ctx 0 local
> [02] VA: ffd00000, PA: 1fff0000000, 512k, priv, RO, locked, ctx 0 local
> [03] VA: ffd80000, PA: 1fff0080000, 512k, priv, RO, locked, ctx 0 local
> [04] VA: ffc80000, PA: 7e80000, 512k, priv, RW, locked, ctx 0 local
> [05] VA: 4000, PA: 4000, 8k, priv, RW, unlocked, ctx 0 local
> [06] VA: 6000, PA: 6000, 8k, priv, RW, unlocked, ctx 0 local
> [07] VA: 8000, PA: 8000, 8k, priv, RW, unlocked, ctx 0 local
> [08] VA: a000, PA: a000, 8k, priv, RW, unlocked, ctx 0 local
> [09] VA: c000, PA: c000, 8k, priv, RW, unlocked, ctx 0 local
> [10] VA: e000, PA: e000, 8k, priv, RW, unlocked, ctx 0 local
> [11] VA: 10000, PA: 10000, 8k, priv, RW, unlocked, ctx 0 local
> [12] VA: 12000, PA: 12000, 8k, priv, RW, unlocked, ctx 0 local
> [13] VA: 14000, PA: 14000, 8k, priv, RW, unlocked, ctx 0 local
> [14] VA: 16000, PA: 16000, 8k, priv, RW, unlocked, ctx 0 local
> [15] VA: 18000, PA: 18000, 8k, priv, RW, unlocked, ctx 0 local
> [16] VA: 1a000, PA: 1a000, 8k, priv, RW, unlocked, ctx 0 local
> [17] VA: 100000, PA: 100000, 8k, priv, RW, unlocked, ctx 0 local
> [18] VA: 102000, PA: 102000, 8k, priv, RW, unlocked, ctx 0 local
> [19] VA: 104000, PA: 104000, 8k, priv, RW, unlocked, ctx 0 local
> [20] VA: 106000, PA: 106000, 8k, priv, RW, unlocked, ctx 0 local
> [21] VA: 108000, PA: 108000, 8k, priv, RW, unlocked, ctx 0 local
> [22] VA: 10a000, PA: 10a000, 8k, priv, RW, unlocked, ctx 0 local
> [23] VA: 10c000, PA: 10c000, 8k, priv, RW, unlocked, ctx 0 local
> [24] VA: 10e000, PA: 10e000, 8k, priv, RW, unlocked, ctx 0 local
> [25] VA: 110000, PA: 110000, 8k, priv, RW, unlocked, ctx 0 local
> [26] VA: 112000, PA: 112000, 8k, priv, RW, unlocked, ctx 0 local
> [27] VA: 114000, PA: 114000, 8k, priv, RW, unlocked, ctx 0 local
> [28] VA: ffc7e000, PA: 7e7e000, 8k, priv, RW, unlocked, ctx 0 local
> [29] VA: ffc7a000, PA: 7e7a000, 8k, priv, RW, unlocked, ctx 0 local
> [30] VA: ffc7c000, PA: 7e7c000, 8k, priv, RW, unlocked, ctx 0 local
> [31] VA: ffc78000, PA: 7e78000, 8k, priv, RW, unlocked, ctx 0 local
> [32] VA: ffc76000, PA: 7e76000, 8k, priv, RW, unlocked, ctx 0 local
> [33] VA: ffc72000, PA: 7e72000, 8k, priv, RW, unlocked, ctx 0 local
> [34] VA: ffc70000, PA: 7e70000, 8k, priv, RW, unlocked, ctx 0 local
> [35] VA: ffc6e000, PA: 7e6e000, 8k, priv, RW, unlocked, ctx 0 local
> [36] VA: ffc64000, PA: 7e64000, 8k, priv, RW, unlocked, ctx 0 local
> [37] VA: ffc66000, PA: 7e66000, 8k, priv, RW, unlocked, ctx 0 local
> [38] VA: ffc68000, PA: 7e68000, 8k, priv, RW, unlocked, ctx 0 local
> [39] VA: ffc6a000, PA: 7e6a000, 8k, priv, RW, unlocked, ctx 0 local
> [40] VA: ffc6c000, PA: 7e6c000, 8k, priv, RW, unlocked, ctx 0 local
> [41] VA: ffc62000, PA: 7e62000, 8k, priv, RW, unlocked, ctx 0 local
> [42] VA: 1000000, PA: 7800000, 4M, priv, RO, locked, ctx 0 local
> [43] VA: 1400000, PA: 7400000, 4M, priv, RO, locked, ctx 0 local
> [44] VA: 1800000, PA: 7000000, 4M, priv, RW, locked, ctx 0 local
> [45] VA: ffc60000, PA: 7e60000, 8k, priv, RW, unlocked, ctx 0 local
> [46] VA: 7ffc000, PA: 7e5c000, 8k, priv, RW, unlocked, ctx 0 local
> [47] VA: 7ffe000, PA: 7e5e000, 8k, priv, RW, unlocked, ctx 0 local
> [48] VA: 7ffa000, PA: 7e5a000, 8k, priv, RW, unlocked, ctx 0 local
> [49] VA: 1c0c000, PA: 7e40000, 8k, priv, RW, unlocked, ctx 0 local
> [50] VA: 1c0e000, PA: 7e42000, 8k, priv, RW, unlocked, ctx 0 local
> [51] VA: 1c10000, PA: 7e44000, 8k, priv, RW, unlocked, ctx 0 local
> [52] VA: 1c12000, PA: 7e46000, 8k, priv, RW, unlocked, ctx 0 local
> [53] VA: 1c14000, PA: 7e48000, 8k, priv, RW, unlocked, ctx 0 local
> [54] VA: 1c16000, PA: 7e4a000, 8k, priv, RW, unlocked, ctx 0 local
> [55] VA: 1c18000, PA: 7e4c000, 8k, priv, RW, unlocked, ctx 0 local
> [56] VA: 1c1a000, PA: 7e4e000, 8k, priv, RW, unlocked, ctx 0 local
> [57] VA: e0010000, PA: 7e40000, 64k, priv, RW, locked, ctx 0 local
> [58] VA: 1c04000, PA: 14000, 8k, priv, RW, unlocked, ctx 0 local
> IMMU dump
> [00] VA: ffd00000, PA: 1fff0000000, 512k, priv, locked, ctx 0 local
> [01] VA: ffc80000, PA: 7e80000, 512k, priv, locked, ctx 0 local
> [02] VA: 100000, PA: 100000, 8k, priv, unlocked, ctx 0 local
> [03] VA: 102000, PA: 102000, 8k, priv, unlocked, ctx 0 local
> [04] VA: 10a000, PA: 10a000, 8k, priv, unlocked, ctx 0 local
> [05] VA: 10c000, PA: 10c000, 8k, priv, unlocked, ctx 0 local
> [06] VA: 110000, PA: 110000, 8k, priv, unlocked, ctx 0 local
> [07] VA: 104000, PA: 104000, 8k, priv, unlocked, ctx 0 local
> [08] VA: 108000, PA: 108000, 8k, priv, unlocked, ctx 0 local
> [09] VA: 10e000, PA: 10e000, 8k, priv, unlocked, ctx 0 local
> [10] VA: 106000, PA: 106000, 8k, priv, unlocked, ctx 0 local
> [11] VA: 1000000, PA: 7800000, 4M, priv, locked, ctx 0 local
> [12] VA: 1400000, PA: 7400000, 4M, priv, locked, ctx 0 local
> (qemu)
>
>
> As soon as I hit the restore at 0x10094f4 in gdb, I get a fill_0_normal trap
> which vectors to 0x1001800:
So the fault happens on the last instruction of the previous routine? And
this is *after* the call to SUNW,set-trap-table? This means you should be
running with the kernel's trap table, right?
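
One way to check this from the numbers already in the message is to compute where a SPARC V9 trap vectors. The sketch below follows the V9 layout (TBA<63:15>, one bit for "trap taken at TL > 0", then the trap type shifted left by 5). The helper name trap_vector and the trap base of 0x1000000 are assumptions for illustration; the base is inferred from the kernel text mapping in the TLB dump, not read from %tba.

```c
#include <stdint.h>

/*
 * Hypothetical helper: address a SPARC V9 trap vectors to.
 *   tba            - trap base address (low 15 bits are ignored)
 *   tl_before_trap - trap level at the moment the trap is taken
 *   tt             - 9-bit trap type
 * Each trap table entry holds 8 instructions (32 bytes), hence tt << 5.
 * Traps taken at TL > 0 use the second half of the table (+0x4000).
 */
static uint64_t trap_vector(uint64_t tba, int tl_before_trap, unsigned tt)
{
    uint64_t base = tba & ~UINT64_C(0x7fff);
    uint64_t tl_bit = (tl_before_trap > 0) ? UINT64_C(0x4000) : 0;

    return base | tl_bit | ((uint64_t)(tt & 0x1ff) << 5);
}
```

With an assumed base of 0x1000000, tt=0xc0 taken at TL=0 gives 0x1001800, and tt=0x30 taken at TL>0 gives 0x1004600. Both match the addresses quoted in this thread, which is consistent with the kernel's trap table being in use.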
>
>
> (gdb) disas 0x1001800, 0x100184c
> Dump of assembler code from 0x1001800 to 0x100184c:
> => 0x0000000001001800: wr %g0, 0x11, %asi
> 0x0000000001001804: ldxa [ %sp + 0x7ff ] %asi, %l0
> 0x0000000001001808: ldxa [ %sp + 0x807 ] %asi, %l1
> 0x000000000100180c: ldxa [ %sp + 0x80f ] %asi, %l2
> 0x0000000001001810: ldxa [ %sp + 0x817 ] %asi, %l3
> 0x0000000001001814: ldxa [ %sp + 0x81f ] %asi, %l4
> 0x0000000001001818: ldxa [ %sp + 0x827 ] %asi, %l5
> 0x000000000100181c: ldxa [ %sp + 0x82f ] %asi, %l6
> 0x0000000001001820: ldxa [ %sp + 0x837 ] %asi, %l7
> 0x0000000001001824: ldxa [ %sp + 0x83f ] %asi, %i0
> 0x0000000001001828: ldxa [ %sp + 0x847 ] %asi, %i1
> 0x000000000100182c: ldxa [ %sp + 0x84f ] %asi, %i2
> 0x0000000001001830: ldxa [ %sp + 0x857 ] %asi, %i3
> 0x0000000001001834: ldxa [ %sp + 0x85f ] %asi, %i4
> 0x0000000001001838: ldxa [ %sp + 0x867 ] %asi, %i5
> 0x000000000100183c: ldxa [ %sp + 0x86f ] %asi, %fp
> 0x0000000001001840: ldxa [ %sp + 0x877 ] %asi, %i7
> 0x0000000001001844: restored
> 0x0000000001001848: retry
> End of assembler dump.
So this is presumably fill_0_normal?
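
As a side note on the offsets in that handler: the loads start at %sp + 0x7ff and step by 8, which is the layout of a 64-bit frame with the V9 stack bias of 2047 applied. A rough sketch of the arithmetic follows; the helper names are made up for illustration, and the %sp value used in the note afterwards is the caller's %fp (0x1c05551) from the first register dump.

```c
#include <stdint.h>

#define STACK_BIAS   UINT64_C(0x7ff)       /* V9 64-bit stack bias (2047) */
#define PAGE_8K_MASK (~UINT64_C(0x1fff))   /* mask for an 8k page base */

/* A 64-bit frame is recognisable by an odd (biased) stack pointer. */
static int frame_is_64bit(uint64_t sp)
{
    return (sp & 1) != 0;
}

/* Address of the n-th 8-byte register slot (0 = %l0 ... 15 = %i7). */
static uint64_t fill_slot_addr(uint64_t sp, unsigned slot)
{
    return sp + STACK_BIAS + UINT64_C(8) * slot;
}

/* Base of the 8k page containing va. */
static uint64_t page_8k(uint64_t va)
{
    return va & PAGE_8K_MASK;
}
```

For %sp = 0x1c05551 the first slot is at 0x1c05d50 and the last at 0x1c05dc8, both inside the 8k page at 0x1c04000, which is DMMU entry [58] in the TLB dump.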
> (gdb) info regi
> g0 0x0 0
> g1 0x1f61ec8c2 8424179906
> g2 0x1f60f8682 8423179906
> g3 0xffe11df8 4292943352
> g4 0x0 0
> g5 0x0 0
> g6 0x0 0
> g7 0x0 0
> o0 0x1c05e00 29384192
> o1 0x7e50000 132448256
> o2 0xd 13
> o3 0xf 15
> o4 0xffd0fe60 4291886688
> o5 0x18d1800 26023936
> sp 0x1c05551 0x1c05551
> o7 0x135fbc0 20315072
> l0 0xffffffffffe30c38 -1897416
> l1 0xffe8ac38 4293438520
> l2 0x17500f0 24445168
> l3 0x1746c78 24407160
> l4 0x1816400 25256960
> l5 0x18c0800 25954304
> l6 0x18c0800 25954304
> l7 0x19cd570 27055472
> i0 0xa 10
> i1 0xffe8b0f0 4293439728
> i2 0x20 32
> i3 0xffd0fe60 4291886688
> i4 0x17502f8 24445688
> i5 0x0 0
> fp 0xffe85219 0xffe85219
> i7 0xffd0a988 4291864968
> pc 0x1001800 0x1001800
> npc 0x1001804 0x1001804
> state 0x4482001503 294238819587
> fsr 0x0 [ ]
> fprs 0x4 [ FEF ]
> y 0x0 0
> cwp 0x3 3
> pstate 0x15 [ AG PRIV PEF ]
> asi 0x82 130
> ccr 0x44 68
> (gdb)
You should learn how to use ddb. It has lots of nifty MD commands to dump
supervisor state registers, such as the trap stack.
>
>
> From here you can see that %sp is 0x1c05551, so the first access at %sp +
> 0x7ff bias = 0x1c05d50 which is mapped just before the call to
> SUNW,set-trap-table. But because the access is made using ASI 0x11 which is a
> user ASI then the fill_0_normal invokes a further data_access_exception trap,
> which takes roughly the following path:
>
>
> -> trap 0x30, data_access_exception (0x1004600)
> -> winfault: 0x00000000010081cc
> http://nxr.netbsd.org/xref/src/sys/arch/sparc64/sparc64/locore.s#1721
>
> #1737 we did previously take a datafault, so go to winfixfill
>
> -> winfixfill: 0x000000000100822c
> http://nxr.netbsd.org/xref/src/sys/arch/sparc64/sparc64/locore.s#1756
>
> #1770: we are in PRIV mode, so carry on
>
> #1819: not at trap level 3, so invoke software trap 1 (0x101)
>
> Trap 0x101 invokes the panic/debugger
>
>
> This shows that the 0x101 is being invoked deliberately because a kernel
> mapping is being accessed by a user ASI while the processor is in PSTATE.PRIV
> == 1 mode.
>
> AFAICT the basic logic looks correct, so I am wondering if anyone can comment
> as to what should happen on real hardware? My current thoughts are that the
> initial fill_0_normal trap is incorrect, and instead a supervisor fill trap
> should be used, but I can't quite understand how this is supposed to
> happen.
>
> If anyone has any ideas as to why this is happening and/or what the intended
> behaviour is then I would be very interested to try and understand the memory
> management algorithms. And of course, when it all works then you get the warm
> feeling of being able to add a SPARC64 machine to your buildfarm!
>
> If you've made it this far, then thank you for your time and I look forward to
> hearing from you further
It's been a while since I last looked at the SPARC V9 manual, but ISTR the
%wstate register controls which of the window fill/spill traps is taken
for regular and "other" states.

You need to dump the contents of %wstate.

There are 16 window trap vectors for each operation in both the normal and
nucleus trap tables. I think both trap tables should be pretty much the
same.

The first 8 are for "normal" traps. These are called when %otherwin is 0.
This occurs when all the windows are from the same address space, either
kernel or userland.

The second 8 are for "other" traps. When a process traps from userland to
the kernel, the kernel sets %otherwin to the number of userland
stackframes. Every time the kernel spills a frame, if %otherwin is not
zero the CPU calls one of the "other" traps, and then decrements
%otherwin. But you probably don't care about this right now.

Of those trap vectors:

  0 is used for 32-bit userland stackframes
  1 is used for 64-bit userland stackframes
  2 will check the stack alignment and call one of the above routines.
  4 is used for 32-bit kernel stackframes
  5 is used for 64-bit kernel stackframes
  6 will check the stack alignment and call one of the above routines.

When running in user mode we set %wstate to 022, which means it will call
fill_2_normal and fill_2_other. When running in kernel mode we set
%wstate to 066, which means it will call fill_6_normal and fill_6_other.
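
The selection described above can be sketched as a small function. This assumes the SPARC V9 layout as I remember it: WSTATE.NORMAL in bits 2:0, WSTATE.OTHER in bits 5:3, fill_n_normal trap types starting at 0x0c0, fill_n_other at 0x0e0, four trap-table entries per handler. The function name fill_trap_type is made up for illustration.

```c
#include <stdint.h>

enum {
    TT_FILL_NORMAL_BASE = 0x0c0,  /* fill_0_normal */
    TT_FILL_OTHER_BASE  = 0x0e0   /* fill_0_other */
};

/*
 * Hypothetical helper: trap type raised by a window fill,
 * given the current %wstate and %otherwin values.
 */
static unsigned fill_trap_type(unsigned wstate, unsigned otherwin)
{
    unsigned normal = wstate & 7;         /* WSTATE.NORMAL, bits 2:0 */
    unsigned other  = (wstate >> 3) & 7;  /* WSTATE.OTHER, bits 5:3 */

    if (otherwin == 0)
        return TT_FILL_NORMAL_BASE + 4 * normal;
    return TT_FILL_OTHER_BASE + 4 * other;
}
```

With %otherwin at 0, a %wstate of 066 gives tt=0xd8 (fill_6_normal) and 022 gives tt=0xc8 (fill_2_normal). The tt=c0 in the trap dump corresponds to WSTATE.NORMAL being 0 at the time of the fill, so the value written by the kernel does not appear to be in effect there.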
The sun4u code in locore.s does this:

        /* sun4u */
        set     _C_LABEL(trapbase), %l1
        call    _C_LABEL(prom_set_trap_table_sun4u)     ! Now we should be running 100% from our handlers
         mov    %l1, %o0
7:
        wrpr    %l1, 0, %tba            ! Make sure the PROM didn't foul up.

        /*
         * Switch to the kernel mode and run away.
         */
        wrpr    %g0, WSTATE_KERN, %wstate

So right after installing the trap table it sets the %wstate register to
use trap vector 6 so it will use fill_6_normal.
I'm not entirely sure what's going on here since you didn't have symbols
in the disassembly and there's no stack trace, but I assume the routine
generating the fault is openfirmware() in the kernel.
My guess is either QEMU is ignoring the contents of the %wstate register,
or OpenBIOS is changing the contents of the %wstate register and not
restoring it before returning to the kernel.
Eduardo