Subject: Re: my first sparc64 panic :)
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Eduardo Horvath <eeh@turbolinux.com>
List: port-sparc
Date: 08/24/2000 10:45:44
On Thu, 24 Aug 2000, Manuel Bouyer wrote:

> No problems, I can get one in a few seconds when trying to make the tcsh
> package (in fact, it seems to happen at fork/exec time).
> ===> Extracting for tcsh-6.09.00
> trap type 0x34: pc=f12698ac npc=f1269884 pstate=ffffffff98580006<PRIV,IE>
> kernel trap 34: mem address not aligned
> Stopped in sh at        pmap_enter_pv+0x19c:    ldx             [%o3 + 0x8], %o0
> 
> db> tr
> pmap_enter(f19e8120, 27e000, 11d56000, 2, f149fcf0, 0) at pmap_enter+0x358
> uvm_fault(f1933880, 4, f149a8f0, f64382b0, 3, 278000) at uvm_fault+0xf78
> data_access_fault(6c, 27e13a, 10a688, f6445ed0, 0, 27e13a) at data_access_fault+0x488
> Ldatafault_internal(2868d0, 286310, 0, 0, 0, 0) at Ldatafault_internal+0xe0
> db> 
> 
> 
> > 
> > 1) `mach tf' to get the trapframe of the fault.
> 
> db> mach tf
> Trapframe 0xf146e9c0:   tstate: 0x9858000603    pc: 0xf12698ac  npc: 0xf1269884
> y: 0    pil: 7  oldpil: 7       fault: 0x9858000603     kstack: 0x0     tt: 34  Globals:
> 0000000000000000 0000000011d4c000 0000000000000000 00000000f149af08
> 0000000000000000 0000000000000000 0000000000000021 0000000000000000
> outs:
> 0000000300000006 ffffffffffffe000 0000000000008eab 0000000400000005

Hm.  400000005 is clearly junk.

> fffffffffffffff1 00000000f1000000 00000000f6445151 00000000f1267708
> locals:
> 00000000f142dd68 00000000f1a5a008 00000000f142dc00 0000000000000000
> 00000000f141e2c4 00000000f12dba20 00000000f149a770 000000000027e000
> ins:
> 00000000f19e8120 000000000027e000 0000000011d56000 0000000000000000
> 0000000011d509e0 00000000000009f8 00000000f6445211 00000000f1267288
> db> 
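
To spell out why that value is junk: the faulting instruction is
`ldx [%o3 + 0x8], %o0', and ldx loads 8 bytes, so the effective address
has to be 8-byte aligned.  A small sketch of the arithmetic, assuming the
fourth value in the first `outs:' row is %o3 (the helper name is made up
for illustration and is not part of ddb):

```python
# Hypothetical helper, not a ddb command: checks whether a load of
# `size` bytes at base + offset would be naturally aligned.
def is_aligned(base, offset, size):
    return (base + offset) % size == 0

O3 = 0x0000000400000005   # assumed to be %o3, from the `mach tf' output
ea = O3 + 0x8             # effective address of ldx [%o3 + 0x8]

print(hex(ea))                 # 0x40000000d
print(is_aligned(O3, 0x8, 8))  # False, consistent with trap type 0x34
```

An address ending in 0xd cannot be the target of an 8-byte load, which
fits the "mem address not aligned" trap.
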
> 
> > 
> > 2) check curproc's p_vmstate to make sure it has a correct pmap pointer.
> 
> I assume you mean p_vmspace. Is this what we get from VMSPACE with 
> "show all proc /a" ?
> 
> db> show all proc /a
>  PID          COMMAND      STRUCT PROC *            UAREA *     VMSPACE/VM_MAP
>  321               sh         0xf642e290         0xf6454000         0xf5e469b0
>  320               sh         0xf642e510         0xf644c000         0xf5e47930
>  319               sh         0xf642e010         0xf6446000         0xf5e47170
> >318               sh         0xf642e790         0xf6442000         0xf5e47740
>  317               sh         0xf642ec90         0xf643e000         0xf5e46f80
>  310             make         0xf642ea10         0xf643a000         0xf5e47b20
>  309               sh         0xf5e3db80         0xf6434000         0xf5e47360
>  292             make         0xf5e3cf00         0xf6430000         0xf5e46d90
>  291               sh         0xf5e3d900         0xf6428000         0xf5e47550
>  194             make         0xf5e3cc80         0xf6420000         0xf5e463e0
>  178              csh         0xf5e3ca00         0xf5e62000         0xf5e461f0
>  176             cron         0xf5e3d680         0xf6416000         0xf5e46ba0
>  173            inetd         0xf5e3d400         0xf6412000         0xf5e465d0
>  98           syslogd         0xf5e3d180         0xf63fc000         0xf5e467c0
>  4            ioflush         0xf5e3c780         0xf5e54000         0xf1472648
>  3             reaper         0xf5e3c500         0xf5e50000         0xf1472648
>  2         pagedaemon         0xf5e3c280         0xf5e4c000         0xf1472648
>  1               init         0xf5e3c000         0xf5e38000         0xf5e46000
>  0            swapper         0xf1472848         0xf1802000         0xf1472648
> 
> I assume curproc is PID 318
> 
> db> show map /f 0xf5e47740
> MAP 0xf5e47740: [0x0->0xf1000000]
>         #ent=7, sz=269565952, ref=1, version=5, flags=0x1
>         pmap=0xf19e8120(resident=10)

The pmap pointer seems valid in this case.

>  - 0xf641ebb0: 0x100000->0x180000: obj=0xf5e49740/0x0, amap=0x0/0
>         submap=F, cow=T, nc=T, prot(max)=5/7, inh=1, wc=0, adv=0
>  - 0xf641efd0: 0x200000->0x280000: obj=0xf5e49740/0x0, amap=0xf64382b0/0
>         submap=F, cow=T, nc=F, prot(max)=3/7, inh=1, wc=0, adv=0
>  - 0xf641f9f0: 0x280000->0x288000: obj=0x0/0x0, amap=0xf5e5f9d0/0
>         submap=F, cow=T, nc=F, prot(max)=3/7, inh=1, wc=0, adv=0
>  - 0xf641f510: 0x288000->0x292000: obj=0x0/0x0, amap=0xf5e5fc00/0
>         submap=F, cow=T, nc=F, prot(max)=7/7, inh=1, wc=0, adv=0
>  - 0xf641ec10: 0x10200000->0x10202000: obj=0x0/0x0, amap=0xf64388d0/0
>         submap=F, cow=T, nc=F, prot(max)=3/7, inh=1, wc=0, adv=0
>  - 0xf641fbd0: 0xe1000000->0xf0f80000: obj=0x0/0x0, amap=0x0/0
>         submap=F, cow=T, nc=T, prot(max)=0/7, inh=1, wc=0, adv=0
>  - 0xf641e8b0: 0xf0f80000->0xf1000000: obj=0x0/0x0, amap=0xf5e5fce0/0
>         submap=F, cow=T, nc=F, prot(max)=7/7, inh=1, wc=0, adv=0
> db> 
> 
> > 
> > 3) if you can figure out the address of the original fault (possibly from
> > `mach tf /u') you can use `mach pv <page>' to dump the pv_list for that
> > page.
> 
> db> mach tf /u
> Trapframe 0xf6445ed0:   tstate: 0x800008206     pc: 0x10a688    npc: 0x10a68c
> y: 0    pil: 0  oldpil: 0       fault: 0x27e13a kstack: 0x0     tt: 6c  Globals:
> 
> 0000000000000000 0000000000000002 0000000000000010 000000000017d86f
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> outs:
> 00000000002868d0 0000000000286310 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 00000000f0ffeaa1 0000000000000000
> locals:
> 000000000027e978 000000000028e000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> ins:
> 0000000000286800 00000000002868d0 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 00000000f0ffeb61 000000000010a76c
> 
> Would the address be pc or npc ?
> I don't know sparc64 well enough for this.

This is a little complicated.  The faulting address (or what locore.s
thinks is the faulting address) is in the `fault:' field, in this case
0x27e13a.  This is probably a userland VA.  Hm.  It was a protection
fault (tt 0x6c).  I think the only way to get the page is to take the VA
(0x27e13a), dump the submap that contains it, and see if there's a page
for it.
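
As a rough sketch of that lookup, here is the arithmetic done by hand on
the numbers from the `show map /f' output above.  The 8 KB page size is
an assumption about the sparc64 base page size, and the function names
are made up for illustration:

```python
PAGE_SIZE = 0x2000  # assumption: 8 KB base pages on sparc64

# (start, end) of each entry, copied from the `show map /f' output
ENTRIES = [
    (0x100000, 0x180000),
    (0x200000, 0x280000),
    (0x280000, 0x288000),
    (0x288000, 0x292000),
    (0x10200000, 0x10202000),
    (0xe1000000, 0xf0f80000),
    (0xf0f80000, 0xf1000000),
]

def find_entry(va):
    """Return the (start, end) range containing va, or None."""
    for start, end in ENTRIES:
        if start <= va < end:
            return (start, end)
    return None

def page_base(va):
    """Round va down to the start of its page."""
    return va & ~(PAGE_SIZE - 1)

FAULT_VA = 0x27e13a
entry = find_entry(FAULT_VA)
print([hex(x) for x in entry])   # ['0x200000', '0x280000']
print(hex(page_base(FAULT_VA)))  # 0x27e000
```

The page base 0x27e000 is the same value that shows up as the second
argument to pmap_enter in the backtrace, so the VA at least looks
consistent with the trace.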

Eduardo Horvath