Subject: Re: 2100 hangs
To: None <port-pmax@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: port-pmax
Date: 05/01/2001 14:30:59
I wrote:

>> I've got a 2100 [which] hangs every time I try to do a make build.

I have a little more information, in case it says anything to anyone.

In the interests of debugging, I mounted the disk sync and then instead
of make build, did ktrace -i make build, in the hope that the ktrace
would point towards what was failing and the sync would ensure that the
ktrace log made it to disk.

This time it didn't hang.  Instead, well, here's what I see in the
terminal window that I left running a "less +F" on the logfile:

cleandir ===> usr.sbin/ipf/ipsend
Waiting for data... (interrupt to abort)trap: TLB miss (load or instr. fetch) in
 kernel mode
status=0x100a2000, cause=0x108, epc=0x7ffdf428, vaddr=0x7ffdf428
pid=3682 cmd=make usp=0x7ffdf428 ksp=0xc2521f68
Stopped in make at      0x7ffdf428:trap: TLB miss (load or instr. fetch) in kern
el mode
status=0x10022000, cause=0x30002108, epc=0x801a0780, vaddr=0x7ffdf428
pid=3682 cmd=make usp=0x7ffdf428 ksp=0xc2521e20
Stopped in make at      db_disasm+0x10:
        lw      a0,0(v0)
db> 

The first epc value (7ffdf428) is not inside the kernel, so I don't
know why the trap is reported as "in kernel mode".  The second epc
value is db_disasm+0x10, as the "Stopped in" line implies.
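For reference, the "not inside the kernel" remark can be checked
against the standard 32-bit MIPS address map.  This is only a minimal
sketch: the segment boundaries are the architectural kuseg/kseg0/
kseg1/kseg2 ranges, while segment() and the labels are names made up
for this illustration and are not anything from the NetBSD source.

```python
# 32-bit MIPS virtual address map (architectural segment boundaries).
# The names below are illustrative only.
SEGMENTS = [
    (0x00000000, 0x7FFFFFFF, "kuseg (user, TLB-mapped)"),
    (0x80000000, 0x9FFFFFFF, "kseg0 (kernel, unmapped, cached)"),
    (0xA0000000, 0xBFFFFFFF, "kseg1 (kernel, unmapped, uncached)"),
    (0xC0000000, 0xFFFFFFFF, "kseg2 (kernel, TLB-mapped)"),
]


def segment(addr):
    """Return the name of the segment containing a 32-bit address."""
    for lo, hi, name in SEGMENTS:
        if lo <= addr <= hi:
            return name
    raise ValueError("not a 32-bit address: %#x" % addr)


if __name__ == "__main__":
    # Values copied from the two trap reports above.
    for label, addr in [
        ("first epc", 0x7FFDF428),
        ("second epc", 0x801A0780),
        ("first ksp", 0xC2521F68),
    ]:
        print("%-10s %#010x  %s" % (label, addr, segment(addr)))
```

Under that map the first epc falls in kuseg, while the second epc and
both ksp values fall in kernel segments.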

Apparently MNT_SYNC did part of its job, but not all; while fsck is
happy with the filesystem upon rebooting, there is a discrepancy
between the make build logfile and the ktrace.out log: the former
indicates that it got through ipf/ipmon, ipf/ipnat, ipf/ipresend, and
fell over in ipf/ipsend, whereas looking for chdir calls in the latter
makes it appear that it ends with ipf/ipnat.
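In case anyone wants to repeat the comparison, here is a rough sketch
of pulling the chdir targets out of kdump output.  It assumes the
usual kdump text layout (pid, command, record type, then the rest),
with the pathname on the NAMI record following the CALL; I have not
checked that against every kdump version.  The SAMPLE lines are
invented for the illustration and are not taken from the real
ktrace.out, and chdir_targets() is a hypothetical helper name.

```python
import re

# Assumed kdump line layout: "<pid> <command> <TYPE>  <rest>".
LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(CALL|NAMI|RET)\s+(.*)$")


def chdir_targets(lines):
    """Return, in order, the pathnames passed to chdir()."""
    pending = set()   # pids whose most recent CALL was chdir
    targets = []
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        pid, _cmd, kind, rest = m.groups()
        if kind == "CALL":
            if rest.startswith("chdir("):
                pending.add(pid)
            else:
                pending.discard(pid)
        elif kind == "NAMI" and pid in pending:
            targets.append(rest.strip().strip('"'))
            pending.discard(pid)
    return targets


# Invented sample, NOT real data from the failing machine.
SAMPLE = [
    '  3682 make     CALL  chdir(0x10012340)',
    '  3682 make     NAMI  "/usr/src/usr.sbin/ipf/ipmon"',
    '  3682 make     RET   chdir 0',
    '  3682 make     CALL  open(0x10012400,0,0)',
    '  3682 make     NAMI  "Makefile"',
    '  3682 make     RET   open 3',
    '  3682 make     CALL  chdir(0x10012480)',
    '  3682 make     NAMI  "/usr/src/usr.sbin/ipf/ipnat"',
    '  3682 make     RET   chdir 0',
]

if __name__ == "__main__":
    for path in chdir_targets(SAMPLE):
        print(path)
```

The last path printed is the last directory the trace saw make enter,
which is what can be compared with the tail of the make build logfile.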

Interestingly, doing a "mount -o async /dev/rz1a /mnt2" reliably
triggers a very similar panic:


# sync   
# mount -o async /dev/rz1a /mnt2
trap: TLB miss (store) in kernel mode
status=0x8fc34, cause=0xb000010c, epc=0x801d6a88, vaddr=0x4270a9bc
pid=36 cmd=mount_ffs usp=0x7ffff940 ksp=0xc2513880
Stopped in mount_ffs at memset+0x48:    bne     a0,t0,<memset+44>       [addr:0
x801d6a84]
                bdslot: memset+0x4c:    sw      t1,-4(a0)
db> tr
memset+48 (4270a9c0,0,1,c2513da0) ra 80170ac0 sz 0
ffs_mount+4a0 (4270a9c0,7ffffac1,1,c2513da0) ra 800b4a7c sz 1240
sys_mount+56c (4270a9c0,7ffffac1,1,c2513da0) ra 801a5e50 sz 288
syscall+218 (4270a9c0,7ffffac1,1,c2513da0) ra 800311a8 sz 96
mips1_SystemCall+d0 (4270a9c0,7ffffac1,1,c2513da0) ra 4012c0 sz 0
PC 0x4012c0: not in kernel space
_DYNAMIC_LINK+4012c0 (4270a9c0,7ffffac1,1,c2513da0) ra 0 sz 0
User-level: pid 36
db> 
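As a cross-check on the register dumps, the cause values can be
decoded by hand.  The sketch below assumes the MIPS-I (R2000/R3000)
cause register layout as I understand it: BD in bit 31, CE in bits
29..28, IP in bits 15..8, and ExcCode in bits 5..2.  decode_cause()
is just a name invented for this example.

```python
# ExcCode values for MIPS-I, assumed from the R3000 cause register layout.
EXC_CODES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU", 12: "Ov",
}


def decode_cause(cause):
    """Split a 32-bit cause register value into its fields."""
    code = (cause >> 2) & 0xF          # ExcCode, bits 5..2
    return {
        "bd": bool((cause >> 31) & 1),  # exception hit a branch delay slot
        "ce": (cause >> 28) & 0x3,      # coprocessor number field
        "ip": (cause >> 8) & 0xFF,      # pending interrupt bits
        "exc": EXC_CODES.get(code, "reserved"),
    }


if __name__ == "__main__":
    # The three cause values reported above.
    for cause in (0x108, 0x30002108, 0xB000010C):
        f = decode_cause(cause)
        print("%#010x  exc=%-4s bd=%d ce=%d ip=%#04x"
              % (cause, f["exc"], f["bd"], f["ce"], f["ip"]))
```

Under those assumptions the first two traps decode as TLBL (TLB miss
on load or instruction fetch) and the mount one as TLBS (TLB miss on
store) with BD set.  BD set would mean the faulting instruction is the
one at epc+4, that is memset+0x4c, which agrees with the bdslot line
and with vaddr being 4 below the 4270a9c0 shown in the trace.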

I haven't yet investigated further, and unfortunately I don't have a
stack trace from the earlier TLB-miss panic (like an idiot, I rebooted
before I thought of it).  I'm going to see if that's reproducible
enough to get one.

					der Mouse

			       mouse@rodents.montreal.qc.ca
		     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B