netbsd-users: Panics galore, featuring 1.5 & hp300

Subject: Panics galore, featuring 1.5 & hp300
To: None <port-hp300@netbsd.org, netbsd-users@netbsd.org>
From: Jarkko Teppo <jarkko.teppo@er-grp.com>
List: netbsd-users
Date: 03/09/2001 09:06:27
Hello all,
I'm having a really bad first quarter with 1.5 on hp300. First, a couple of
'minor' problems: having softdeps in the kernel and trying to mount a CD-ROM
with:
# mount -t cd9660 /dev/sd3c /cdrom
normally results in a panic. (This leads to interesting problems if
the hp300 just happens to be your main system at home and the only one
with a CD-ROM, *and* all your sources and stuff are on CDs.)

<snip>
sd3 at oscsi0 targ4 lun 0: <MATSHITA, CD-ROM CR-8008, 8.0e> (SCSI-2)
sd3: CD-ROM, 324375 blocks, 2048 bytes/block
<snip>
# mount -t cd9660 /dev/sd3c /mnt
panic: MMU fault
Stopped in mount_cd9660 at _cpu_Debugger+0x6: unlk a6
db> t
_cpu_debugger()
_panic()
_trap()
_worklist_remove()
_softdep_disk_write_complete()
_biodone()
_sdfinish()
_sdstart()
_sdustart()
_sdstrategy()
_sdgetcapacity()
_sdgetinfo()
_sdopen()
_spec_open()
_iso_mountfs()
_cd9660_mount()
_syscall()
_trap0()

After finally getting the sources from the CDs (I was rescued by NextStep)
and getting rid of softdep in the kernel, I now run into these problems.
These *always* happen when a CD is mounted and practically never when a CD
is not mounted. Sorry, no ps from ddb, as I was in the middle of building
teTeX and I had >100 procs running.

I'll keep 1.5 on the HP for a week in case I could help with something
but after that I'll be going back to 1.4.x.

Now here's some stuff from DDB; you decide: kernel bug or borken hw?

First bad, reclen=402a, DIRSIZ=12, namlen=0, flags=5100, entry offsetinblock=0
/: bad dir ino 492928 at offset 0: mangled entry
panic: bad dir
stopped in find at _cpu_debugger+0x6: unlk a6
db> t
_cpu_Debugger(0,0,9,1c2f400,49d3d3c) + 6
_panic(10a0ab,2eaf000,49d3df8,109980,49a62e0) + 60
_ufs_dirbad(49a62e0,0,109665,49a5e9c,0) + 3c
_ufs_lookup(49d3e38) + 2f8
_lookup(49d3f00) + 224
_namei(49d3f00) + 2dc
_change_dir(49d3f00,49051b0) + 12
_sys_chdir(49051b0,49d3f88,49d3f80) + 2e
_syscall(c) + 110
_trap0() + e

db> show vnode
OBJECT 0x124823: locked=78, pgops=0x4e750000, npages=6370816, refs=1630863457

VNODE flags 643200(XWANT,DIROP)
nio 6369792 size 0x61310061 wlistuvm_fault(0x166c54, 0x30006000, 0, 0x1) -> 0x1
 type 8, code [mmu, ssw]: 525
trap type 8, code = 0x525, v = 0x30006437
kernel program counter = 0x14c918
kernel: MMU fault trap
 Caught exception in ddb.
db> continue
syncing disks... panic: lockmgr: locking against myself
Stopped in find at _cpu_Debugger+0x6: unlk a6
db> t
_cpu_Debugger(10412,0,10012,49a5f38,49d3bf8) + 6
_panic(6e591,10012,2,0,1cb7b00) + 60
_lockmgr(49a5f38,10012,49a5f36,49d3c2c,9cf3c) + 460
_genfs_lock(49d3c20) + 18
_vn_lock(49a5e9c,10012,48a8dcc,49a5e9c,49d3c8c) + 66
_vget(49a5e9c,10012) + b0
_ffs_sync(1c2f400,2,1cb7b00,49051b0,1c2f400) + 7a
_sys_sync(49051b0,0,0,9717c,49d3d38) + 7a
_vfs_shutdown(100,49d3d2c,7d3f0,100,0) + 40
_cpu_reboot()
_panic()
previous trace here...

.
.
.
Automatic boot in progress: starting file system checks.
/dev/raid0c: parity Re-write complete
/dev/rsd2a: DIRECTORY CORRUPTED I=484102 OWNER=root MODE=40755
/dev/rsd2a: SIZE=512 MTIME=Dec 18 04:58 2000
DIR=?

/dev/rsd2a: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
.

I'm going to clone /dev/rsd2a to /dev/it so in the future I can just type
# fsck it

The "softdeps in kernel, mounting CDs leads to a panic" problem is very
repeatable; the "bad dir" panic happens once in a while.

Thanks for all the help (in advance),
-- 
jht