Subject: Re: Alpha DS10 Hanging on Generic 1.5.3 kernel
To: None <port-alpha@netbsd.org>
From: Johan A. van Zanten <johan@ewranglers.com>
List: port-alpha
Date: 08/06/2002 06:14:45
OK, good news and bad news:

Good News:

  I've installed the latest firmware for the Compaq DS10 and it seemed to
take care of the USB-related kernel problems.  I got it from here:

ftp://ftp.digital.com/pub/DEC/Alpha/firmware/readmes/ds10.html

I installed it using the BOOTP method described in the webpage cited
above. This took the SRM firmware on the machine ("sarasvati") from 5.7 to
6.2.  I did this after reading the following in the FreeBSD 4.6 Alpha
hardware notes:

  The USB ports are not supported and are disabled by the SRM console in
   all recent SRM versions.

This is the FreeBSD document where the above is written:
ftp://ftp6.FreeBSD.org/pub/FreeBSD/releases/alpha/4.6-RELEASE/HARDWARE.HTM#AEN898

If the latest version of SRM disables the built-in USB ports by default,
that suggests the ports are non-functional.

 After the firmware upgrade, the generic NetBSD 1.5.3 kernel booted
without any problems. It did not report the existence of any USB devices
or buses.  Sarasvati (the DS10 in question) came up multi-user, no problem.

Bad News:

 sarasvati still panics when attempting to build a new kernel on an
NFS-mounted file system.

So I then attempted to get a little more info about the panic.

1) ftp'd syssrc.tgz to sarasvati, put it on local disk, and compiled a
GENERIC kernel with symbols. (The only change to the config was
uncommenting 'makeoptions DEBUG="-g"'.)

2) config'd, built, installed new kernel and rebooted.  (Gotta love that
Alpha speed!)


3) NFS mounted kernel source.  config on a new kernel configuration file
 ran to completion; I then ran "make depend".

4) Panic arrived when doing the "make".

Crash and boot verbiage attached below for the curious. This panic still
seems to be associated with NFS, but it differs from the previous one. I'm
using the built-in Ethernet interfaces in the machine.

  Juergen Weiss said he has seen this problem with a pc164 board and NFS,
and that it may be related to the Tulip. Should I open a new PR about this?
Is there anything else I can get out of the kernel debugger? The crash is
reproducible.
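
  For reading the ddb trace below mechanically, here is a minimal sketch
that picks out the frame that was running when the memory management fault
was taken. It is not part of any NetBSD tooling; the function name
faulting_frame and both regular expressions are assumptions based only on
the trace format as it appears in this message.

```python
import re

# Matches ddb trace lines such as "tulip_tx_intr() at tulip_tx_intr+0x208".
FRAME_RE = re.compile(r"^(\w+)\(\) at (\w+)\+(0x[0-9a-fA-F]+)$")
# Matches separators such as "--- memory management fault (from ipl 4) ---".
MARK_RE = re.compile(r"^--- (.+?) ---$")


def faulting_frame(trace, marker="memory management fault"):
    """Return (symbol, offset) of the first frame listed after the marker.

    ddb prints the newest frames first, so the frame directly below the
    marker line is the code that was interrupted by the fault.
    Returns None if the marker (or a frame after it) is not found.
    """
    seen_marker = False
    for raw in trace.splitlines():
        line = raw.strip()
        mark = MARK_RE.match(line)
        if mark:
            # Remember only whether the most recent separator is the marker.
            seen_marker = mark.group(1).startswith(marker)
            continue
        frame = FRAME_RE.match(line)
        if frame and seen_marker:
            return frame.group(2), int(frame.group(3), 16)
    return None


# A few lines copied from the trace in this message (hypothetical variable name).
SAMPLE = """\
uvm_fault() at uvm_fault+0x184
trap() at trap+0x37c
XentMM() at XentMM+0x20
--- memory management fault (from ipl 4) ---
tulip_tx_intr() at tulip_tx_intr+0x208
tulip_intr_handler() at tulip_intr_handler+0x3cc
"""

print(faulting_frame(SAMPLE))  # ('tulip_tx_intr', 520)
```

  Assuming the DEBUG="-g" build left a netbsd.gdb file in the compile
directory, the resulting symbol and offset could then be looked up in gdb
with something like "list *(tulip_tx_intr+0x208)" to find the source line.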

 -johan


Crash:

cc  -O2 -Werror -Wall -Wstrict-prototypes -Wmissing-prototypes  -Wpointer-arith -Wno-uninitialized -Wno-main -mno-fp-regs -I. -I../../../../arch -I../../../.. -nostdinc -DDIAGNOSTIC -DLKM -DNMBCLUSTERS="0x1000" -DMAXUSERS=32 -D_KERNEL -Dalpha  -c ../../../../dev/ic/aic7xxx.c
panic: lockmgr: no context
Stopped at      cpu_Debugger+0x4:       ret     zero,(ra)

db> tr
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0xfc
lockmgr() at lockmgr+0xac
uvm_fault() at uvm_fault+0x184
trap() at trap+0x37c
XentMM() at XentMM+0x20
--- memory management fault (from ipl 4) ---
tulip_tx_intr() at tulip_tx_intr+0x208
tulip_intr_handler() at tulip_intr_handler+0x3cc
tulip_intr_normal() at tulip_intr_normal+0x1c
alpha_shared_intr_dispatch() at alpha_shared_intr_dispatch+0x6c
dec_6600_iointr() at dec_6600_iointr+0x6c
interrupt() at interrupt+0x1dc
XentInt() at XentInt+0x1c
--- interrupt (from ipl 0) ---
idle() at idle+0x24
mi_switch() at mi_switch+0x1a0
ltsleep() at ltsleep+0x2d0
getblk() at getblk+0xf0
nfs_getcacheblk() at nfs_getcacheblk+0xe8
nfs_write() at nfs_write+0x47c
vn_write() at vn_write+0x154
dofilewrite() at dofilewrite+0xd0
sys_write() at sys_write+0xa0
syscall() at syscall+0x1d8
XentSys() at XentSys+0x50
--- syscall (4) ---
--- user mode ---

db> ps
 PID             PPID       PGRP        UID S   FLAGS          COMMAND    WAIT
 948              945        898          0 3  0x4006               as  getblk
 945              944        898          0 3  0x4086               cc    wait
 944              898        898          0 3  0x4086               sh    wait
 898              203        898          0 3  0x4086             make    wait
 209                0          0          0 3 0x20204            nfsio nfsrcvl
 208                0          0          0 3 0x20204            nfsio nfsrcvl
 207                0          0          0 3 0x20204            nfsio nfsrcvl
 206                0          0          0 3 0x20204            nfsio   netio
 203              198        203          0 3  0x4086              ksh   pause
 198                1        198          0 3  0x4086              csh   pause
 196                1        196          0 3    0x84             cron nanosle
 193                1        193          0 3    0x84            inetd   pause
 187                1        187          0 3    0x84             sshd  select
 117                1        117          0 3    0x84          rpcbind  select
 106                1        106          0 3    0x84          syslogd  select
 4                  0          0          0 3 0x20204          ioflush  syncer
 3                  0          0          0 3 0x20204           reaper  reaper
 2                  0          0          0 3 0x20204       pagedaemon daemon_
 1                  0          1          0 3  0x4084             init    wait
 0                 -1          0          0 3 0x20204          swapper schedul

db> sync   
syncing disks... 
fatal kernel trap:

    trap entry = 0x2 (memory management fault)
    a0         = 0x70
    a1         = 0x1
    a2         = 0x0
    pc         = 0xfffffc000057b9c8
    ra         = 0xfffffc000057b834
    curproc    = 0x0

panic: trap
Stopped at      cpu_Debugger+0x4:       ret     zero,(ra)

db> reboot
rebooting...

 ======================================================================

boot:

Entering netbsd at 0xfffffc00003010c0...
[ preserving 453536 bytes of netbsd ELF symbol table ]
consinit: not using prom console
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 1.5.3 (DEBUG) #0: Tue Aug  6 00:27:49 EDT 2002
    root@sarasvati:/local/src/NetBSD/NetBSD-1.5.3/source/usr/src/sys/arch/alpha/compile/DEBUG
COMPAQ AlphaServer DS10 466 MHz
8192 byte page size, 1 processor.
total memory = 1024 MB
(2848 KB reserved for PROM, 1021 MB used by NetBSD)
avail memory = 943 MB
using 6548 buffers containing 52384 KB of memory
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21264-4 (pass 3)
cpu0: Architecture extensions: 303<PAT,MVI,FIX,BWX>
tsc0 at mainbus0: 21272 Core Logic Chipset, Cchip rev 0
tsc0: 2 Dchips, 1 memory bus of 16 bytes
tsc0: arrays present: 512MB, 512MB, 0MB, 0MB, Dchip 0 rev 1
tsp0 at tsc0
pci0 at tsp0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
sio0 at pci0 dev 7 function 0: Acer Labs M1543 PCI-ISA Bridge (rev. 0xc3)
de0 at pci0 dev 9 function 0
de0: interrupting at dec 6600 irq 29
de0: DEC 21143 [10-100Mb/s] pass 4.1
de0: address 08:00:2b:86:77:93
de1 at pci0 dev 11 function 0
de1: interrupting at dec 6600 irq 30
de1: DEC 21143 [10-100Mb/s] pass 4.1
de1: address 08:00:2b:86:77:a8
de1: enabling 10baseT port
pciide0 at pci0 dev 13 function 0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc1)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
pciide0: disabling primary channel (no drives)
pciide0: secondary channel wired to compatibility mode
atapibus0 at pciide0 channel 1
cd0 at atapibus0 drive 0: <COMPAQ  CDR-8435, , 0013> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
pciide0: secondary channel interrupting at isa irq 15
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
siop0 at pci0 dev 15 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
siop0: using on-board RAM
siop0: interrupting at dec 6600 irq 39
scsibus0 at siop0: 16 targets, 8 luns per target
isa0 at sio0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
lpt0 at isa0 port 0x3bc-0x3bf irq 7
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
isabeep0 at pcppi0
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
siop0: switching to single-ended mode
scsibus0: waiting 2 seconds for devices to settle...
siop0: target 0 using tagged queuing
sd0 at scsibus0 target 0 lun 0: <IBM, DDYS-T09170N, S93E> SCSI3 0/direct fixed
siop0: target 0 using 16bit transfers
siop0: target 0 now synchronous at 20.0Mhz, offset 31
sd0: 8748 MB, 15110 cyl, 3 head, 395 sec, 512 bytes/sect x 17916240 sectors
siop0: target 1 using tagged queuing
sd1 at scsibus0 target 1 lun 0: <IBM, DDYS-T09170N, S93E> SCSI3 0/direct fixed
siop0: target 1 using 16bit transfers
siop0: target 1 now synchronous at 20.0Mhz, offset 31
sd1: 8748 MB, 15110 cyl, 3 head, 395 sec, 512 bytes/sect x 17916240 sectors
de0: enabling 10baseT port
root on sd0a dumps on sd0b
root file system type: ffs
swapctl: adding /dev/sd0b as swap device at priority 0
Automatic boot in progress: starting file system checks.