Subject: Re: Alpha DS10 Hanging on Generic 1.5.3 kernel
To: r.o.s.s <ross@netbsd.org>
From: Johan A. van Zanten <johan@ewranglers.com>
List: port-alpha
Date: 08/04/2002 15:01:11
---
In message <20020804183114.23565.qmail@mail.netbsd.org>
>Hmm, "the 16 bytes of death". That's probably related to the uart fifo
>length in combination with a hardware or software problem dealing with
>interrupts.

It kind of had that feel, and i now have a little more information to add.

I netboot'd the machine again, mounted /dev/sd0a on /mnt and edited
/mnt/etc/rc.conf. I set rc_configured to "YES" and added:

rpcbind=YES
sshd=YES

 The boot now hangs in a slightly different place, but after the hang, i
can ping the machine, which at least indicates that the kernel isn't
completely hosed. Here are the last 24 lines of boot that i now see:

fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
stray isa irq 10
stray isa irq 10
stray isa irq 10
stray isa irq 10
stray isa irq 10; stopped logging
siop0: switching to single-ended mode
scsibus0: waiting 2 seconds for devices to settle...
siop0: target 0 using tagged queuing
sd0 at scsibus0 target 0 lun 0: <IBM, DDYS-T09170N, S93E> SCSI3 0/direct fixed
siop0: target 0 using 16bit transfers
siop0: target 0 now synchronous at 20.0Mhz, offset 31
sd0: 8748 MB, 15110 cyl, 3 head, 395 sec, 512 bytes/sect x 17916240 sectors
siop0: target 1 using tagged queuing
sd1 at scsibus0 target 1 lun 0: <IBM, DDYS-T09170N, S93E> SCSI3 0/direct fixed
siop0: target 1 using 16bit transfers
siop0: target 1 now synchronous at 20.0Mhz, offset 31
sd1: 8748 MB, 15110 cyl, 3 head, 395 sec, 512 bytes/sect x 17916240 sectors
de0: enabling 10baseT port
root on sd0a dumps on sd0b
root file system type: ffs
swapctl: adding de0: enabling 10baseT port


Note that when the boot appears to hang, the cursor is on the next line
(under the "s" in "swapctl").

The machine is ping-able. However, the IP services that would normally
have been started during a completely healthy boot do not appear to be
responding. Specifically, from a different, working machine, ssh returns
"Connection refused" and "rpcinfo sarasvati" returns "rpcinfo: can't
contact rpcbind: : RPC: Timed out".

 So though the kernel appears semi-alive because it responds to pings,
i think the boot is still hanging.
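
For anyone who wants to repeat the service checks above from another
machine, here is a minimal sketch in Python. It only separates "refused"
(the TCP stack answered, but nothing is listening) from "timeout" (no
answer at all). The names probe_tcp, report and SERVICES are made up for
illustration, and the ports are just the usual ones for sshd (22) and
rpcbind (111). It tests TCP only, so it may not match what rpcinfo sees
if rpcinfo uses a different transport.

```python
import socket

# Usual TCP ports for the two services enabled in rc.conf.
# The keys are only labels used in the report lines.
SERVICES = {"ssh": 22, "rpcbind": 111}


def probe_tcp(host, port, timeout=5.0):
    """Classify one TCP connect attempt.

    Returns "open", "refused", "timeout", or "error: <reason>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The remote TCP stack replied, but no process is listening.
        return "refused"
    except socket.timeout:
        # No reply within the timeout.
        return "timeout"
    except OSError as exc:
        # Name lookup failures, unreachable networks, and so on.
        return f"error: {exc}"


def report(host, services=SERVICES, timeout=5.0):
    """Return one "name/port: result" line per service."""
    return [
        f"{name}/{port}: {probe_tcp(host, port, timeout)}"
        for name, port in services.items()
    ]
```

Calling report("sarasvati") from the working machine would give one line
per service. A "refused" for ssh fits a kernel that is alive with
networking up but where the rc scripts never got as far as starting sshd.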

>We should work with you to resolve the issue but for now, to get going,
>I'm thinking that switching from a serial to a video console will work
>around the problem.  Try plugging in a video card and cross your fingers
>while hoping that SRM will initialize it properly.

 I believe i have a video card in another DS10.  Unfortunately, the
connector looks like SVGA, and believe it or not, while i have around 5
Sun monitors in various stages of functionality, i do not have a single
VGA monitor in my house. :-) I'm unsure what sort of keyboard the DS10s
take.  I have a USB keyboard, but may not have a PC keyboard.

I'll see what PCish hardware i can scare up to give me a different
console, and i'll also try booting the same disk in a different machine.

But first, i'm going to take a nap. :-)


 --johan