Port-prep archive


Re: NetBSD 8.10 does not boot on 7043-140



Howdy Uli,
my guess would be to turn on the DDB or KGDB option in the _COM version of the kernel config, recompile, and try to debug it that way.
http://www.netbsd.org/docs/kernel/kgdb.html

I recently tried to compile with the KGDB kernel option enabled, but the build fails without a few workarounds (I will be making a separate post on this shortly). For this _COM issue, it might be best (once KGDB is fully working) to set up the 7043 to boot from the network via BOOTP/TFTP and use a separate (more powerful) machine as the KGDB REMOTE, cross-compile host, and TFTP/BOOTP server. You could then cross-compile the kernel on the REMOTE machine and push it to the 7043 simply by copying it to the /tftp directory on that same machine. The kernel source would also be on that machine for KGDB to use, so this is probably an ideal setup for a physical machine. There's probably a way to do it through QEMU, but I don't really have any experience with it.

I've only seen this problem with the _COM versions of the kernel, so when you set up KGDB, you might want to have it use COM1 and keep TERM on COM0 to see what's going on.
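
As a rough sketch of what I mean, these are the kind of lines I would add to a copy of the _COM kernel config. The option names follow the pattern used in the i386 GENERIC config and the KGDB document linked above; the 0x2f8 address (the conventional ISA address for the second serial port) and the 9600 rate are my assumptions, and I have not verified that the prep port honours any of this without the workarounds mentioned earlier.

```
# Assumed additions to a copy of the prep _COM kernel config (untested on prep)

options 	DDB		# in-kernel debugger on the console

# Alternatively, remote debugging over the second serial port:
options 	KGDB		# remote debugger stub
options 	KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x2f8,KGDB_DEVRATE=9600

makeoptions	DEBUG="-g"	# also build netbsd.gdb with debug symbols
```

With DEBUG="-g" the build should leave a netbsd.gdb next to the kernel, which is the file gdb on the REMOTE machine would load.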

Like you, I don't have much experience with the kernel (other than compiling), so much of the actual debugging is new to me.

On 5/31/19 5:36 AM, Ulrich Teichert wrote:
Hi,

as others have already observed, NetBSD is broken on the IBM 7043-140:

https://mail-index.netbsd.org/port-prep/2017/04/02/msg000109.html [1]
https://mail-index.netbsd.org/port-prep/2017/06/27/msg000119.html [2]
https://mail-index.netbsd.org/port-prep/2018/08/27/msg000131.html [3]

Artyom Tarasenko's guess was that the breakage has existed since 3.99.22,
but his quick fix (from [1]) does not work anymore.

The bootup of my 7043 (on a serial console) looks similar to [2]:

NetBSD/prep BOOT, Revision 1.9 (Tue Jul 17 14:59:51 UTC 2018)
INTRF
phase mismatch without command
selection timeout without command
unhandled scsi interrupt, sist=0xffff sstat1=0xff DSA=0xffffffff DSP=0x7f7e8fff
XXXXX: fatal error, need reset the bus...
INTRF
phase mismatch without command
selection timeout without command
unhandled scsi interrupt, sist=0xffff sstat1=0xff DSA=0xffffffff DSP=0x7f7e8fff
XXXXX: fatal error, need reset the bus...
INTRF

... And so on until you switch off the power. I am pretty sure that this
is not a hardware fault, as the same machine boots Linux just fine and
the other boot failures look very similar, even the one under qemu.

Can somebody point me in the right direction where to start digging?
I mean, the INTRF and "phase mismatch without command" prints are most
probably from sys/arch/prep/stand/boot/siop.c, function siop_intr, but
as this is my first look into the NetBSD kernel tree, I'd be grateful for
any help ;-)

TIA,
Uli


