Subject: kern/20702: Kernel is unable to handle NICs on Cobalt RaQ2+ (R28 GF8 FIE)
To: None <gnats-bugs@gnats.netbsd.org>
From: None <quest@cistron.nl>
List: netbsd-bugs
Date: 03/14/2003 05:19:53
>Number:         20702
>Category:       kern
>Synopsis:       Kernel is unable to handle NICs on Cobalt RaQ2+ (R28 GF8 FIE)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Mar 14 05:21:00 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Marcel
>Release:        1.6P
>Organization:
>Environment:
NetBSD/Cobalt
>Description:
When trying to install NetBSD using a netboot, the system hangs after a few packets are sent.

RaQ Console gives the following errors:

[snip]
tlp0 at pci0 dev 7 function 0: DECchip 21143 Ethernet, pass 4.1
tlp0: sorry, unable to handle your board
[....]
tlp1 at pci0 dev 12 function 0: DECchip 21143 Ethernet, pass 4.1
tlp1: sorry, unable to handle your board
[....]


Dennis Charnoivanov provided a kernel with debugging output added to the tulip driver. It prints the following on the console:

[....]
tlp0 at pci0 dev 7 function 0: DECchip 21143 Ethernet, pass 4.1
tlp0: SROM size is 2^6*16 bits (128 bytes)
SROM CONTENTS:
        0x00 0x10 0xe0 0x00 0x5f 0x6e 0x50 0x00
        0x00 0xc0 0x1b 0x74 0x00 0xc0 0x1c 0x20
        0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x6c 0x4a 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1d 0x26
tlp_isv_srom: no cksum
tlp_isv_srom: cksum failure
testing old srom...
tlp_parse_old_srom failure, 1
tlp_parse_old_srom failed, last try is: 0 c0 1c cc 0 c0
cmp: tpq(8,0,2b) vs. enaddr(0,c0,1c)
cmp: tpq(0,0,f8) vs. enaddr(0,c0,1c)
cmp: tpq(0,10,e0) vs. enaddr(0,c0,1c)
cmp: tpq(0,40,bc) vs. enaddr(0,c0,1c)
cmp: tpq(0,0,d1) vs. enaddr(0,c0,1c)
cmp: tpq(0,10,57) vs. enaddr(0,c0,1c)
mediasw is null!
tlp0: sorry, unable to handle your board

[....]

tlp1 at pci0 dev 12 function 0: DECchip 21143 Ethernet, pass 4.1
tlp1: SROM size is 2^6*16 bits (128 bytes)
SROM CONTENTS:
        0x00 0x10 0xe0 0x00 0x5f 0x6f 0x50 0x00
        0x00 0xc0 0x1b 0x74 0x00 0xc0 0x1c 0x20
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x6c 0x4a 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1c 0xcc
        0x00 0xc0 0x1c 0xcc 0x00 0xc0 0x1d 0x26
tlp_isv_srom: no cksum
tlp_isv_srom: cksum failure
testing old srom...
tlp_parse_old_srom failure, 1
tlp_parse_old_srom failed, last try is: 0 c0 1c cc 0 c0
cmp: tpq(8,0,2b) vs. enaddr(0,c0,1c)
cmp: tpq(0,0,f8) vs. enaddr(0,c0,1c)
cmp: tpq(0,10,e0) vs. enaddr(0,c0,1c)
cmp: tpq(0,40,bc) vs. enaddr(0,c0,1c)
cmp: tpq(0,0,d1) vs. enaddr(0,c0,1c)
cmp: tpq(0,10,57) vs. enaddr(0,c0,1c)
mediasw is null!
tlp1: sorry, unable to handle your board
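
The failure can be traced by hand from the dumped bytes. The following is a minimal C sketch, not the driver code itself. The array holds the first 32 bytes of the tlp1 dump, and the memcmp() test is the one visible in tulip.c. The scan range (bytes 8 through 31) and both helper names are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* First 32 bytes of the tlp1 SROM, copied from the dump above. */
static const uint8_t srom_tlp1[32] = {
	0x00, 0x10, 0xe0, 0x00, 0x5f, 0x6f, 0x50, 0x00,
	0x00, 0xc0, 0x1b, 0x74, 0x00, 0xc0, 0x1c, 0x20,
	0x00, 0xc0, 0x1c, 0xcc, 0x00, 0xc0, 0x1c, 0xcc,
	0x00, 0xc0, 0x1c, 0xcc, 0x00, 0xc0, 0x1c, 0xcc,
};

/*
 * Same comparison as in tulip.c: returns 1 when bytes 0-7 are not
 * repeated at bytes 16-23, which the driver treats as "not the
 * standard DEC Address ROM format".
 */
static int
is_nonstandard_layout(const uint8_t *srom)
{
	return memcmp(&srom[0], &srom[16], 8) != 0;
}

/*
 * Hypothetical helper: return the offset of the first byte in
 * [start, end) that is neither 0x00 nor 0xff, or -1 if every byte
 * in the range is blank.
 */
static int
first_nonblank_offset(const uint8_t *srom, size_t start, size_t end)
{
	size_t i;

	for (i = start; i < end; i++) {
		if (srom[i] != 0xff && srom[i] != 0)
			return (int)i;
	}
	return -1;
}
```

Under these assumptions, is_nonstandard_layout(srom_tlp1) is true and first_nonblank_offset(srom_tlp1, 8, 32) returns 9 (the 0xc0 byte), so a check that expects only 0x00 or 0xff after the address would reject this SROM.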

>How-To-Repeat:
Install NetBSD (netboot) on this particular RaQ2+ (R28 GF8 FIE).
(dual NIC, SCSI, 256MB RAM).

>Fix:
Dennis customised the kernel so that it skips the check on the contents of the SROM.

This was successful: the RaQ netbooted, and I was able to log in and ping a host.

He considers this "solution" unacceptable, and asked me to file a problem report and include the following patch (tulip.diff):

--- sys/dev/ic/tulip.c	2003/02/26 06:31:10	1.122
+++ sys/dev/ic/tulip.c	2003/03/13 20:08:17
@@ -2439,7 +2439,12 @@
 	u_int32_t cksum;
 
 	if (memcmp(&sc->sc_srom[0], &sc->sc_srom[16], 8) != 0) {
+#if !defined(TLP_COBALT_2114x)
 		/*
+		 * It appears that Cobalt 21143 boards may not conform
+		 * to the rules outlined below.
+		 */
+		/*
 		 * Some vendors (e.g. ZNYX) don't use the standard
 		 * DEC Address ROM format, but rather just have an
 		 * Ethernet address in the first 6 bytes, maybe a
@@ -2454,6 +2459,7 @@
 			    sc->sc_srom[i] != 0)
 				return (0);
 		}
+#endif
 
 		/*
 		 * Sanity check the Ethernet address:
--- sys/arch/cobalt/conf/GENERIC	2003/02/27 19:22:40	1.41
+++ sys/arch/cobalt/conf/GENERIC	2003/03/13 20:08:18
@@ -214,6 +214,7 @@
 #sip*		at pci? dev ? function ?	# SiS 900 Ethernet
 #tl*		at pci? dev ? function ?	# ThunderLAN-based Ethernet
 tlp*		at pci? dev ? function ?	# DECchip 21x4x and clones
+options TLP_COBALT_2114x
 #vr*		at pci? dev ? function ?	# VIA Rhine Fast Ethernet
 #lmc*		at pci? dev ? function ?	# Lan Media Corp SSI/HSSI/DS3
 #rtk*		at pci? dev ? function ?	# Realtek 8129/8139
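
With the blank-byte check compiled out as in the patch above, the code falls through to the Ethernet address sanity check. The sketch below illustrates what that path would presumably yield for these two boards, assuming the address is taken from the first six SROM bytes. The helper names, the ENADDR_LEN constant, and the exact sanity tests (reject multicast or locally administered addresses, reject an all-zero OUI) are assumptions for illustration, not quoted from tulip.c.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ENADDR_LEN	6	/* length of an Ethernet address in bytes */

/* First row of each SROM dump from the debug output. */
static const uint8_t srom_tlp0_head[8] = {
	0x00, 0x10, 0xe0, 0x00, 0x5f, 0x6e, 0x50, 0x00,
};
static const uint8_t srom_tlp1_head[8] = {
	0x00, 0x10, 0xe0, 0x00, 0x5f, 0x6f, 0x50, 0x00,
};

/*
 * Hypothetical helper: copy the first six SROM bytes into enaddr
 * if they look like a usable unicast address. Returns 1 on success,
 * 0 otherwise.
 */
static int
srom_enaddr(const uint8_t *srom, uint8_t enaddr[ENADDR_LEN])
{
	/* Assumed check: not multicast, not locally administered. */
	if (srom[0] & 3)
		return 0;
	/* Assumed check: the OUI must not be all zeroes. */
	if (srom[0] == 0 && srom[1] == 0 && srom[2] == 0)
		return 0;

	memcpy(enaddr, srom, ENADDR_LEN);
	return 1;
}

/* Format an address as xx:xx:xx:xx:xx:xx; buf needs 18 bytes. */
static void
format_enaddr(const uint8_t enaddr[ENADDR_LEN], char buf[18])
{
	snprintf(buf, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
	    enaddr[0], enaddr[1], enaddr[2],
	    enaddr[3], enaddr[4], enaddr[5]);
}
```

Applied to the dumps, this gives 00:10:e0:00:5f:6e for tlp0 and 00:10:e0:00:5f:6f for tlp1.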



>Release-Note:
>Audit-Trail:
>Unformatted: