Subject: 7500-class boot problems (probably not MESH-related after all)
To: None <mlr@rse.com>
From: Monroe Williams <monroe@pobox.com>
List: port-macppc
Date: 10/26/2000 03:35:34

I'm copying this to the list, since it might be useful to others.

Here's the actual info on the stack of processor cards I have at home:

1 - Apple 601 card (didn't even bother)

1 - Power Computing short form factor 604 card labelled "225/45" which
    doesn't work (don't even get a startup chime with it in the machine)

1 - Apple 604/120 card
    dmesg: "CPU: 604 (Revision 303)"

1 - PowerLogix G3 card labelled "PowerForce G3 220/110/512K"
    (This card has adjustable bus speed and multiplier, but has a bad case
    of the "speculative processing" problem)
    dmesg: "CPU: 750 (Revision 202)"

1 - Newer Tech G3/300 card labelled "M300/200"
    dmesg: "CPU: 750 (Revision 202)"

I have another 604 card in my test machine at work. I believe it's 233MHz,
but I'm not sure whether it's a 604 or a 604e.  I plan to bring it home
tomorrow and test it as well.

The netbsd.EASTERN-1.4S kernel boots fine with the 604/120 card.

The Newer Tech card and the PowerLogix card running at 300MHz (the speed I
usually ran it at before I retired it in favor of the Newer Tech card)
both give the following:

scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
probe(mesh0:0:0): Sense Error Code 0x0
mesh: timeout state=3
mesh: resetting dma

and they're done.

It just so happens that I can turn the PowerLogix card all the way down to
120MHz (40 MHz bus, 3:1 multiplier).  When I do so, booting the
netbsd.EASTERN-1.4S kernel still fails, but differently:

scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
probe(mesh0:0:0): Sense Error Code 0x0
sd0 at scsibus1 targ 0 lun 0 <, , > SCSI0 0/direct fixed
sd0: mode sense (4) returned nonsense; using ficticious geometry
sd0: 8715 MB, 8715 cyl, 64 head, 32 sec, 512 bytes/sect x 17850000 sectors
sd1 at scsibus1 targ 1 lun 0 <, , > SCSI0 0/direct fixed
sd1: mode sense (4) returned nonsense; using ficticious geometry
sd1: 1222 MB, 1222 cyl, 64 head, 32 sec, 512 bytes/sect x 2503872 sectors
probe(mesh0:3:0): Sense Error Code 0x0
sd0 at scsibus1 targ 3 lun 0 <, , > SCSI0 0/direct fixed
sd2(mesh0:3:0): Sense Error Code 0x0
sd2: drive offline
boot device: sd1
root on sd1a dumps on sd1b
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
sd1(mesh0:1:0) Sense Error Code 0x0
no file system for sd1 (dev 0x410)
cannot mount root, error = 79
root device (default sd1a):

Yes, it's prompting me for the root device.  I'm afraid to let it go any
further, lest it piss all over my working root disk.  The capacities of the
devices are correctly retrieved, but something is obviously quite bent.
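
For anyone who wants to double-check the "capacities are correct" claim, here
is a quick arithmetic sketch in Python using only the numbers from the log
above.  It assumes the kernel's "MB" means 2^20 bytes; the function names are
made up for illustration and are not anything from the kernel sources.

```python
# Sanity check of the sizes printed in the log, assuming "MB" means 2**20 bytes.
SECTOR_BYTES = 512
MIB = 1024 * 1024


def capacity_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Whole megabytes (2**20 bytes) on a disk with the given sector count."""
    return sectors * sector_bytes // MIB


def geometry_sectors(cyl, heads, sec):
    """Number of sectors addressable through a cyl/head/sec geometry."""
    return cyl * heads * sec


# (name, cyl, heads, sec, total sectors, reported MB), copied from the log
disks = [
    ("sd0", 8715, 64, 32, 17850000, 8715),
    ("sd1", 1222, 64, 32, 2503872, 1222),
]

for name, cyl, heads, sec, total, reported in disks:
    # The reported size should follow from the sector count alone.
    assert capacity_mb(total) == reported
    # The made-up geometry should never claim more sectors than the disk has.
    assert geometry_sectors(cyl, heads, sec) <= total
    print(name, capacity_mb(total), "MB")
```

With 64 heads and 32 sectors of 512 bytes, one cylinder is exactly 1 MB, which
would explain why the cylinder count and the MB figure match on each disk.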

I was able to scare up more disk space by reappropriating a partition on
another disk that had LinuxPPC installed on it.  ;)

I've retrieved from cvs the sources as of 20000301-UTC and built a kernel
from them.  It fails in the same way as the rest.  I've also built a kernel
from 20000201-UTC sources, and it gets all the way through booting.  Next,
I'll try 20000215-UTC. (See where this is going?)  At some point, I should
have two source trees that are within a couple of days (preferably one day)
of each other, one functional, one not.  This should give us a pretty good
idea which changes to examine more closely.

Not exactly an elegant approach, but it should bloody well isolate the
problem.
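
The date bisection described above can be sketched in Python roughly like
this.  It is only an illustration: bisect_dates, cvs_tag and the is_good
callback are hypothetical names, and in real life each "probe" means checking
out the tree for that date, building a kernel and trying to boot it by hand.
The breakage date in the demo is invented purely to exercise the loop.

```python
from datetime import date, timedelta


def cvs_tag(d):
    """Format a date the way the source snapshots are named above."""
    return d.strftime("%Y%m%d") + "-UTC"


def bisect_dates(good, bad, is_good):
    """Narrow a known-good and a known-bad date down to adjacent days.

    is_good(d) must return True if the tree as of date d boots.
    Returns (last_good, first_bad, probes), where probes lists the
    dates that were tested, in order.
    """
    probes = []
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        probes.append(mid)
        if is_good(mid):
            good = mid   # breakage came in after mid
        else:
            bad = mid    # breakage came in on or before mid
    return good, bad, probes


# Demo with an INVENTED breakage date, standing in for build-and-boot tests.
pretend_break = date(2000, 2, 20)
last_good, first_bad, probes = bisect_dates(
    date(2000, 2, 1),            # 20000201-UTC boots
    date(2000, 3, 1),            # 20000301-UTC fails
    lambda d: d < pretend_break,
)
print("probed:", [cvs_tag(d) for d in probes])
print("last good:", cvs_tag(last_good), "first bad:", cvs_tag(first_bad))
```

Each probe halves the remaining window, so a one-month gap takes about five
kernel builds to get down to a single day.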

Tired of waiting,
-- monroe
------------------------------------------------------------------------
Monroe Williams                                         monroe@pobox.com