Subject: Re: mesh driver
To: None <tmpowell@yellowsub.net, mw34@cornell.edu>
From: M L Riechers <mlr@rse.com>
List: port-macppc
Date: 08/02/2000 21:40:15
Thomas Powell <tmpowell@yellowsub.net> queries:
> Is the mesh driver functional? 

>on 8/1/00 10:00 PM, Jake Luck at netbsd@10k.org wrote:
>
>> Monroe William, a few other unfortunate ones and I all have been plagued
>> with the mesh problem on boot. In my case it is a Powerbook 2400c(I guess
>> they shouldn't list that model on the supported list anymore). I have
>> meant to go dig in a bit deeper to find out why but hasn't had the time.
>> Apparently the scsipi_probedev() returns a buffer filled with 0, making
>> driver id and attachment impossible. 20000205 snapshot has a working
>> kernel if i recall correctly.
>
>  Monroe Williams <monroe@pobox.com> then sayeth:
>
>  That's what I'm running on my 7500.  Every later kernel I've built or
>  downloaded has been non-functional as tmpowell@yellowsub.net described.

It is functional with NetBSD current 1.4ZB 10 June 00 on our 7500
upgraded to a PPC 604 (Revision 303).  NetBSD-current was
non-functional from late Feb or early March until ~ 10 June, but I
don't think that had anything to do with the SCSI subsystem.

Early on (Jan-March 99) we had a fair lot of trouble with the MESH
system.  We had installed a 9 gig drive for NetBSD on the SCSI bus
between the MESH controller and the ~500 Meg Quantum (MacOS) drive.
To set up the 9 gig NetBSD drive to run we switched power to it, and
de-switched the Quantum.  (Powered off, of course.)

That should have worked.  Unfortunately, the terminators on the
unpowered Quantum drive were inactive (or unpowered, take your
choice), so we would from time to time get the infamous "Panic: mesh:
FIFO != 0" MESH message. We solved the problem by de-activating the
Quantum terminators, re-making the SCSI cable longer and with more
connectors (and also ensuring at least 1 foot of cable _between_
connectors as per spec), and slapping a good, _active_ terminator on
the end connector.  Now, our powered/unpowered drives work just fine.

Logically, the infamous "Panic: mesh: FIFO != 0" MESH message can and
should be generated in some circumstances by bad or no termination.
Lack of termination causes "ringing" on the line; that is, an
assertion or de-assertion echoes up and down the signal lines from the
end of the lines. These echoes can, if they happen to fall within
certain SCSI timing specs, be interpreted by a receiver as being valid
handshake control signals.  That would stuff the FIFO (First In, First
Out buffer) with bogus bytes, and, for instance, make it seem we read
a sector of 513 bytes -- one too many when it comes to the end of the
FIFO transfer.
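
To put rough numbers on the ringing, here is a minimal sketch (not
from the original thread) using the standard transmission-line
reflection formula.  The 100 ohm cable impedance is an assumption; the
220/330 ohm passive and 110 ohm active values are the usual
single-ended SCSI terminator values, and treating an unpowered
terminator as an open end is a simplification.

```python
import math

def reflection_coefficient(z_load_ohms: float, z0_ohms: float) -> float:
    """Fraction of an incoming voltage edge reflected at the cable end."""
    if math.isinf(z_load_ohms):
        return 1.0  # open (unterminated) end: the whole edge comes back
    return (z_load_ohms - z0_ohms) / (z_load_ohms + z0_ohms)

# Assumption: characteristic impedance of the ribbon cable, in ohms.
Z0_ASSUMED = 100.0

# Passive terminator: 220 ohms to termpwr in parallel with 330 ohms to ground.
PASSIVE_OHMS = 220.0 * 330.0 / (220.0 + 330.0)
# Active terminator: 110 ohms to a regulated voltage.
ACTIVE_OHMS = 110.0

for name, z_load in [("open end", math.inf),
                     ("passive", PASSIVE_OHMS),
                     ("active", ACTIVE_OHMS)]:
    gamma = reflection_coefficient(z_load, Z0_ASSUMED)
    print(f"{name:10s} reflects {gamma:+.2f} of the edge")
```

Under those assumptions an open end sends the whole edge back down the
cable, while either kind of terminator absorbs most of it.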

Also, in the early days of '99 we had big trouble if we didn't id our
NetBSD drive SCSI 0.  If we id'd it say, 1, then some parts of the
SCSI system would confusedly assume it was dealing with drive 0, and
ruin your whole day.  So, we id'd both drives at 0.  I don't know if
this has been corrected -- we don't use a second drive with NetBSD,
and it's worked fine since.

Somebody at Apple apparently thinks that keeping the cable short
overcomes the need to keep to spec (and, by the way, saves on paying
for cable).  To a point, they're correct: if you can keep the cable so
short it's not a "signal" cable, then you don't have to worry about
termination.  But, that's pretty darn short, and SCSI depends also on
terminators to keep the SCSI lines at the proper voltages when some
device is not asserting.
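
As a rough illustration of that second job (a sketch of mine, not
something measured on the 7500): a passive terminator is a resistor
divider, 220 ohms up to termpwr and 330 ohms down to ground, so the
voltage on an undriven line follows termpwr.  Single-ended SCSI
signals are asserted low.  The receiver thresholds below are assumed
TTL-style values, used only to label the result.

```python
R_PULLUP_OHMS = 220.0    # terminator resistor from termpwr to the signal line
R_PULLDOWN_OHMS = 330.0  # terminator resistor from the signal line to ground

# Assumption: TTL-style receiver thresholds, for illustration only.
V_HIGH_MIN = 2.0  # at or above this the line reads as released
V_LOW_MAX = 0.8   # at or below this the line reads as asserted

def idle_line_voltage(termpwr_volts: float) -> float:
    """Voltage a passive terminator holds an undriven line at."""
    return termpwr_volts * R_PULLDOWN_OHMS / (R_PULLUP_OHMS + R_PULLDOWN_OHMS)

def describe(termpwr_volts: float) -> str:
    """Label how a receiver would see an undriven line."""
    volts = idle_line_voltage(termpwr_volts)
    if volts >= V_HIGH_MIN:
        state = "released"
    elif volts <= V_LOW_MAX:
        state = "looks asserted"
    else:
        state = "undefined"
    return f"termpwr {termpwr_volts:.2f} V -> line {volts:.2f} V: {state}"

print(describe(5.0))  # terminator powered
print(describe(0.0))  # terminator unpowered
```

With the terminator powered, the idle line sits well above the assumed
high threshold.  With termpwr missing, the same resistors pull the line
toward ground, which is the asserted level.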

I looked on the motherboard, and couldn't identify terminator ICs,
resistors, or whatever (couldn't identify much of anything really -
maybe the good stuff is on the other side of the board;  didn't really
want to dis-assemble the box to that extent).  So what's a 7500 got
for termination on the MESH side?  (Can you put the MESH in the middle
of the bus?  Don't check the documentation -- it'll tell you to take
the whole box to your friendly local Apple dealer).  I sure don't
know.  So, it would seem to me best to get the best termination you
possibly can on the other end.

There are other gotchas.  The thing that was supposed to power the
terminators on the Quantum drive is power in the cable called
"termpwr".  This is normally provided by the "biggest guy" -- the MESH
end in this case -- but it's up to the implementor, owner, or user (in
that order?) to decide where termpwr is coming from.  It's generally
better to power all termination from the same source and voltage
level, and if you don't have termpwr at all, your neat stand-alone
active terminators won't work.  But if you get _two_ sources of termpwr,
you're in for a heck of a rough ride: then you're trying to feed two
sources of +5V into each other; at best, that causes many strange
problems; at worst, it causes fires.

We blew many a SCSI termpwr fuse on our 162 cards before we found
the small print on how to disable termpwr powering on late model
Conner (apparently Seagate's better idea) drives. It's quite
incredible that they were actually shipping these drives set up with
termpwr powered from the drive by default.  Maybe lost a drive that
way.  Don't know what people do in the real world of w*ndows.  Just
breaks, I guess.

But the most hilarious example was a SCSI scanner (from a large and
famous company which, to protect the guilty, I won't name), model
HP-4c, that we attempted to set up.  This critter gave us fits, just
wouldn't work at all, until it dawned on us to check whether it was
providing termpwr.
Sure enough, it was.  So, we checked the manual.  We checked all over
the box.  We disassembled the thing.  There was _no_ _way_ _to_ _turn_
_it_ _off_.  So, we ended up putting the scanner on the _end_ of the
SCSI bus, (used its internal termination), split the termpwr wire out
of the ribbon cable just before the connector to the scanner, and cut
it out.

This works for us; it's been working fine ever since.  I don't know
what other people do, however.

Thomas Powell <tmpowell@yellowsub.net> furthereth propound:

> I wonder if disconnecting the internal jaz would help - I can boot
> directly into NetBSD, and after transferring all the base sets, am able
> to untar and begin to set up the config files, but invariably end up
> locked up with the mesh error. 

By all means, disconnect anything in sight that you don't need.  (Just
be sure it wasn't providing termination).  God only knows what these
things might be doing.

There is a section in the FAQ which runs:

  Panic: mesh: FIFO != 0 (top)
                 
          If you have an external Zip drive on a machine with the MESH
          SCSI chip, you may need to unplug the Zip drive to boot
          successfully.
                 
          Some people have also seen this bug without Zip drives. One
          person suggested that the MESH driver is more reliable if
          you don't reboot from MacOS into NetBSD. (That is, if you're
          running MacOS, shut down rather than rebooting, and then
          power it back on and boot to NetBSD.)

which, if memory serves, is reflective of a number of devices that may
or may not have worked with the MESH.  I don't know.  I have no
intention of testing any form of Zip drive.  But it is a most
excellent idea to rid the bus of anything you don't need.

And, anything you do need, test.  Don't assume that the manufacturers
have done the reasonable, or even the right, thing with the SCSI bus.
Don't assume that just because a device works with MacOS, it should
work with NetBSD; the MacOS and driver writers have massive amounts of
time and resources, and access to documentation to tweak the software
to hide the effects of the dollars saved on the hardware.

So, my advice:

1.  Make sure that you have at least a good terminator on the free end
    of the SCSI cable -- active if you can swing it.

2.  Make sure that the terminator is powered. (termpwr).

3.  Make sure that termpwr is provided from one and only one source.

4.  Consider re-making your cable for more connectors and to spec.

And this is for SCSI I, II or III single-ended.  If you've got
differential or lvd, that's a whole 'nuther story.
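
For what it's worth, points 1 to 3 can be written down as a small
checklist.  This is only a sketch: the Device fields and the check_bus
function are names I made up for illustration, the device list is
assumed to be in cable order with the last entry on the free end, and
it says nothing about point 4 (cable length and connector spacing).

```python
from dataclasses import dataclass

@dataclass
class Device:
    """One thing plugged into the bus (hypothetical model, for illustration)."""
    name: str
    terminated: bool = False         # termination enabled at this position
    active_terminator: bool = False  # that termination is the active kind
    supplies_termpwr: bool = False   # drives the termpwr wire

def check_bus(devices: list[Device]) -> list[str]:
    """Return a list of problems; devices are given in cable order."""
    if not devices:
        return ["no devices listed"]
    problems = []
    end = devices[-1]  # the free end of the cable
    if not end.terminated:
        problems.append(f"free end ({end.name}) is not terminated")
    elif not end.active_terminator:
        problems.append(f"free end ({end.name}) is passive; active is preferred")
    sources = [d.name for d in devices if d.supplies_termpwr]
    if not sources:
        problems.append("nothing supplies termpwr, so terminators are unpowered")
    elif len(sources) > 1:
        problems.append("termpwr has more than one source: " + ", ".join(sources))
    return problems

# A bus like the one described above: host supplies termpwr, active
# terminator on the end connector.
good_bus = [
    Device("MESH host", supplies_termpwr=True),
    Device("9 gig drive"),
    Device("end terminator", terminated=True, active_terminator=True),
]
# A scanner on the end that insists on supplying termpwr as well.
bad_bus = [
    Device("MESH host", supplies_termpwr=True),
    Device("scanner", terminated=True, supplies_termpwr=True),
]
for problem in check_bus(good_bus) + check_bus(bad_bus):
    print(problem)
```

The first bus passes cleanly; the second is flagged for passive
termination on the end and for two termpwr sources.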

-Mike