Subject: Re: CS20 ethernet hang oddities
To: Ed Wensell III <ewensell3@yahoo.com>
From: Stephen M. Jones <smj@cirr.com>
List: port-alpha
Date: 11/08/2002 15:31:40
Ed writes:

> Probably not the problem, but the last time I saw something like this was
> when I accidentally connected the 100Mbps NIC of an AS2100/OpenVMS
> to a forced-to-10Mbps port [1] on a Cisco switch. Entire subnet went dead
> until disconnected.

I think it might be a related issue, but more likely it's noise that the
I2C bus is interpreting. Were you able to use your machine for any length
of time before it just hosed the switch?
 
> Have you verified that the NICs and connected
> switches/routers/hubs/whatever jive with one another? Have you tried
> forcing port speed on both the NICs and switches rather than rely on
> autoselect (if possible)?

I'm actually investigating that :)  I've got the two machines working hard
tossing a large file back and forth on both interfaces to see if I can 
replicate the problem here..

Peter Petrakis wrote:

> 1) what do you have in there for PCI cards???

Each has a tekram DC390 SCSI controller, however I am not actively
using them and they can be removed.

> 2) on the primary ethernet port. Is the NIC controller
> a 82550 or 82559?

They are both i82555 rev 4 for the inphy and i82559 rev 8 for the
fxp.  The fxp driver states it's for the i82557 EtherExpress Pro 10/100
and was derived by Jason Thorpe.

What's strange is that these machines ran 3 to 5 hours before their
lockup happened .. what you say below opens my eyes:

> You see, the Intel NIC is a touchy ASIC and if you're not using the
> Donald Becker driver you're really just asking for trouble. That's coupled
> with the fact that the ethernet controllers have to travel over jump wire
> (the scsi cable) to do their work. There is some chance for signal integrity
> to fail. We corrected this by basically shouting down the wires by removing
> some resistors on the TX/RX lines. I don't remember the revision of the I/O
> module that had that fix. If yours is a production model, it has it.

Donald Becker wrote his driver for Linux, correct?  I do recall that the
ethernet board is not part of the system board and that a cable connects
it .. did DEC use that in the DS20 as well?

> That's the worst case. Best case is we find that the primary nic is a 82559
> which means its eeprom needs to be reprogrammed if it hasn't been done already.
> Otherwise it will twiddle the I2C bus intermittently and cause all sorts of
> problems.

Hmmm.. that's a bit worrisome!  Right now I've got the two machines plugged
into a 4 port micro dumb hub (10mbit) tossing a 115mbit file back and forth.
It's been doing that for at least 3 hours now without an issue.

If I can't replicate the problem here and there are firmware/software mods that
can be made then I suppose I could pick up two PCI ethernet cards.. I've had
relatively stable success with the 3c905Bs in the 5305 .. however, those drivers
were patched to resolve a stray interrupt issue.

Stephen