Subject: Re: -current on Ultra 5+ - now it's major siop0 lossage
To: None <port-sparc@NetBSD.ORG>
From: None <eeh@netbsd.org>
List: port-sparc
Date: 01/25/2001 18:07:58
	I'm not having much luck here, kids ...

	First I tried the January 14th snapshot kernel.  I still get the rash of

	hme0: invalid packet size 2048; dropping

	errors as described previously, when using the Ethernet.  And I started
	seeing these:

	Jan 25 05:05:21 netbsd4me /netbsd: DMA IRQ: bus fault dma fifo empty, DSP=0xc0069fec DSA=0xc006df00: last msg_in=0x0 status=0xff
	Jan 25 05:05:21 netbsd4me /netbsd: siop0: scsi bus reset
	Jan 25 05:05:21 netbsd4me /netbsd: cmd 0x1a3ed00 (target 0:0) in reset list
	Jan 25 05:05:21 netbsd4me /netbsd: cmd 0x1a3e888 (target 0:0) in reset list
	Jan 25 05:05:21 netbsd4me /netbsd: cmd 0x1a3ed00 (status 2) about to be processed
	Jan 25 05:05:21 netbsd4me /netbsd: cmd 0x1a3e888 (status 2) about to be processed
	Jan 25 05:05:21 netbsd4me /netbsd: siop0: target 0 now synchronous at 20.0Mhz, offset 16

	every once in a while, on disk transfers over the SCSI bus.  (Actually, I
	checked my messages logs, and I've gotten these ever since using the vanilla
	1.5 GENERIC kernel, but back then it only happened once or twice per each
	reboot cycle.)

[...]

	I did what you asked, Eduardo; I upgraded (well, my kernel anyway) to -current.

	What now?

I have not had direct access to PCI machines for some time now, so 
I'm a bit limited in the amount of help I can provide. 

I seem to recall having seen the 2048 packet size problem
before, and that it was related to interrupt latency,
but I may be mistaken.  Paul Kranenburg did the HME driver, 
so the best thing to do would be to send him email on the 
subject.

As far as the siop problems are concerned, they are new to
me.  Manuel has added tag queuing to that driver recently,
so you should make certain he's aware of the problem.

Having said all that, this could be an issue with the PCI
controller.  What sort of machine is this?  So far we have
mostly been dealing with machines that have UltraSPARC IIi
processors with the on-board PCI controller.  If you have
a machine with an UltraSPARC II and a psycho or psycho+
(first, congratulations on getting it to boot),
you probably have issues with the PCI drivers and the
IOMMU's streaming buffer cache.  Try disabling that cache.

Eduardo