Subject: Re: 3ware card troubles?
To: None <tls@rek.tjls.com>
From: Brian Buhrow <buhrow@lothlorien.nfbcal.org>
List: tech-kern
Date: 04/07/2003 00:00:22
	Hello.  Well, actually, I lied about the 5 seconds -- entirely too
tired.  It's actually 2 seconds.  That's still a long time, but I've found
that this card isn't as speedy as some of the smaller ones.  Plus, my hope
is that the watchdog timer is supplemental and not primary to keeping
things going.
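	For anyone following along, the idea behind the watchdog can be sketched
roughly as below.  This is not the code from the twe driver; it is a minimal
user-space illustration, and every name in it (sketch_req,
sketch_watchdog_scan, the tick-based clock) is made up for the example.  It
assumes each outstanding request records the tick at which it was handed to
the card, and that a periodic timer calls the scan with the current tick.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request bookkeeping; not the real twe structures. */
struct sketch_req {
	bool busy;		/* handed to the card, not yet acknowledged */
	unsigned long issued;	/* tick at which the request was submitted */
};

/*
 * Count the requests that have been outstanding for at least `timeout`
 * ticks.  A real driver would poll or reset the card for each of these
 * instead of just counting them.  Assumes issued <= now for busy requests.
 */
static size_t
sketch_watchdog_scan(const struct sketch_req *reqs, size_t n,
    unsigned long now, unsigned long timeout)
{
	size_t stalled = 0;

	for (size_t i = 0; i < n; i++) {
		if (reqs[i].busy && now - reqs[i].issued >= timeout)
			stalled++;
	}
	return stalled;
}
```

	With a timeout of 2 ticks, a request issued at tick 0 is reported as
stalled from tick 2 onward, while an idle slot is never reported.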
	Initial tests are going very well.  One oddity we're seeing is that at
some point while the machine was running, it stopped updating netstat -i
statistics and load average statistics.  I don't know if this was due to a
pilot error on my part, getting the running kernel out of sync with the one
in the root filesystem, or if it was due to some interrupt inversion going
on in the kernel.  In any case, the problem seemed to be entirely cosmetic,
and had no effect on actual running processes.  I've updated the kernel
with a minor semantic change to the twe driver, and made sure all copies of
the kernel are in the correct places.  Now, we'll let it run again and see
if we observe the same behavior again.
-Brian

On Apr 6,  6:08pm, Thor Lancelot Simon wrote:
} Subject: Re: 3ware card troubles?
} On Sun, Apr 06, 2003 at 05:33:27AM -0700, Brian Buhrow wrote:
} > 	Hello.  Following up on my own post I realized after I wrote the last
} > message that I had it completely backward and all is well in the ld driver.
} > In fact, Andrew is completely correct that the 3ware card is losing
} > interrupts and failing to acknowledge requests to the higher level drivers.
} > Whether this is due to having interrupts blocked at the time they come in,
} > or if the card is just going to sleep, I can't tell.  In any case, I wrote
} > a watchdog routine (twe_watchdog) which is set up to yank on the card if it
} > hasn't acknowledged a request in 5 seconds.  I've installed a kernel with
} 
} 5 seconds seems like far too long.  1 second or even 1/2 second seems much
} better; polling the card isn't much work, after all.  Hell, 1/10 second
} might be acceptable; it's hard to see why _not_.  If you have the problem,
} best to catch it _fast_.
} 
} > this change on our machine, and it immediately works better than it's worked
} > in the last three weeks.  We'll beat on this a bit, and, if all goes well,
} > we'll ask about getting it pulled into the main source tree.  Who is the
} > best person to do that for us?
} 
} Andy, I think.
} 
} Thor
>-- End of excerpt from Thor Lancelot Simon