Subject: Re: disks write-back cache
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Patrick Welche <prlw1@newn.cam.ac.uk>
List: tech-kern
Date: 04/29/2003 09:31:54
On Mon, Apr 28, 2003 at 11:15:39PM +0200, Manuel Bouyer wrote:
> On Sun, Apr 27, 2003 at 03:05:42PM +0100, Patrick Welche wrote:
> > 
> > This is fortuitous! Since the new ahc driver causes rubbish to be written
> > to my disks, at least this is something to try: tagged queueing is enabled,
> > and so is write-back cache... Will try switching off write-back cache
> > on Tuesday...
> 
> Well, I've done some tests on my sparc64, and I've not been able to reproduce
> this problem ...
> 
> However, I've seen once what I believe was a (SCSI) disk with a bad cache.
> Once in a while I would get corruption of a read-only filesystem (mounted
> read/write, but almost never written to). The affected blocks were always
> the same. The disk didn't report any error, and the other disks on the same
> bus didn't have this problem. I replaced the disk with another one, of the
> exact same model, and I've not seen this problem since.

I just replaced the kernel with an old one, and have done another
3 successful make releases on that machine, so I really can't
believe it's the disks which are at fault. (With a new kernel
the disklabel of one of the disks was even changed(!)) The best
I can think of is that maybe the new ahc driver reorders commands
more aggressively? Is it possible? Does it expect the "yes I've
written it" from the disk to be true?

Anyway, I'll try switching off write-back to see if that improves
things, and then try switching off tagged queueing. I'm assuming
it's safe to install a new fsck binary with an old (16th April)
kernel to help patch up the disks after they have been messed up
by the new kernel.

Cheers,

Patrick