tech-kern archive

Re: Unallocated inode



EF> Is there an mpt(4) controller involved? If yes, did you get any timeouts on it?

PR> Nope, no mpt(4). All boring directly connected SATA thru ahcisata(4).
PR> However, I notice upon closer inspection, I did get a SATA timeout the
PR> night the corruption was noticed:
PR> 
PR> Nov 23 03:47:10 slave /netbsd: wd0a: device timeout writing fsbn 2165563424 of 2165563424-2165563455 (wd0 bn 2165563488; cn 2148376 tn 7 sn 39), retrying
PR> Nov 23 03:47:15 slave /netbsd: ahcisata0 port 0: device present, speed: 3.0Gb/s
PR> Nov 23 03:47:15 slave /netbsd: wd0: soft error (corrected)
PR> 
PR> Next is to figure out the offset of the corrupted inode and see if
PR> this is in the vicinity... (I doubt it - it's a 31 sector write, for
PR> starters).
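
The numbers in that log line can be turned into a quick range check. This is
only a rough sketch: it assumes the fsbn values are sectors relative to the
start of wd0a, that the "of A-B" range is inclusive, and that the sector
holding the suspect inode has already been worked out by other means (the
argument passed to in_failed_write() is hypothetical). Under the inclusive
assumption the range comes out at 32 sectors rather than 31.

```python
# Values copied from the wd0a timeout message.
FSBN_FIRST = 2165563424   # first sector of the failed write, relative to wd0a
FSBN_LAST = 2165563455    # last sector of the failed write (assumed inclusive)
WD0_BN = 2165563488       # sector on the whole disk (wd0) matching FSBN_FIRST

# Offset of the wd0a partition from the start of the disk, in sectors.
PART_OFFSET = WD0_BN - FSBN_FIRST

# Inclusive range, hence the +1.
SECTOR_COUNT = FSBN_LAST - FSBN_FIRST + 1


def in_failed_write(fsbn, slack=0):
    """Return True if partition-relative sector fsbn lies within
    `slack` sectors of the write that timed out."""
    return FSBN_FIRST - slack <= fsbn <= FSBN_LAST + slack


print("partition offset:", PART_OFFSET)
print("sectors in failed write:", SECTOR_COUNT)
```

If the inode's sector is nowhere near this range even with a generous slack,
the timed-out write itself is an unlikely culprit.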

MB> Also, the drive reported success for the retried write so I would expect
MB> it to be OK. 
One could expect that, yes. Though...
In my (mpt) case the problem was probably the driver not dealing with the 
timeout correctly. I suspect that the operation the driver considered timed 
out was eventually performed by the IOC anyway, but by then the memory address 
originally given to it contained different data. So garbage may have been 
written while the OS thought nothing had been written at all.
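
The suspected sequence can be illustrated with a toy model. This is not
mpt(4) code or any real driver interface; ToyController, submit_write() and
complete_late() are made-up names, and the "memory" and "disk" are plain
Python objects. It only shows why a controller that holds an address, rather
than a copy of the data, can write garbage if the buffer is reused after a
timeout that did not actually cancel the command.

```python
class ToyController:
    """Hypothetical stand-in for an IOC: it remembers an address, not data."""

    def __init__(self, memory, disk):
        self.memory = memory
        self.disk = disk
        self.pending = None

    def submit_write(self, addr, length, sector):
        # Only the location of the data is recorded here.
        self.pending = (addr, length, sector)

    def complete_late(self):
        # The transfer reads whatever is at the address *now*.
        addr, length, sector = self.pending
        self.disk[sector] = bytes(self.memory[addr:addr + length])
        self.pending = None


memory = bytearray(16)
disk = {7: b"old!"}

# The driver places the data and hands the controller its address.
memory[0:4] = b"good"
ioc = ToyController(memory, disk)
ioc.submit_write(addr=0, length=4, sector=7)

# The driver decides the command timed out but never cancels it in the
# controller; the buffer is handed back and reused for something else.
memory[0:4] = b"junk"

# The controller eventually performs the "timed out" write after all.
ioc.complete_late()
print(disk[7])
```

In this model the sector ends up holding the reused buffer's contents, while
the driver believes the write never happened.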

MB> ahcisata0 port 0: device present, speed: 3.0Gb/s
MB> means that the drive did get a soft reset. 
MB> I hope this didn't cause it to drop its cache content.
That would be another explanation.

I would consider the possibility that some deficiency in how ahcisata(4) deals 
with the timeout may have contributed to the problem.

