Subject: Bad block 4194304
To: None <tech-kern@NetBSD.ORG>
From: Mark Brinicombe <amb@physig4.ph.kcl.ac.uk>
List: tech-kern
Date: 10/30/1996 22:14:10
Hi,
  Ok I have a feeling that this is a question that has been raised in the dim
distant past but I cannot remember the answer / solution or find it in the mail
archives.

Problem: NetBSD/arm32 frequently suffers from trashed inodes. fsck will
complain about the trashed inodes and remove them.
Linked with this is the kernel message "bad block 4194304", which pops up
now and then.

Closer inspection shows that the inode i_din->di_ib[1] field is being trashed
with the value 0x00400000 (4194304).
(It is always this field and always that value.)

This trashed value is being written back to disc. Everything is fine and the
file is accessible, since the length of the files involved so far has meant
that this field was not actually used for block access.
Thus you are unaware of the problem until a fsck is forced for some reason,
then you find loads of corrupted inodes etc.
The bad block message only appears when you, say, truncate a corrupted inode.
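
To put a number on "not actually used": in FFS, di_ib[1] is the double
indirect pointer, so it only comes into play once a file outgrows the 12
direct blocks plus the single indirect block. The sketch below works out
that threshold. It assumes 32-bit disc addresses, and the block sizes fed
to it are examples, not the values of the filesystem in question; the
function names are made up for illustration.

```c
#include <stdint.h>

#define NDADDR 12   /* direct block pointers per FFS inode */

/*
 * Number of block pointers held by one indirect block.
 * Assumes 32-bit disc addresses (an assumption of this sketch).
 */
static uint64_t
ptrs_per_block(uint32_t bsize)
{
	return bsize / sizeof(int32_t);
}

/*
 * Largest file length, in bytes, that can be mapped using only the
 * direct pointers and the single indirect pointer di_ib[0], i.e.
 * without ever looking at di_ib[1].
 */
static uint64_t
max_size_without_double_indirect(uint32_t bsize)
{
	return ((uint64_t)NDADDR + ptrs_per_block(bsize)) * bsize;
}
```

With 8K blocks this comes to (12 + 2048) * 8192 = 16875520 bytes, a little
over 16MB, so any file shorter than that never reads di_ib[1] and the
corruption goes unnoticed.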

The rate of corruption is dependent on the amount of disc activity on the
machine etc.

Note: 0x00400000 is not going to be a valid disc address on the filesystem in
question.

Now initially I just put this down to an MD bug that I needed to trace but then
I recalled seeing some bad block error postings a long time ago (a year ?)
and have a feeling that they were for the same block number, which suggested
that the problem may not be arm32 specific, just more common on this platform.

[similar to the vnode bug fixed with vfs_bio.c rev 1.44 - The arm32 port had
far more problems from this bug than any other port, for a while I thought it
was an arm32 specific bug]

So, does anyone else have problems with bad blocks occurring on a strange block
number ? Have there been postings in the past about bad block numbers that are
not valid for the disc, or is it my bad memory ?

Anyone have any ideas how I could track down the problem ?

My current temporary fix is a hack in ffs_inode.c to catch these trashed inodes
and patch the indirect address before writing it to disc.
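
The hack itself is not shown here. A check of that general kind might look
roughly like the sketch below, which is a guess at the shape rather than the
actual change: struct mini_dinode is a cut-down stand-in for the on-disc
inode, scrub_double_indirect() and its parameters are hypothetical names,
32-bit disc addresses are assumed, and "patch" is taken to mean zeroing the
pointer when the file is too short to need it.

```c
#include <stdint.h>

#define NDADDR      12          /* direct block pointers per FFS inode */
#define NIADDR      3           /* indirect block pointers per FFS inode */
#define TRASH_VALUE 0x00400000  /* the value seen in di_ib[1] */

/* Cut-down stand-in for the on-disc inode (hypothetical). */
struct mini_dinode {
	uint64_t di_size;           /* file length in bytes */
	int32_t  di_db[NDADDR];     /* direct block addresses */
	int32_t  di_ib[NIADDR];     /* indirect block addresses */
};

/*
 * Call just before the inode is written back.
 * bsize is the filesystem block size, fs_size the filesystem size in
 * the same units as the disc addresses.
 * Returns 1 if di_ib[1] was patched, 0 if it was left alone.
 */
static int
scrub_double_indirect(struct mini_dinode *dp, uint32_t bsize, int32_t fs_size)
{
	/* Longest file that never uses the double indirect pointer. */
	uint64_t limit = ((uint64_t)NDADDR + bsize / sizeof(int32_t)) * bsize;

	if (dp->di_ib[1] != TRASH_VALUE)
		return 0;       /* not the known bad value */
	if ((uint32_t)dp->di_ib[1] < (uint32_t)fs_size)
		return 0;       /* could be a genuine address on this disc */
	if (dp->di_size > limit)
		return 0;       /* file really needs di_ib[1]; do not guess */

	dp->di_ib[1] = 0;   /* unused pointer, so clear it */
	return 1;
}
```

This only hides the symptom: it keeps the bad value off the disc but says
nothing about what is writing 0x00400000 into the in-core inode.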

Cheers,
				Mark



-- 
Mark Brinicombe				amb@physig.ph.kcl.ac.uk
Research Associate			http://www.ph.kcl.ac.uk/~amb/
Department of Physics			tel: 0171 873 2894
King's College London			fax: 0171 873 2716