Subject: Re: filesystem issues after rude powerdown
To: None <netbsd-users@NetBSD.org>
From: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
List: netbsd-users
Date: 08/03/2004 11:34:06

Hi,

On Mon, Aug 02, 2004 at 09:07:44PM -0400, Brian wrote:
>
> So, for a second time a power outage ran down my UPS and the Qube had
> its power terminated rudely.
>
> [...]
>
> This is Michigan!  and we have t-storms, and I don't really want to go
> thru this every time there's a power outage that lasts long enough to
> drain my UPS battery.

Then you should run something that listens to the UPS and shuts down the
machine cleanly when the battery is nearly empty. All a battery-powered UPS
is supposed to do is to help you survive short power failures, give you time
to start your diesel generators, or do a clean shutdown. If you insist on
surviving longer outages on battery, you need ... bigger batteries.
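
A rough sketch of what such a listener boils down to, in case it helps.
The helpers ups_power_source and ups_charge_percent are hypothetical
placeholders for whatever your UPS monitoring software provides, and the
20% threshold is only an assumption:

```shell
#!/bin/sh
# Hypothetical sketch: decide when to shut down based on UPS state.

LOW_CHARGE=20   # assumed threshold, in percent

# Prints "yes" if the machine should shut down now, "no" otherwise.
# $1 = power source ("battery" or "mains"), $2 = remaining charge in percent
should_shutdown() {
    if [ "$1" = "battery" ] && [ "$2" -le "$LOW_CHARGE" ]; then
        echo yes
    else
        echo no
    fi
}

# Polling loop; defined here but not started automatically.
watch_ups() {
    while :; do
        state=$(ups_power_source)      # hypothetical helper
        charge=$(ups_charge_percent)   # hypothetical helper
        if [ "$(should_shutdown "$state" "$charge")" = yes ]; then
            /sbin/shutdown -p now "UPS battery low"
            return
        fi
        sleep 30
    done
}
```

In practice you would rather use an existing UPS monitoring daemon than a
hand-rolled loop; the point is only that the shutdown has to be triggered
while there is still battery left.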

> -I have a few disk sectors that cannot be read (uncorrectable data
> error, anyway)- 756-768 inclusive.  These are the same ones as last
> time. I guess they really are bad- but then why are they not
> permanently mapped out after the last fsck_ffs cleared up issues?

Uhm... mapping out bad blocks is a function of modern disks (IDE as well
as SCSI). However, this might be configured off for your drive, or might
only happen when you _write_ them, as the disk cannot know what to write
into the remapped blocks when it can't read the original ones.

With SCSI, you could use the "reassign" subcommand of the scsictl(8)
program to enforce reassignment of those blocks. With IDE, we don't have
any ready tool, I think, and I don't know whether there are disk commands
to do this at all.
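
For the SCSI case, a small sketch that only prints the command for a block
range so you can review it before running anything. The device name sd0 is
an assumption (your disk here is IDE, so this is for illustration only);
check scsictl(8) for the exact syntax on your release:

```shell
# Print (do not run) a scsictl reassign command for a range of blocks.
# $1 = device (sd0 is an assumed name), $2 = first block, $3 = last block
reassign_command() {
    blocks=""
    b=$2
    while [ "$b" -le "$3" ]; do
        blocks="$blocks $b"
        b=$((b + 1))
    done
    echo "scsictl $1 reassign$blocks"
}

reassign_command sd0 756 768
```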

Assuming (check that!) that the error message was from the driver and
refers to disk block numbers (as opposed to being from the file system
and referring to filesystem sectors), you could try the following:

umount /tmp (in single user mode, obviously)

/sbin/sysctl kern.rawpartition

if it is 3:
dd bs=512 count=13 if=/dev/zero seek=756 of=/dev/rwd0d (on i386)

if it is 2:
dd bs=512 count=13 if=/dev/zero seek=756 of=/dev/rwd0c

After that you'll have to "fsck -f" the affected file system.
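
The choice of device and the sector count above can be sketched as a small
helper that prints the dd command instead of running it. The function names
are made up for illustration, and wd0 is assumed to be the affected disk:

```shell
# Map the value of kern.rawpartition to the whole-disk raw device of wd0.
raw_device() {
    case "$1" in
        2) echo /dev/rwd0c ;;
        3) echo /dev/rwd0d ;;
        *) echo "unexpected kern.rawpartition value: $1" >&2; return 1 ;;
    esac
}

# Print the dd command for review; nothing is written to the disk.
# $1 = kern.rawpartition value, $2 = first bad sector, $3 = last bad sector
zero_command() {
    dev=$(raw_device "$1") || return 1
    count=$(( $3 - $2 + 1 ))   # range is inclusive: 756..768 is 13 sectors
    echo "dd bs=512 count=$count if=/dev/zero seek=$2 of=$dev"
}

zero_command 3 756 768
```

On the real machine the first argument would come from
"/sbin/sysctl -n kern.rawpartition".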

You do this at your own risk; read the manual pages until you understand
what those commands do. In particular, as you didn't show the original
error message, I have no idea whether it really referred to disk blocks
or to filesystem sectors (which would be relative to the partition boundary,
and possibly in different units!)
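
If the numbers turn out to be relative to the partition, the conversion to
absolute disk sectors would look roughly like this. The partition offset
(63 below) and the unit size are hypothetical values; take the real ones
from disklabel(8) and from the actual error message:

```shell
# Convert a partition-relative block number into an absolute 512-byte sector.
# $1 = partition start offset in 512-byte sectors (from the disklabel)
# $2 = block number relative to the start of the partition
# $3 = size of that block unit in bytes (512 if it is already in sectors)
absolute_sector() {
    echo $(( $1 + $2 * ($3 / 512) ))
}

absolute_sector 63 756 512    # relative sector 756 in a partition at offset 63
absolute_sector 63 756 1024   # same number, but counted in 1024-byte units
```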

Assuming those are disk blocks, you can force a read of those sectors,
to see the error message again in /var/log/messages and on your console
(window), by:

dd bs=512 count=13 skip=756 if=/dev/rwd0d of=/dev/null (or /dev/rwd0c, see above).
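
To narrow down which of the sectors actually fail, one could also read them
one at a time. This is only a sketch; the function name is made up, and it
relies on dd exiting with a non-zero status when the read fails:

```shell
# Read each sector in a range on its own and report which ones fail.
# $1 = raw device (or any file), $2 = first sector, $3 = last sector
probe_sectors() {
    s=$2
    while [ "$s" -le "$3" ]; do
        if dd bs=512 count=1 skip="$s" if="$1" of=/dev/null 2>/dev/null; then
            echo "sector $s: ok"
        else
            echo "sector $s: READ ERROR"
        fi
        s=$((s + 1))
    done
}
```

For example, "probe_sectors /dev/rwd0d 756 768" (or /dev/rwd0c, depending on
kern.rawpartition). This only reads, so it is safe to run before deciding
whether to overwrite anything.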

Hope to have helped.
	Ignatios Souvatzis
