Subject: disk translation issues, "CANNOT READ: BLK 16386"
To: None <port-cobalt@netbsd.org>
From: Jamie Heilman <jamie@audible.transient.net>
List: port-cobalt
Date: 06/23/2001 19:14:17
So I've been fighting with NetBSD on a couple of Raq1 machines now for a
while, and haven't really gotten anywhere with it.  Two of the machines are
stable and will run without problems provided I don't try to fsck the
ext2fs partitions or run fdisk; if I do, they panic (see the archives for
more).  I'm in the process of bringing up a third machine and I keep having this
really annoying issue with disk geometry and fsck_ext2fs.

It's entirely possible I'm doing something inane because I just don't
fully understand all the issues around LBA and ATA restrictions, but here's
the meat of my problem.  Upon boot I invariably get:

CANNOT READ: BLK 16386
/dev/rwd0e: UNEXPECTED INCONSISTENCY; RUN fsck_ext2fs MANUALLY.
THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY:
        ext2fs: /dev/rwd0e (/stand)

Then I have to start up sh and exit because trying to fsck that partition
is a waste of time.[1]  This obviously won't work in a production setting.
I can always just tell it not to check that partition on boot if I have
to, but I'd rather get to the bottom of this.  The filesystem is clean; I
can fsck it from linux without any problems.  Yes, I created it with -O none
and -r 0.  I've used both netbsd's ext2fs utils and Linux's to create the
filesystem with the same results.  I've used fdisk in linux, fdisk in
netbsd, and cfdisk in linux to create the BIOS partition, all with the same
results.  I've run various checks on the disk looking for flaws, they all
return with no errors.  The only funky behavior I've seen so far is with
respect to the drive geometry, so that's what I've focused on trying to
tinker with to fix this problem.  For all I know they may not be related,
but it's looking like my best bet.

The disk is a Quantum Fireball EL, 10-ish gig IDE disk.  It is identified
in the dmesg as:
wd0 at pciide0 channel 0 drive 0: <QUANTUM FIREBALL EL10.2A>
wd0: drive supports 16-sector pio transfers, lba addressing
wd0: 9787 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 20044080 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2

My understanding is that the above reflects the 8GB limit from the BIOS or
from the ATA spec or wherever that braindamage originated.  The in-core
disklabel for this disk, as presented by fdisk and disklabel, reflected the
16383/16/63 geometry.  Changing the disklabel to 19885/16/63 (this actually
reflects 20044080 sectors) didn't help.  AFAIK most of these values
only matter during boot-time on PCs and once the OS has taken over it's all
moot.  Let it be known I have no problem booting this machine with a 7M
ext2fs partition at the head of this disk or at the tail.  fsck_ext2fs
however always freaks at block 16386.
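
For anyone who wants to check the geometry arithmetic above, here is a
minimal sketch.  It only uses the numbers from the dmesg output (16 heads,
63 sectors/track, 512-byte sectors, 20044080 total sectors); the function
names are made up for illustration and nothing here touches the disk.

```python
# Geometry arithmetic for the values quoted from dmesg.
# Helper names are hypothetical; they exist only for this illustration.

SECTOR_BYTES = 512        # "512 bytes/sect" from dmesg
HEADS = 16                # "16 head"
SECTORS_PER_TRACK = 63    # "63 sec"


def chs_sectors(cylinders, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Total sectors addressable with a given C/H/S geometry."""
    return cylinders * heads * spt


def cylinders_for(total_sectors, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Return (whole cylinders, leftover sectors) for a sector count."""
    return divmod(total_sectors, heads * spt)


lba_total = 20044080                  # sector count reported by the drive
reported = chs_sectors(16383)         # 16514064 sectors via 16383/16/63

print("16383/16/63 covers", reported, "sectors,",
      reported * SECTOR_BYTES, "bytes")
print("LBA total is", lba_total, "sectors,",
      lba_total * SECTOR_BYTES // 2**20, "MB")
print("cylinders needed for LBA total:", cylinders_for(lba_total))
```

The last line comes out to exactly 19885 cylinders with no sectors left
over, which is where the 19885/16/63 figure comes from, while 16383/16/63
only accounts for about 8.4 billion bytes of the disk.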

Other relevant info:  I'm currently bringing this disk to life on a qube2
with the 1.5 generic kernel, I'll transfer it over to the raq1 tonight and
try some things using the 1.5 generic-dma kernel to see if the problem
persists or gets worse (as I suspect it may).

-- 
Jamie Heilman                   http://audible.transient.net/~jamie/
"Paranoia is a disease unto itself, and may I add, the person standing
 next to you may not be who they appear to be, so take precaution."
						-Sathington Willoughby