Subject: weird ccd problem
To: None <current-users@NetBSD.ORG>
From: Phil Knaack <flipk@idea.exnet.iastate.edu>
List: current-users
Date: 12/23/1996 15:07:04
[This is posted to current-users because I don't think it's port-i386
 specific; however, if someone thinks it is, please redirect the
 discussion there.]

Greetings:

I have an interesting problem with a ccd I've been using for about two
months now. I have two identical 200M drives on the second IDE controller
configured as a ccd using the ccd.conf below.

First, the ccd has been working wonderfully; I have pounded on it pretty
heavily since I set it up (my source tree is there, and I've built the
system from scratch a couple of times on the ccd).

But I currently have a 60M file on the ccd with a strange property:
any process which reads the file sequentially proceeds fine until
it is exactly 24166400 bytes into the file (every time, repeatably).
At this point, the process stops and is put into 'D' wait.  Any other
accesses to the ccd, as long as they're in another directory, work
fine. All accesses to the same directory as the original file are also
stopped and enter 'D' wait. Nothing can be done with the stuck
processes. They never leave 'D' wait.  Also, a 'reboot' command (after
an abnormally long period of time) finally brings the system to a 
"Syncing disks..." step, which then hangs forever, requiring a hard
reboot to bring the system the rest of the way down.
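
For what it's worth, here is a small sketch of the arithmetic on that
offset. It assumes 512-byte sectors (per the dmesg output below) and the
interleave of 32 sectors from the ccd.conf below; the variable names are
only illustrative. Note that a file offset is not a disk offset, so this
shows only how the number divides, not where the data lives on the
components.

```python
# Sketch: how the stall offset divides into sectors and ccd interleave units.
# Assumptions: 512-byte sectors and an interleave of 32 sectors (ccd.conf).
SECTOR_BYTES = 512
ILEAVE_SECTORS = 32
STALL_OFFSET = 24166400  # bytes into the file where the reader hangs

stripe_bytes = SECTOR_BYTES * ILEAVE_SECTORS  # bytes per interleave unit

# Whole units and remainder for each granularity.
sectors, sector_rem = divmod(STALL_OFFSET, SECTOR_BYTES)
stripes, stripe_rem = divmod(STALL_OFFSET, stripe_bytes)

print(f"interleave unit: {stripe_bytes} bytes")
print(f"sectors: {sectors} remainder {sector_rem}")
print(f"interleave units: {stripes} remainder {stripe_rem}")
```

The offset comes out as a whole number of interleave units, which may or
may not be a coincidence.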

As I said, this behavior is perfectly repeatable. Also, "fsck -f"
reports no problems with the partition at all, and with the ccd
unconfigured, the simultaneous commands

	% dd if=/dev/wd1e of=/dev/null &
	% dd if=/dev/wd2e of=/dev/null &

produce no errors and a very fine transfer rate, so there must not be
any problems with the surfaces of the disks.

The contents of the file are not horribly important; it is a backup of
the disk of a win95 box which has another backup elsewhere. However,
if this situation is easily explained, I'd like to know the
explanation. If it's truly a previously undiscovered bug, what steps
would people recommend I take to gather more information?
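
One thing I could try myself is a reader that logs its offset before
every read, so the last line printed shows where the hang occurs. The
following is only a sketch: the function name and the 16384-byte chunk
size (one interleave unit, assuming 512-byte sectors) are my own choices,
not anything ccd requires.

```python
import sys


def read_with_progress(path, chunk_bytes=16384, log=sys.stderr):
    """Read a file sequentially, logging the offset before each read.

    Returns the total number of bytes read. If a read blocks forever,
    the last logged line gives the offset at which it blocked.
    """
    offset = 0
    # buffering=0 makes each f.read() a direct read at the logged offset.
    with open(path, "rb", buffering=0) as f:
        while True:
            print(f"reading at offset {offset}", file=log, flush=True)
            chunk = f.read(chunk_bytes)
            if not chunk:  # empty result means end of file
                return offset
            offset += len(chunk)


if __name__ == "__main__" and len(sys.argv) > 1:
    total = read_with_progress(sys.argv[1])
    print(f"read {total} bytes without blocking")
```

Run against the problem file, the last "reading at offset" line before
the process enters 'D' wait would be the interesting one.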

Cheers,
Phil

	% ls -l    [the original on wd1/wd2 ccd; yes, it is mode 0]
	----------  1 flipk  wheel  59803585 Nov 30 15:49 idea1.tar.gz

	% ls -l    [the copy I was trying to produce on wd0; also mode 0]
	----------  1 flipk  wheel  24166400 Dec 21 21:44 idea1.tar.gz

	% cat /etc/ccd.conf
	# ccd           ileave  flags   component devices
	ccd0            32      0       /dev/wd1e /dev/wd2e

	% dmesg | egrep 'wd[012]'
	wd0 at wdc0 drive 0: 2014MB, 4092 cyl, 16 head, 63 sec,
		512 bytes/sec <WDC AC22100H>
	wd0: using 16-sector 16-bit pio transfers, lba addressing
	wd1 at wdc1 drive 0: 202MB, 989 cyl, 12 head, 35 sec,
		512 bytes/sec <WDC AC2200F>
	wd1: using 8-sector 16-bit pio transfers, chs addressing
	wd2 at wdc1 drive 1: 202MB, 989 cyl, 12 head, 35 sec,
		512 bytes/sec <WDC AC2200F>
	wd2: using 8-sector 16-bit pio transfers, chs addressing


--
Phillip F Knaack
Systems Administrator, Information Development for Extension Audiences (IDEA)
Iowa State University Extension