Subject: possible bug in disk driver raw device interface?
To: NetBSD/i386 Discussion List <port-i386@netbsd.org>
From: Greg A. Woods <woods@most.weird.com>
List: port-i386
Date: 01/22/1999 18:21:28
This message may belong on current-users, but I've only been able to
test it on a 1.3.3 i386 machine to date.

For the first time ever with NetBSD I've done some >4GB disk copies
using dd, and they have not worked -- I've had to do them in chunks (I
just re-started with the closest overlapping NetBSD partition, though
now that I've done further analysis it seems I could have done OK by
simply doing sector-by-sector copies of the last hunk).
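
The "sector-by-sector copies of the last hunk" workaround can be sketched in userland roughly as follows. This is only an illustration in Python, not anything dd itself does: the function name copy_with_sector_fallback is hypothetical, and it assumes the source fails with EINVAL when a large read would cross the end of the device but returns an empty read once positioned exactly at the end.

```python
import errno

SECTOR = 512  # bytes per sector, as reported for these drives


def copy_with_sector_fallback(src, dst, block_size=64 * 1024):
    """Copy src to dst in large blocks, dropping to single-sector reads
    the first time a large read fails with EINVAL (assumed to mean the
    request ran past the end of the device)."""
    copied = 0
    size = block_size
    while True:
        pos = src.tell()
        try:
            chunk = src.read(size)
        except OSError as exc:
            # Re-raise anything unexpected, or a failure that persists
            # even at single-sector granularity.
            if exc.errno != errno.EINVAL or size == SECTOR:
                raise
            src.seek(pos)   # retry from the same offset
            size = SECTOR   # finish the tail one sector at a time
            continue
        if not chunk:       # end of device reached
            return copied
        dst.write(chunk)
        copied += len(chunk)
```

With real raw devices both files would presumably be opened unbuffered (buffering=0) so that each read() maps onto one request to the driver.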

I suspect what's happening is that even though I'm using the character
device, the driver is enforcing blocking at the size of the read and
returning an error instead of just returning the remaining bytes, as it
probably should.  At least it's my expectation that a read on a raw
device should return the remaining bytes of a device even if the
requested read is larger.  This seems to be the case for floppies,
tapes, and other devices, at least for sequential reads, and I expect no
difference for raw fixed media.
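
The expectation described above (a short read at the end of the media rather than an error) amounts to clamping each request to what remains. A minimal sketch of that rule, with a hypothetical helper name clamp_read and not taken from any NetBSD driver source:

```python
def clamp_read(offset, length, device_size):
    """Return how many bytes a read of `length` bytes at `offset` should
    yield on a device of `device_size` bytes, under the expectation that
    a request crossing the end is shortened instead of rejected."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if offset > device_size:
        # Only a request that starts beyond the end is treated as an error.
        raise ValueError("offset is past the end of the device")
    return min(length, device_size - offset)
```

For example, on a 1000-byte device a 64-byte request at offset 960 would be shortened to 40 bytes, and a request at offset 1000 would return 0 bytes, i.e. a normal end-of-file.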

The host adapter is an AIC-7880 chip on the motherboard:

	ahc0 at pci1 dev 4 function 0
	ahc0: interrupting at irq 14
	ahc0: Using left over BIOS settings
	ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
	scsibus0 at ahc0 channel 0: 16 targets

Here's a classic example of trying to copy a pair of 9GB (IBM DGHS09U)
drives where it fails about 80 sectors from the end:

	# dd if=/dev/rsd0d of=/dev/rsd1d bs=64k 
	dd: /dev/rsd0d: Invalid argument
	139970+0 records in
	139970+0 records out
	583139328 bytes transferred in 2056 secs (283628 bytes/sec)

(of course the stats above are meaningless because dd doesn't use a
64-bit counter for bytes and rate calculations)
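
The byte count dd printed is consistent with a counter that wraps at 32 bits. The arithmetic below is only a check of the figures quoted above; that the counter is an unsigned 32-bit value is an assumption, since all that is known is that it is not 64 bits wide.

```python
BLOCK = 64 * 1024        # bs=64k
RECORDS = 139970         # full records reported by dd
ELAPSED = 2056           # seconds reported by dd

true_bytes = RECORDS * BLOCK
wrapped = true_bytes % 2**32         # assumed 32-bit unsigned counter

print(true_bytes)                    # 9173073920
print(wrapped)                       # 583139328, the figure dd printed
print(wrapped // ELAPSED)            # 283628, the rate dd printed
print(true_bytes // ELAPSED)         # 4461611, the rate actually achieved
```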

The drives for the above are probed as:

	ahc0: target 0 using 16Bit transfers
	ahc0: target 0 synchronous at 10.0MHz, offset = 0x8
	ahc0: target 0 Tagged Queuing Device
	sd0 at scsibus0 targ 0 lun 0: <IBM, DGHS09U, 0350> SCSI3 0/direct fixed
	sd0: 8748MB, 8152 cyl, 10 head, 219 sec, 512 bytes/sect x 17916240 sectors
	ahc0: target 1 using 16Bit transfers
	ahc0: target 1 synchronous at 10.0MHz, offset = 0x8
	ahc0: target 1 Tagged Queuing Device
	sd1 at scsibus0 targ 1 lun 0: <IBM, DGHS09U, 0350> SCSI3 0/direct fixed
	sd1: 8748MB, 8152 cyl, 10 head, 219 sec, 512 bytes/sect x 17916240 sectors

The console displays the following SCSI error when dd croaks:

	sd0(ahc0:0:0):  Check Condition on opcode 0x28
	    SENSE KEY:  Illegal Request
	   INFO FIELD:  17916240
	     ASC/ASCQ:  Logical Block Address Out of Range

Similarly, when copying a pair of 4.5GB drives (IBM DDRS-34560W), dd
failed after 4357 1m blocks or 69726 64k blocks (presumably a mere 72
sectors from the end):

	sd0 at scsibus0 targ 3 lun 0: <IBM, DDRS-34560W, S71D> SCSI2 0/direct fixed
	sd0: 4357MB, 8387 cyl, 5 head, 212 sec, 512 bytes/sect x 8925000 sectors

	sd0(ahc0:0:0):  Check Condition on opcode 0x28
	    SENSE KEY:  Illegal Request
	     ASC/ASCQ:  Logical Block Address Out of Range
	         SKSV:  Error in CDB, Offset 2.

(I don't remember seeing an "INFO FIELD" in that one....)
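
The "80 sectors" and "72 sectors" figures follow directly from the probed sector counts, since neither drive's size is a multiple of 64k. A small check, using a hypothetical helper tail_after_full_blocks and assuming the d partition spans the whole disk from sector 0:

```python
SECTOR = 512  # bytes per sector, from the probe messages


def tail_after_full_blocks(total_sectors, block_size):
    """Return (full_blocks, leftover_sectors) for a device of
    `total_sectors` sectors copied in `block_size`-byte blocks."""
    full_blocks, leftover_bytes = divmod(total_sectors * SECTOR, block_size)
    return full_blocks, leftover_bytes // SECTOR


print(tail_after_full_blocks(17916240, 64 * 1024))    # (139970, 80)
print(tail_after_full_blocks(8925000, 64 * 1024))     # (69726, 72)
print(tail_after_full_blocks(8925000, 1024 * 1024))   # (4357, 1864)
```

In each case the number of full blocks matches the point where dd stopped, so the failing request is the first one that would have crossed the end of the media. The INFO FIELD value in the first error, 17916240, is also exactly the sector count of that drive.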

The SCSI driver probably shouldn't even complain about an out-of-range
read if the raw device is only requesting the next sequential read and
hasn't used lseek() to try to wander out past the physical end of the
media.

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>