Subject: quirk in or uncovered by new pciide
To: None <current-users@netbsd.org>
From: Hal Murray <murray@pa.dec.com>
List: current-users
Date: 08/12/1999 00:56:07
If I read both disks at the same time on a Promise Ultra66 chip,
I get the following:

  pciide0:0:0: Bus-Master DMA error: missing interrupt, status=0x20
  wd0d: DMA error reading fsbn 83640 of 83640-83643 (wd0 bn 83640; cn 82 tn 15 sn 39), retrying
  wd0: soft error (corrected)

This is stock 1.4 on an Intel box with the pciide patches to support 
the Promise chips.  I get about one error a minute on the Ultra66 system.  
They happen on both wd0 and wd1.  (Both are masters - no slaves.)

I haven't seen any on a second system with the Ultra33 chip and slower 
disks.

I haven't seen any if I only read one drive at a time.  I haven't 
seen any troubles while reading one disk via the Ultra66 chip and 
another disk via the old/boot/normal IDE controller.  (But I haven't 
tried all that hard.)


------

I have a simple hack program that reads a whole file using open and 
read with 64K buffers.  I normally use something like /dev/wd0d for 
the filename.  It runs until it gets a short block indicating an 
end of file. 
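
The loop described above amounts to roughly the following.  This is a 
minimal sketch, not the actual program: the names read_whole and BUFSZ 
are made up for illustration, and it only assumes POSIX open/read.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUFSZ (64 * 1024)   /* 64K buffer, as in the description */

/*
 * Hypothetical helper: read `path` sequentially until a short read
 * or an error.  Stores the number of bytes read in *total.
 * Returns 0 on end of file, or the errno value from the failing
 * open() or read().
 */
int read_whole(const char *path, long long *total)
{
    static char buf[BUFSZ];
    ssize_t n;
    int fd, err = 0;

    *total = 0;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    for (;;) {
        n = read(fd, buf, sizeof buf);
        if (n < 0) {
            err = errno;            /* save before close() can change it */
            break;
        }
        *total += n;
        if ((size_t)n < sizeof buf) /* short block: treat as end of file */
            break;
    }
    close(fd);
    return err;
}
```

A caller would pass something like /dev/wd0d as the path and print 
strerror() of a nonzero return value.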

With /dev/wd0d, it gets an error.  errno is 5, "Input/output error", 
and I see the messages below in /var/log/messages.

No surprise, but the system with the Ultra33 chip does the same thing.

So does reading /dev/wd2d - the boot disk accessed via an old builtin 
vanilla IDE chip. 

On an Alpha with SCSI disks, I get an errno of 22, "Invalid argument".  
There is a bunch of stuff in /var/log/messages for SCSI errors that 
includes "Logical Block Address Out of Range". 

  Is reading /dev/wd0d a reasonable thing to do?

  Is there a missing size-check in the /dev/wd* file handler? 

  Is "downgrading" when reading too far a reasonable thing for the 
  pciide driver to do?  Is it supposed to do the size-check?

  Is there any simple way for my program to find the length of a 
  file so it can avoid this glitch?  (I'd prefer it to work on both 
  /dev/wd* and normal files but I'll special case /dev/wd* if appropriate.) 


Aug 11 23:38:20 hgm96d /netbsd: pciide0:1:0: Bus-Master DMA error: status=0x6
Aug 11 23:38:21 hgm96d /netbsd: wd1d:  aborted command reading fsbn 44150400 of 44150400-44150403 (wd1 bn 44150400; cn 43800 tn 0 sn 0), retrying
Aug 11 23:38:21 hgm96d /netbsd: pciide0:1:0: Bus-Master DMA error: status=0x6
Aug 11 23:38:21 hgm96d /netbsd: wd1: transfer error, downgrading to DMA mode 2
Aug 11 23:38:21 hgm96d /netbsd: wd1(pciide0:1:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
Aug 11 23:38:21 hgm96d /netbsd: wd1d:  aborted command reading fsbn 44150400 of 44150400-44150403 (wd1 bn 44150400; cn 43800 tn 0 sn 0), retrying
Aug 11 23:38:21 hgm96d /netbsd: pciide0:1:0: Bus-Master DMA error: status=0x6
Aug 11 23:38:21 hgm96d /netbsd: wd1: transfer error, downgrading to PIO mode 4
Aug 11 23:38:21 hgm96d /netbsd: wd1(pciide0:1:0): using PIO mode 4
Aug 11 23:38:21 hgm96d /netbsd: wd1d:  aborted command reading fsbn 44150400 of 44150400-44150403 (wd1 bn 44150400; cn 43800 tn 0 sn 0), retrying
Aug 11 23:38:22 hgm96d /netbsd: wd1d:  aborted command reading fsbn 44150400 of 44150400-44150403 (wd1 bn 44150400; cn 43800 tn 0 sn 0), retrying
....