current-users: Re: Soekris Net4801 vs. -current

Subject: Re: Soekris Net4801 vs. -current
To: None <current-users@NetBSD.org>
From: Alan Barrett <apb@cequrux.com>
List: current-users
Date: 10/20/2004 13:07:54

On Wed, 20 Oct 2004, Peter Seebach wrote:
> Anyone seen this?

> geodeide0:0: lost interrupt
>         type: ata tc_bcount: 512 tc_skip: 0
> geodeide0:0:0: bus-master DMA error: missing interrupt, status=0x21
> wd0d: DMA error reading fsbn 0 (wd0 bn 0; cn 0 tn 0 sn 0), retrying
> geodeide0:0: lost interrupt
>         type: ata tc_bcount: 512 tc_skip: 0
> geodeide0:0:0: bus-master DMA error: missing interrupt, status=0x1
> wd0d: DMA error reading fsbn 0 (wd0 bn 0; cn 0 tn 0 sn 0), retrying
> geodeide0:0: lost interrupt
>         type: ata tc_bcount: 512 tc_skip: 0
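
For reading the status values in the quoted messages, here is a minimal
sketch. It assumes the "status=" field printed with "bus-master DMA
error" is the PCI bus-master IDE status register rather than the drive's
own ATA status register. The bit names are illustrative labels, not
NetBSD identifiers.

```python
# Bit layout of the bus-master IDE status register (assumed to be what
# the "status=" field in the message reports).
BMIDE_STATUS_BITS = {
    0x01: "ACTIVE",        # bus-master transfer still in progress
    0x02: "ERROR",         # DMA transfer error
    0x04: "INTERRUPT",     # drive interrupt was seen by the controller
    0x20: "DRV0_DMA_CAP",  # drive 0 marked DMA capable
    0x40: "DRV1_DMA_CAP",  # drive 1 marked DMA capable
    0x80: "SIMPLEX",       # only one channel may do DMA at a time
}


def decode_bmide_status(value):
    """Return the names of the bits set in a bus-master IDE status byte."""
    return [name for bit, name in BMIDE_STATUS_BITS.items() if value & bit]


# The two values seen in the quoted log.
for raw in (0x21, 0x01):
    print(f"status={raw:#04x}: {decode_bmide_status(raw)}")
```

Under that assumption, both values have the ACTIVE bit set and the
INTERRUPT bit clear, which would be consistent with the "missing
interrupt" wording.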

Not exactly, but I get similar errors.  After I replaced the disk drive
in my laptop, I started getting messages like this:

piixide0 at pci0 dev 31 function 1
piixide0: Intel 82801DBM IDE Controller (ICH4-M) (rev. 0x01)
piixide0: bus-master DMA support present
piixide0: primary channel wired to compatibility mode
piixide0: primary channel interrupting at irq 14
atabus0 at piixide0 channel 0
piixide0: secondary channel wired to compatibility mode
piixide0: secondary channel interrupting at irq 15
atabus1 at piixide0 channel 1
[...]
wd0 at atabus0 drive 0: <FUJITSU MHT2060AT>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 57231 MB, 116280 cyl, 16 head, 63 sec, 512 bytes/sect x 117210240 sectors
rnd: wd0 attached as an entropy source (collecting and estimating)
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
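
As a sanity check that the new drive is probed consistently, the
geometry on the wd0 line can be multiplied out. This is only a sketch
of the arithmetic; it assumes "MB" in the probe line means 2^20 bytes,
rounded down.

```python
# Values copied from the wd0 probe line.
cylinders, heads, sectors_per_track = 116280, 16, 63
bytes_per_sector = 512
reported_sectors = 117210240

# Sector count implied by the cylinder/head/sector geometry.
chs_sectors = cylinders * heads * sectors_per_track

# Capacity in MB, assuming 1 MB = 1024 * 1024 bytes and truncation.
size_mb = reported_sectors * bytes_per_sector // (1024 * 1024)

print(chs_sectors == reported_sectors)
print(size_mb)
```

Both figures agree with the probe line (117210240 sectors, 57231 MB), so
nothing looks wrong with how the drive size is detected.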

[init runs /etc/rc.  Heavily customised /etc/rc does various
mount/cgdconfig/fsck operations.]

piixide0:0: lost interrupt
        type: ata tc_bcount: 0 tc_skip: 0
piixide0:0: lost interrupt
        type: ata tc_bcount: 16384 tc_skip: 0
piixide0:0:0: intr with DRQ (st=0x58)
wd0e: device timeout writing fsbn 745600 of 745600-745631 (wd0 bn 35682930; cn 17423 tn 19 sn 18), retrying
wd0: soft error (corrected)
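
To read the "st=0x58" value, here is a minimal sketch that decodes it
as an ATA status register byte. That "st" is the ATA status register is
an assumption based on the "intr with DRQ" wording. The bit names follow
the usual ATA abbreviations and are not taken from the kernel source.

```python
# ATA status register bits, most significant first.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault (write fault on older drives)
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request: device wants to transfer data
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error, details in the error register
}


def decode_ata_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if value & bit]


print(decode_ata_status(0x58))
```

Read this way, 0x58 has DRDY, DSC and DRQ set, with BSY and ERR clear:
the drive was still asking to transfer data when the interrupt arrived,
and was not reporting an error of its own.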

It seems to happen exactly once per boot, around the time that the
relevant partition is accessed by cgdconfig/fsck/mount.  After the
error, everything works fine.  (In case it makes a difference, wd0e is
the backing store for cgd1, cgd1 does not have a disklabel, and cgd1d
is an FFS file system.)  I am not sure whether the error always happens
with the same sectors or at exactly the same time.  The old disk did not
do this, and there were no kernel changes at the same time as the disk
change.

--apb (Alan Barrett)