Subject: Re: ESP SCSI controller errors?
To: None <eeh@netbsd.org>
From: Havard Eidnes <he@netbsd.org>
List: port-sparc
Date: 01/27/2002 20:45:48
> Given that, it sounds like this is "drive rot" rather than "driver rot",
> but it would be much more interesting to learn what status the drive
> is returning in this circumstance. It may be possible to work around
> it in the driver.
Hm, OK. We may have to put the "drive upgrade" plan into action.
However, if you want to have us test a driver tweak to get something
more intelligible out of the driver I'd be interested in doing that.
> So exactly how long ago did this problem start? And what's the date
> of the kernel sources?
The date of the kernel sources we run at the moment is January 17 2002.
As to when it started occurring, it's a little more difficult to tell.
Our rtty setup only keeps a month's worth of console logs. The first
entry I find is from Jan 8, which says:
esp0: !TC on DATA XFER [intr 10, stat 83, step 4] prevphase 1, resid 3c00
esp0: !TC on DATA XFER [intr 10, stat 83, step 4] prevphase 1, resid 400
trap type 0x7: pc=0xf01b3484 npc=0xf01b3488 psr=110000c0<S,PS>
kernel: alignment fault trap
Stopped in pid 17254 (find) at ufs_lookup+0x2dc: lduh [%l0 + 0x4], %o2
db>
At the time it was running a kernel from Jan 6 sources.
On a later boot, it said:
esp0: !TC on DATA XFER [intr 10, stat 83, step 4] prevphase 1, resid 2000
bad block -939524096, ino 384384
bad block -2013259840, ino 384384
Jan 9 04:03:04 anker /netbsd: uid 16777216 comm find on /home: bad block
bad block -922746880, ino 384385
bad block 1744836527, ino 384385
bad block -905969664, ino 384386
bad block 1073747887, ino 384386
Jan 9 04:03:35 anker last message repeated 4 times
bad block -889192448, ino 384387
bad block -1879042129, ino 384387
bad block -872415232, ino 384388
bad block -1342171217, ino 384388
bad block -855638016, ino 384389
bad block 134223791, ino 384389
Jan 9 04:04:08 anker last message repeated 7 times
(didn't crash on that run), and on yet a later boot it said
esp0: !TC on DATA XFER [intr 10, stat 83, step 4] prevphase 1, resid 400
trap type 0x7: pc=0xf01b3484 npc=0xf01b3488 psr=110000c1<S,PS>
kernel: alignment fault trap
Stopped in pid 2326 (find) at ufs_lookup+0x2dc: lduh [%l0 + 0x4], %o2
db>
It was after the latter I ended up with
PARTIALLY ALLOCATED INODE I=384390
/dev/rsd1a: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
and a number of files and directories in lost+found.
Regards,
- Håvard