Netra X1 hardware flakiness

[- This continues a thread in port-sparc64, where I am having problems with a Netra X1, but has now fallen to an IDE disk question, so I'm adding current-users... -]

On May 9, 2010, at 11:25, Chris Ross wrote:
That's my guess as well. I'm guessing either RAM or disk is having issues. Given the lack of disk errors, it may be RAM. I hate trying to track down bad RAM, but if I can get it failing consistently enough, even if in different ways each time, trial and error should work it out in short enough order, I guess.

On this "flaky hardware" front: while chasing down an oddity where rm -rf, even as root, refused to delete directories that it claimed weren't empty, I saw the following crash on my Netra X1. The drive in question is:

wd0 at atabus0 drive 0: <ST340824A>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 38166 MB, 77545 cyl, 16 head, 63 sec, 512 bytes/sect x 78165360 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)

Looking at the crash below, the disk is clearly an issue and worthy of analysis. Not being much of an IDE disk person, I have to ask: is it possible to effectively "format" an IDE disk, to reinitialize the bad sector list? This is often what I would do to determine whether a SCSI drive is fit for reuse. Is there any possibility of this with an IDE disk drive?

                                  - Chris

# rm -rf audio/
rm: audio/gtick/patches/CVS: Directory not empty
rm: audio/gtick/patches: Directory not empty
rm: audio/gtick: Directory not empty
rm: audio: Directory not empty
# cd audio/gtick/patches/CVS/
# ls -la
bad block 1188950361018163200, ino 54368
bad block 1111680576, ino 54368
bad block 1111680704, ino 54368
bad block 1111680960, ino 54368
bad block 1111681088, ino 54368
bad block 1111682112, ino 54368
bad block 1102323904, ino 54368
bad block 1111731392, ino 54368
bad block 1112788248, ino 54368
bad block 1112787960, ino 54368
bad block 1112788104, ino 54368
dev = 0xc07, bno = 545460846593 bsize = 16384, size = 16384, fs = /data
panic: blkfree: bad size
Begin traceback...
End traceback...
Frame pointer is at 0xcce6a51
Call traceback:
14680f0(11, 5, 0, 0, 1859400, 0, cce6b21) fp = cce6b21
132a860(104, 0, ffff, 16c49b1, 132a5a0, 0, cce6be1) fp = cce6be1
126f4c0(167dee0, c07, 7f00000001, 4000, 4000, 104, cce6cb1) fp = cce6cb1
1274730(2803000, b9b65f0, 7f00000001, 4000, d460, cce7650, cce6d81) fp = cce6d81
1275654(800, ffffffffffed17f4, 7f00000001, ffffffffffffffff, 1, 140, cce6e71) fp = cce6e71
12a0798(16, 0, 1, 1, cce77f0, 0, cce7001) fp = cce7001
1371138(0, 12, cd09400, 1, ffffffffffffffff, 0, cce70c1) fp = cce70c1
1364950(10113270, cce7a5f, cc53610, 0, 0, e0018000, cce71a1) fp = cce71a1
1367884(10113270, 0, cce7b78, cce7b28, 1, 0, cce7271) fp = cce7271
13678b0(9, 14, cce7c78, ffffffffffffb548, 4020a400, 0, cce73c1) fp = cce73c1
146f698(cd09400, cce7dc0, cce7e00, 1, 409213e0, 1820eb8, cce7511) fp = cce7511
1008c68(cce7ed0, cce7f50, 40740698, 4074069c, 0, cce7dc0, cce7621) fp = cce7621
407cf6f4(40a0c1b8, 40a0c1c0, 0, fefefefefefefeff, ffffffffffffffff, 0, ffffffffffffaee1) fp = ffffffffffffaee1

dumping to dev 12,1 offset 737535
   227 M wddump: DMA error
- device not ready
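
As a rough sanity check on the log above, the numbers in the "bad block" and panic messages can be compared against the drive size from the wd0 probe lines. This is only a minimal Python sketch. It assumes the reported block numbers are in units no smaller than one 512-byte sector; the real unit (probably filesystem fragments) is not shown in the output, and the variable names are made up for illustration.

```python
# Figures copied from the wd0 probe lines and the console log above.
SECTOR_BYTES = 512
TOTAL_SECTORS = 78165360
CYLINDERS, HEADS, SECTORS_PER_TRACK = 77545, 16, 63

# The CHS geometry should multiply out to the reported sector count.
chs_sectors = CYLINDERS * HEADS * SECTORS_PER_TRACK

# Capacity in units of 2**20 bytes, the unit behind the "38166 MB" figure.
capacity_mb = TOTAL_SECTORS * SECTOR_BYTES // 2**20

# Block numbers from the "bad block" lines, plus bno from the panic line.
reported_blocks = [
    1188950361018163200, 1111680576, 1111680704, 1111680960,
    1111681088, 1111682112, 1102323904, 1111731392,
    1112788248, 1112787960, 1112788104,
    545460846593,
]

# Assumption: one block unit is at least one sector, so any number at or
# above the whole-disk sector count cannot refer to a real block.
out_of_range = [b for b in reported_blocks if b >= TOTAL_SECTORS]

print("CHS sectors match probe:", chs_sectors == TOTAL_SECTORS)
print("capacity in MB:", capacity_mb)
print("out of range:", len(out_of_range), "of", len(reported_blocks))
```

Under that assumption, every reported number lies beyond the end of the disk, which suggests the inode's block pointers hold garbage rather than the addresses of real blocks. It does not by itself say whether the disk or the RAM is at fault.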

