Subject: kern/9857: wddone() omits block numbers from soft errors
To: None <gnats-bugs@gnats.netbsd.org>
From: None <jhawk@MIT.EDU>
List: netbsd-bugs
Date: 04/10/2000 16:02:15
>Number: 9857
>Category: kern
>Synopsis: wddone() omits block numbers from soft errors
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Apr 10 16:03:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: John Hawkinson
>Release: NetBSD 1.4.2
>Organization:
>Environment:
>Description:
wddone() omits block numbers from soft errors. For hard errors,
diskerr() is called and a block number is reported. For soft errors,
only a plain printf() is issued, so the block number is not included.
I presume that blkdone is also available in the NOERROR case.
>How-To-Repeat:
Inspect the code:
	case ERROR:
		/* Don't care about media change bits */
		if (wd->sc_wdc_bio.r_error != 0 &&
		    (wd->sc_wdc_bio.r_error & ~(WDCE_MC | WDCE_MCR)) == 0)
			goto noerror;
		ata_perror(wd->drvp, wd->sc_wdc_bio.r_error, errbuf);
retry:		/* Just reset and retry. Can we do more ? */
		wdc_reset_channel(wd->drvp);
		diskerr(bp, "wd", errbuf, LOG_PRINTF,
		    wd->sc_wdc_bio.blkdone, wd->sc_dk.dk_label);
		if (wd->retries++ < WDIORETRIES) {
			printf(", retrying\n");
			timeout(wdrestart, wd, RECOVERYTIME);
			return;
		}
		printf("\n");
		bp->b_flags |= B_ERROR;
		bp->b_error = EIO;
		break;
	case NOERROR:
noerror:	if ((wd->sc_wdc_bio.flags & ATA_CORR) || wd->retries > 0)
			printf("%s: soft error (corrected)\n",
			    wd->sc_dev.dv_xname);
	}
	disk_unbusy(&wd->sc_dk, (bp->b_bcount - bp->b_resid));
>Fix:
Presumably diskerr() should be called in the NOERROR case as well,
especially if someone gets around to modifying diskerr() to collect
disk error statistics centrally (a la iostat -E under Solaris). I can't
help but wonder whether there is some reason it wasn't done this way.
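
As a rough, untested illustration of the idea: in wddone() the change
would presumably amount to replacing the soft-error printf() with a
diskerr() call that passes wd->sc_wdc_bio.blkdone, mirroring the
hard-error path. The user-space sketch below only models the difference
in what gets reported. struct fake_wd, soft_error_current() and
soft_error_proposed() are hypothetical names invented for this example,
the message format is not diskerr()'s real output, and it assumes the
block of interest is the transfer's starting block plus blkdone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the few driver fields the report refers to. */
struct fake_wd {
	const char *xname;	/* device name, e.g. "wd0" */
	int retries;		/* retries needed for this transfer */
	int corrected;		/* nonzero if the drive set ATA_CORR */
	long blkno;		/* starting block of the transfer */
	int blkdone;		/* blocks completed before the error */
};

/* Models today's NOERROR path: the message carries no block number. */
static int
soft_error_current(const struct fake_wd *wd, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (!wd->corrected && wd->retries == 0)
		return 0;	/* clean transfer, nothing to report */
	return snprintf(buf, len, "%s: soft error (corrected)\n",
	    wd->xname);
}

/*
 * Models the proposed behaviour: same condition, but the block number
 * is reported too (assumed here to be blkno + blkdone).
 */
static int
soft_error_proposed(const struct fake_wd *wd, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (!wd->corrected && wd->retries == 0)
		return 0;	/* clean transfer, nothing to report */
	return snprintf(buf, len, "%s: soft error (corrected), block %ld\n",
	    wd->xname, wd->blkno + wd->blkdone);
}
```

With a transfer starting at block 1000 that needed one retry after 3
blocks, the first function yields only the device name and message,
while the second also names block 1003.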
>Release-Note:
>Audit-Trail:
>Unformatted: