Subject: Re: Dealing with bad blocks on fixed disks
To: Darren Reed <darrenr@reed.wattle.id.au>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 04/15/2003 22:44:55
Sorry to come on this so late, I just noticed a probable bug looking at
the b/bp typo

On Sun, Apr 13, 2003 at 10:49:01AM +1000, Darren Reed wrote:
> Index: dev/ata/wd.c
> ===================================================================
> RCS file: /cvsroot/src/sys/dev/ata/wd.c,v
> retrieving revision 1.240
> diff -c -r1.240 wd.c
> *** dev/ata/wd.c	2003/04/03 22:18:23	1.240
> --- dev/ata/wd.c	2003/04/13 00:29:08
> ***************
> *** 288,293 ****
> --- 288,294 ----
>   #else
>   	bufq_alloc(&wd->sc_q, BUFQ_DISKSORT|BUFQ_SORT_RAWBLOCK);
>   #endif
> + 	SLIST_INIT(&wd->sc_bslist);
>   
>   	wd->atabus = adev->adev_bustype;
>   	wd->openings = adev->adev_openings;
> ***************
> *** 423,428 ****
> --- 424,436 ----
>   	struct buf *bp;
>   	int s, bmaj, cmaj, i, mn;
>   
> + 	/* Clean out the bad sector list */
> + 	while (!SLIST_EMPTY(&sc->sc_bslist)) {
> + 		void *head = SLIST_FIRST(&sc->sc_bslist);
> + 		SLIST_REMOVE_HEAD(&sc->sc_bslist, dbs_next);
> + 		free(head, M_TEMP);
> + 	}
> + 
>   	/* locate the major number */
>   	bmaj = bdevsw_lookup_major(&wd_bdevsw);
>   	cmaj = cdevsw_lookup_major(&wd_cdevsw);
> ***************
> *** 525,530 ****
> --- 533,555 ----
>   
>   	bp->b_rawblkno = blkno;
>   
> + 	/*
> + 	 * If the transfer about to be attempted contains only a block that
> + 	 * is known to be bad then return an error for the transfer without
> + 	 * even attempting to start a transfer up under the premis that we
> + 	 * will just end up doing more retries for a transfer that will end
> + 	 * up failing again.
> + 	 * XXX:SMP - mutex required to protect with DIOCBSFLUSH
> + 	 */
> + 	if (__predict_false(!SLIST_EMPTY(&wd->sc_bslist))) {
> + 		struct disk_badsectors *dbs;
> + 		daddr_t maxblk = blkno + bp->b_bcount;
> + 
> + 		SLIST_FOREACH(dbs, &wd->sc_bslist, dbs_next)
> + 			if (dbs->dbs_min >= blkno && dbs->dbs_max < maxblk)
> + 				goto bad;
> + 	}
> + 

You probably want to add
	bp->b_error = EIO;

before the 'goto bad' here. Otherwise I guess the error would be undefined.


-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 24 ans d'experience feront toujours la difference
--