Subject: Re: Dealing with bad blocks on fixed disks
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Jason Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 04/15/2003 13:55:10
On Tuesday, April 15, 2003, at 01:44  PM, Manuel Bouyer wrote:

There's also another bug in this patch:

> On Sun, Apr 13, 2003 at 10:49:01AM +1000, Darren Reed wrote:
>> Index: dev/ata/wd.c
>> ===================================================================
>> RCS file: /cvsroot/src/sys/dev/ata/wd.c,v
>> retrieving revision 1.240
>> diff -c -r1.240 wd.c
>> *** dev/ata/wd.c	2003/04/03 22:18:23	1.240
>> --- dev/ata/wd.c	2003/04/13 00:29:08
>> ***************
>> *** 288,293 ****
>> --- 288,294 ----
>>   #else
>>   	bufq_alloc(&wd->sc_q, BUFQ_DISKSORT|BUFQ_SORT_RAWBLOCK);
>>   #endif
>> + 	SLIST_INIT(&wd->sc_bslist);
>>
>>   	wd->atabus = adev->adev_bustype;
>>   	wd->openings = adev->adev_openings;
>> ***************
>> *** 423,428 ****
>> --- 424,436 ----
>>   	struct buf *bp;
>>   	int s, bmaj, cmaj, i, mn;
>>
>> + 	/* Clean out the bad sector list */
>> + 	while (!SLIST_EMPTY(&sc->sc_bslist)) {
>> + 		void *head = SLIST_FIRST(&sc->sc_bslist);
>> + 		SLIST_REMOVE_HEAD(&sc->sc_bslist, dbs_next);
>> + 		free(head, M_TEMP);
>> + 	}
>> +
>>   	/* locate the major number */
>>   	bmaj = bdevsw_lookup_major(&wd_bdevsw);
>>   	cmaj = cdevsw_lookup_major(&wd_cdevsw);
>> ***************
>> *** 525,530 ****
>> --- 533,555 ----
>>
>>   	bp->b_rawblkno = blkno;
>>
>> + 	/*
>> + 	 * If the transfer about to be attempted contains only a block that
>> + 	 * is known to be bad then return an error for the transfer without
>> + 	 * even attempting to start a transfer up under the premis that we
>> + 	 * will just end up doing more retries for a transfer that will end
>> + 	 * up failing again.
>> + 	 * XXX:SMP - mutex required to protect with DIOCBSFLUSH
>> + 	 */
>> + 	if (__predict_false(!SLIST_EMPTY(&wd->sc_bslist))) {
>> + 		struct disk_badsectors *dbs;
>> + 		daddr_t maxblk = blkno + bp->b_bcount;

bp->b_bcount is a byte count, not a block count.  So, the calculation 
of maxblk is incorrect.

         -- Jason R. Thorpe <thorpej@wasabisystems.com>