Subject: Re: Dealing with bad blocks on fixed disks
To: Jason Thorpe <thorpej@wasabisystems.com>
From: Darren Reed <darrenr@reed.wattle.id.au>
List: tech-kern
Date: 04/16/2003 10:04:43
In some email I received from Jason Thorpe, sie wrote:
> 
> bp->b_bcount is a byte count, not a block count.  So, the calculation 
> of maxblk is incorrect.

Yes, I found that out and corrected it.
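
To illustrate the byte-versus-block point, here is a minimal sketch of how
the last block of a failed transfer could be derived from a byte count.  It
is not the code from the patch: sketch_last_block() and SKETCH_DEV_BSIZE are
hypothetical names, and the 512-byte block size is an assumption (the kernel
takes DEV_BSIZE from <sys/param.h>).

```c
#include <stdint.h>

/* Assumed block size for this sketch; the kernel uses DEV_BSIZE. */
#define SKETCH_DEV_BSIZE 512

/*
 * Hypothetical helper: return the last block touched by a transfer that
 * starts at block blkno and covers bcount bytes (bcount > 0).  The byte
 * count is rounded up to whole blocks before it is added to blkno.
 */
static int64_t
sketch_last_block(int64_t blkno, int64_t bcount)
{
	int64_t nblks = (bcount + SKETCH_DEV_BSIZE - 1) / SKETCH_DEV_BSIZE;

	return blkno + nblks - 1;
}
```

Adding the byte count directly to the starting block number would overstate
the range by a factor of the block size, which is the error pointed out
above.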

My last remaining thought on this "feature" is that when an error occurs
on a transfer with b_bcount > DEV_BSIZE, it would be nice if the kernel
could schedule a read of each block in that transfer by itself and use the
results to populate the "bad sector" list.  If this were possible, the
list would only ever contain blocks known to be problematic, without
implicating the others that happened to be fetched in the same range.

For example, at present, if the kernel tries to read, say, 100 blocks in
a single buf struct and #42 is bad, then all 100 are considered "bad".
Instead, I'd like the kernel to retry the operation for each of the 100
blocks separately before putting any block on the "bad sector" list.
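
The idea can be sketched roughly as follows.  This is a userland
illustration only, not kernel code: sketch_split_range(), the
sketch_read_fn callback and sketch_fake_read() are all hypothetical, and
the fake reader simply pretends that one block is unreadable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns 0 if the single block reads cleanly. */
typedef int (*sketch_read_fn)(int64_t blkno);

/*
 * Retry every block in [first, last] on its own.  Blocks that still fail
 * are stored in bad[] (at most max entries); the return value is the
 * total number of failing blocks found, even if bad[] was too small.
 */
static size_t
sketch_split_range(int64_t first, int64_t last, sketch_read_fn rd,
    int64_t *bad, size_t max)
{
	size_t n = 0;

	for (int64_t b = first; b <= last; b++) {
		if (rd(b) != 0) {
			if (n < max)
				bad[n] = b;
			n++;
		}
	}
	return n;
}

/* Stand-in for real I/O: pretend only block 954463 is unreadable. */
static int
sketch_fake_read(int64_t blkno)
{
	return blkno == 954463 ? -1 : 0;
}
```

With a range of 954399 - 954510 and the fake reader above, only the one
failing block would end up in bad[], rather than the whole range.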

The present "workaround" for this is to run:

dkctl /dev/rwd0a badsector retry

which retries each listed sector individually.  For example:

# dkctl /dev/wd1d badsector list
/dev/wd1d: blocks 954399 - 954510 failed at Wed Apr 16 10:00:38 2003
/dev/wd1d: blocks 26637216 - 26637216 failed at Wed Apr 16 05:08:19 2003
/dev/wd1d: blocks 21347298 - 21347298 failed at Wed Apr 16 05:08:06 2003
/dev/wd1d: blocks 21347297 - 21347297 failed at Wed Apr 16 05:07:54 2003
# /usr/src/sbin/dkctl/dkctl /dev/rwd1d badsector retry
/dev/rwd1d: bad sector clusters 4 total sectors 115
/dev/rwd1d: bad sectors flushed
/dev/rwd1d: Retrying 21347297 - 21347297
/dev/rwd1d: block 21347297 - failed
/dev/rwd1d: Retrying 21347298 - 21347298
/dev/rwd1d: block 21347298 - failed
/dev/rwd1d: Retrying 26637216 - 26637216
/dev/rwd1d: block 26637216 - failed
/dev/rwd1d: Retrying 954399 - 954510
/dev/rwd1d: block 954399 - ok
/dev/rwd1d: block 954400 - ok
/dev/rwd1d: block 954401 - ok
/dev/rwd1d: block 954402 - ok
/dev/rwd1d: block 954403 - ok
/dev/rwd1d: block 954404 - ok
/dev/rwd1d: block 954405 - ok
/dev/rwd1d: block 954406 - ok
/dev/rwd1d: block 954407 - ok
/dev/rwd1d: block 954408 - ok
/dev/rwd1d: block 954409 - ok
/dev/rwd1d: block 954410 - ok
/dev/rwd1d: block 954411 - ok
/dev/rwd1d: block 954412 - ok
/dev/rwd1d: block 954413 - ok
/dev/rwd1d: block 954414 - ok
/dev/rwd1d: block 954415 - ok
/dev/rwd1d: block 954416 - ok
/dev/rwd1d: block 954417 - ok
/dev/rwd1d: block 954418 - ok
/dev/rwd1d: block 954419 - ok
/dev/rwd1d: block 954420 - ok
/dev/rwd1d: block 954421 - ok
/dev/rwd1d: block 954422 - ok
/dev/rwd1d: block 954423 - ok
/dev/rwd1d: block 954424 - ok
/dev/rwd1d: block 954425 - ok
/dev/rwd1d: block 954426 - ok
/dev/rwd1d: block 954427 - ok
/dev/rwd1d: block 954428 - ok
/dev/rwd1d: block 954429 - ok
/dev/rwd1d: block 954430 - ok
/dev/rwd1d: block 954431 - ok
/dev/rwd1d: block 954432 - ok
/dev/rwd1d: block 954433 - ok
/dev/rwd1d: block 954434 - ok
/dev/rwd1d: block 954435 - ok
/dev/rwd1d: block 954436 - ok
/dev/rwd1d: block 954437 - ok
/dev/rwd1d: block 954438 - ok
/dev/rwd1d: block 954439 - ok
/dev/rwd1d: block 954440 - ok
/dev/rwd1d: block 954441 - ok
/dev/rwd1d: block 954442 - ok
/dev/rwd1d: block 954443 - ok
/dev/rwd1d: block 954444 - ok
/dev/rwd1d: block 954445 - ok
/dev/rwd1d: block 954446 - ok
/dev/rwd1d: block 954447 - ok
/dev/rwd1d: block 954448 - ok
/dev/rwd1d: block 954449 - ok
/dev/rwd1d: block 954450 - ok
/dev/rwd1d: block 954451 - ok
/dev/rwd1d: block 954452 - ok
/dev/rwd1d: block 954453 - ok
/dev/rwd1d: block 954454 - ok
/dev/rwd1d: block 954455 - ok
/dev/rwd1d: block 954456 - ok
/dev/rwd1d: block 954457 - ok
/dev/rwd1d: block 954458 - ok
/dev/rwd1d: block 954459 - ok
/dev/rwd1d: block 954460 - ok
/dev/rwd1d: block 954461 - ok
/dev/rwd1d: block 954462 - ok
/dev/rwd1d: block 954463 - failed
/dev/rwd1d: block 954464 - ok
/dev/rwd1d: block 954465 - ok
/dev/rwd1d: block 954466 - ok
/dev/rwd1d: block 954467 - ok
/dev/rwd1d: block 954468 - ok
/dev/rwd1d: block 954469 - ok
/dev/rwd1d: block 954470 - ok
/dev/rwd1d: block 954471 - ok
/dev/rwd1d: block 954472 - ok
/dev/rwd1d: block 954473 - ok
/dev/rwd1d: block 954474 - ok
/dev/rwd1d: block 954475 - ok
/dev/rwd1d: block 954476 - ok
/dev/rwd1d: block 954477 - ok
/dev/rwd1d: block 954478 - ok
/dev/rwd1d: block 954479 - ok
/dev/rwd1d: block 954480 - ok
/dev/rwd1d: block 954481 - ok
/dev/rwd1d: block 954482 - ok
/dev/rwd1d: block 954483 - ok
/dev/rwd1d: block 954484 - ok
/dev/rwd1d: block 954485 - ok
/dev/rwd1d: block 954486 - ok
/dev/rwd1d: block 954487 - ok
/dev/rwd1d: block 954488 - ok
/dev/rwd1d: block 954489 - ok
/dev/rwd1d: block 954490 - ok
/dev/rwd1d: block 954491 - ok
/dev/rwd1d: block 954492 - ok
/dev/rwd1d: block 954493 - ok
/dev/rwd1d: block 954494 - ok
/dev/rwd1d: block 954495 - ok
/dev/rwd1d: block 954496 - ok
/dev/rwd1d: block 954497 - ok
/dev/rwd1d: block 954498 - ok
/dev/rwd1d: block 954499 - ok
/dev/rwd1d: block 954500 - ok
/dev/rwd1d: block 954501 - ok
/dev/rwd1d: block 954502 - ok
/dev/rwd1d: block 954503 - ok
/dev/rwd1d: block 954504 - ok
/dev/rwd1d: block 954505 - ok
/dev/rwd1d: block 954506 - ok
/dev/rwd1d: block 954507 - ok
/dev/rwd1d: block 954508 - ok
/dev/rwd1d: block 954509 - ok
/dev/rwd1d: block 954510 - ok
# /usr/src/sbin/dkctl/dkctl /dev/wd1d badsector list
/dev/wd1d: blocks 954463 - 954463 failed at Wed Apr 16 10:02:17 2003
/dev/wd1d: blocks 26637216 - 26637216 failed at Wed Apr 16 10:02:04 2003
/dev/wd1d: blocks 21347298 - 21347298 failed at Wed Apr 16 10:01:52 2003
/dev/wd1d: blocks 21347297 - 21347297 failed at Wed Apr 16 10:01:40 2003

Darren