Subject: Re: know bad sector => obtain bad file
To: Denis Lagno <dlagno@mail.ru>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-users
Date: 03/10/2005 21:19:41
On Thu, Mar 10, 2005 at 10:04:38PM +0300, Denis Lagno wrote:
> hmm, it seems to be very weird error.  After it occurred I read the whole
> partition with dd if=/dev/wd0g of=/dev/null bs=1m
> and I encountered no error.  Then I tried to redo the dump.
> And it failed again at the same place:
> 
>   DUMP: readBlocks: read fails: Input/output error
>   DUMP: read error from /dev/rcgd3h: Input/output error: [block 183235392]: count=10240
>   DUMP: read error from /dev/rcgd3h: Input/output error: [sector 183235392]: count=512
>   DUMP: readBlocks: read fails: Input/output error
>   DUMP: readBlocks: read fails: Input/output error
>   DUMP: readBlocks: read fails: Input/output error
>   DUMP: readBlocks: read fails: Input/output error
> 
> Mar  9 23:53:03 flam /netbsd: wd0g: error reading fsbn 237235392 of 237235392-237235455 (wd0 bn 268435455; cn 266305 tn 0 sn 15), retrying
> Mar  9 23:53:03 flam /netbsd: wd0: (id not found)
> Mar  9 23:53:04 flam /netbsd: wd0g: error reading fsbn 237235392 of 237235392-237235455 (wd0 bn 268435455; cn 266305 tn 0 sn 15)wd0: (id not found)
> Mar  9 23:53:04 flam /netbsd:
> Mar  9 23:53:04 flam /netbsd: cgd3: error 5
> 
> Then I again try to reproduce it with dd and see no errors:
> 
> # dd if=/dev/rcgd3h of=/dev/null seek=183235392 bs=512 count=100
> 100+0 records in
> 100+0 records out
> 51200 bytes transferred in 0.047 secs (1089361 bytes/sec)

I'm not sure which units dump uses when it prints block numbers, so
183235392 may not be a count of 512-byte sectors.

> 
> # dd if=/dev/wd0g of=/dev/null seek=237235392 bs=512 count=100
> 100+0 records in
> 100+0 records out
> 51200 bytes transferred in 0.024 secs (2133333 bytes/sec)

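Did you try with the raw device? And maybe from /dev/rwd0d,
with skip=268435455 (skip= offsets the input; seek= only offsets the
output, so the dd runs above read from the start of the device).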
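
To illustrate the difference between the two dd operands, here is a small
sketch on a scratch file. The scratch file and the variable names are
hypothetical stand-ins, not anything from the poster's system. The real
command is left commented out because it needs root and the actual disk.

```shell
# Scratch "disk" of four 512-byte sectors filled with A, B, C and D.
tmp=$(mktemp)
out=$(mktemp)
for c in A B C D; do
    printf '%512s' '' | tr ' ' "$c"
done > "$tmp"

# skip= advances the INPUT: this reads sector 2, which holds 'C'.
with_skip=$(dd if="$tmp" bs=512 skip=2 count=1 2>/dev/null | head -c 1)

# seek= advances the OUTPUT only: the read still starts at sector 0 ('A'),
# and the data lands 2 blocks into the output file.
dd if="$tmp" of="$out" bs=512 seek=2 count=1 2>/dev/null
with_seek=$(tail -c 1 "$out")

echo "skip=2 read: $with_skip, seek=2 read: $with_seek"
rm -f "$tmp" "$out"

# On the real disk (device name and sector number taken from the thread):
# dd if=/dev/rwd0d of=/dev/null bs=512 skip=268435455 count=1
```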

> Wonder, how can it be so..

Usually a drive reports ID not found when trying to read past EOM (but as
I said, it uses ID not found for other failures too). It could also be
a transient failure, occurring only when the drive has been busy for
some time (for example because of heat).
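
As a side observation rather than a diagnosis, the numbers in the kernel
message can be checked with a little arithmetic. The sketch below assumes
that "wd0 bn" is the absolute sector on the disk and that "fsbn" is
relative to the start of wd0g; the variable names are made up for the
example.

```shell
bn=268435455      # "wd0 bn" from the kernel message
fsbn=237235392    # "fsbn" from the same message, assumed relative to wd0g

# Highest sector number that fits in a 28-bit LBA address.
lba28_max=$(( (1 << 28) - 1 ))

# Implied start of wd0g on the disk, under the assumption above.
part_offset=$(( bn - fsbn ))

echo "2^28 - 1             = $lba28_max"
echo "failing wd0 sector   = $bn"
echo "implied wd0g offset  = $part_offset"
```

The failing sector is exactly 2^28 - 1, the last one addressable with a
28-bit LBA, which may or may not be related to the error.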

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference