tech-net archive


Re: 10.0_BETA randomly panics on huge I/O on a ffs slice on ccd device



Michael van Elst wrote:
> joel.kachelhoffer-bertrand%gmx.fr@localhost (BERTRAND Joël) writes:
> 
>> 	Hello,
> 
>> 	My main server runs 10.0_BETA (which fixes a lot of iSCSI initiator bugs).
>> [ 456386,139917] panic: ffs_blkfree: bad size: dev = 0xa805, bno =
> 
> that "just" means that the filesystem is damaged.

	Yes, I know ;-)

	But after the first panic, I ran a complete fsck, followed by a newfs
because fsck had moved a lot of files into lost+found.

>> 	This server runs squid on /var/squi/cache, which is a wedge on a ccd
>> device:
> 
>> legendre# ccdconfig -g
>> ccd0            32      0x0     2000408739840   /dev/wd0a /dev/wd1a
> 
> About 1.8TB. Can you give details about wd0a and wd1a, in particular
> the sector size and number of sectors.

legendre# smartctl -a /dev/rwd0d
smartctl 7.3 2022-02-28 r5338 [NetBSD 10.0_BETA amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda Pro Compute
Device Model:     ST1000LM049-2GH172
Serial Number:    WN92J53E
LU WWN Device Id: 5 000c50 0d090be5d
Firmware Version: SDM1
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Mar 19 18:20:30 2023 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
...
legendre# disklabel wd0
# /dev/rwd0d:
type: ESDI
disk: wd0
label: fictitious
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 1938021
total sectors: 1953525168
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

4 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a: 1953524160      1008        ccd                     # (Cyl.      1 - 1938020)
 d: 1953525168         0     unused      0     0        # (Cyl.      0 - 1938020)
legendre#

wd0 and wd1 are similar.
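
The numbers above can be cross-checked against the ccdconfig output. This is a minimal sketch, assuming that the fourth field printed by ccdconfig -g is a size in bytes (the arithmetic is consistent with that reading), that wd1a has the same size as wd0a, and that ccd rounds each component down to a whole number of interleave units. The helper name ccd_usable_sectors is only illustrative.

```python
SECTOR_SIZE = 512               # bytes/sector from the disklabel
TOTAL_SECTORS = 1953525168      # "total sectors" of wd0
PART_A_OFFSET = 1008            # offset of wd0a
PART_A_SECTORS = 1953524160     # size of wd0a (wd1a assumed identical)
INTERLEAVE = 32                 # ccd interleave, in sectors
CCD_BYTES = 2000408739840       # size field from "ccdconfig -g"


def ccd_usable_sectors(component_sectors, n_components, interleave):
    """Sectors usable in the ccd, assuming each component is rounded
    down to a multiple of the interleave (an assumption, see above)."""
    per_component = (component_sectors // interleave) * interleave
    return per_component * n_components


# Partition a should end exactly at the end of the disk.
a_ends_at_disk_end = PART_A_OFFSET + PART_A_SECTORS == TOTAL_SECTORS

ccd_sectors = ccd_usable_sectors(PART_A_SECTORS, 2, INTERLEAVE)
ccd_bytes = ccd_sectors * SECTOR_SIZE

print("wd0a ends at end of disk:", a_ends_at_disk_end)
print("ccd sectors:", ccd_sectors)
print("matches ccdconfig size:", ccd_bytes == CCD_BYTES)
print("size in TiB:", round(ccd_bytes / 2**40, 2))
```

Under these assumptions the two components add up exactly to the size ccdconfig reports, which is the "about 1.8TB" mentioned above when expressed in TiB.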

>> ** Phase 1 - Check Blocks and Sizes
>> PARTIALLY TRUNCATED INODE I=23352
>> SALVAGE? yes
> 
> That's a common thing after a crash as things have been partially
> written.
> 
> 
>> THE FOLLOWING SECTORS COULD NOT BE WRITTEN: 12513426624, 12513426625,
>> 12513426626, 12513426627, 12513426628, 12513426629, 12513426630,
>> 12513426631, 12513426632, 12513426633, 12513426634, 12513426635,
>> 12513426636, 12513426637, 12513426638, 12513426639, 12513426640,
>> 12513426641, 12513426642, 12513426643, 12513426644, 12513426645,
>> 12513426646, 12513426647, 12513426648, 12513426649, 12513426650,
>> 12513426651, 12513426652, 12513426653, 12513426654, 12513426655,
>> THE FOLLOWING DISK SECTORS COULD NOT BE READ: 11777852800, 11777852801,
>> 11777852802, 11777852803, 11777852804, 11777852805, 11777852806,
>> 11777852807, 11777852808, 11777852809, 11777852810, 11777852811,
>> 11777852812, 11777852813, 11777852814, 11777852815, 11777852816,
>> 11777852817, 11777852818, 11777852819, 11777852820, 11777852821,
>> 11777852822, 11777852823, 11777852824, 11777852825, 11777852826,
>> 11777852827, 11777852828, 11777852829, 11777852830, 11777852831,
>> CG 31239: ALLOCBLK: BAD MAGIC NUMBER
>> WRITING ZERO'ED BLOCK 11777852800 TO DISK
> 
> But that's completely out of range and corrupted. I doubt that
> fsck can really fix this.
> 
> You can try to dump the filesystem somewhere else, newfs and
> restore.
> 
> 

	This partition only contains the squid cache, so I already ran newfs
after the first panic (last week).
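
For scale, the sector numbers in the fsck output can be compared with the size of the ccd. This is a minimal sketch, assuming fsck reports 512-byte sectors and that the ccdconfig size field is in bytes. The filesystem lives on a wedge inside the ccd, so its real limit is somewhat lower than the bound used here. The function name is_in_range is only illustrative.

```python
SECTOR_SIZE = 512                          # assumed unit of the fsck sector numbers
CCD_BYTES = 2000408739840                  # size field from "ccdconfig -g"
CCD_SECTORS = CCD_BYTES // SECTOR_SIZE     # upper bound for any valid sector

# First sector of each run reported by fsck.
reported_first_sector = {
    "could not be written": 12513426624,
    "could not be read": 11777852800,
}


def is_in_range(sector, total_sectors):
    """True if the sector number can exist on a device of that size."""
    return 0 <= sector < total_sectors


for what, sector in reported_first_sector.items():
    ratio = sector / CCD_SECTORS
    print(f"{what}: sector {sector}, in range: "
          f"{is_in_range(sector, CCD_SECTORS)}, {ratio:.2f}x the ccd size")
```

Both runs start at roughly three times the number of sectors the ccd can hold, which is consistent with the remark that these block numbers are out of range.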

	Regards,

	JKB

