Subject: Re: disklabel bug?
To: None <port-i386@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: port-i386
Date: 01/23/2003 18:21:42
>> On investigation, it turns out that in readdisklabel
>> (i386/disksubr.c), lp->d_secpercyl was zero.  This appears to be due
>> to the way wdgetdisklabel calls readdisklabel twice, with the second
>> call blindly assuming the first call's values are suitable.  [...]

> Yes, probably.  I've been in touch with a user who was probably hit
> by this bug, but I couldn't get enough data to track it down.

On reading the code further, it looks as though this can happen only
when the disklabel claims bad-sector lists are present, because that's
the only path that can bash d_secpercyl and then return failure.  But
if the label gets scribbled over, this is not implausible.

I have a sample image: a 16MB file such that if I dd it onto the
beginning of an IDE drive and boot a kernel without any tweaks to the
disk-label stuff, I get the crash.  (The first crash, the one in
readdisklabel.  I have a fix for that one in my tree, but haven't yet
constructed an image that will make it crash at the second point I saw,
the one that struck while servicing a completion interrupt.)

The 16MB is almost all NULs and thus compresses spectacularly well.
Below is a uuencoding of bzip2 -9 output for it.  To use this, dd it
onto the beginning of an IDE drive and then reboot (i386 only).  Be
careful to use RAW_PART when writing (e.g., dd of=/dev/wdNd for
appropriate N), and also be sure you have a way to recover, like a
non-i386 IDE-capable machine, since without a fixed kernel you won't be
able to boot with that disk connected.  Also, this will bash the
partitioning info on the disk and the beginning of the data, so don't
use a drive whose contents you care about.

I'm holding off submitting the PR because I'll try to construct an
image that provokes the second panic, if the first one is fixed up.  I
added four lines to my i386 readdisklabel:

        /* XXX would it be better to return failure?
           Leaving secpercyl zero guarantees a divide-by-zero trap below. */
        if (lp->d_secpercyl == 0)
                lp->d_secpercyl = 63 * 16;

which makes the first panic go away.  In the incident that started this
off, I saw the second panic.  I haven't made the second one happen
deliberately yet.
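
For anyone who wants to poke at the idea outside the kernel, here is a
minimal userland sketch of the same guard.  It is not the kernel code:
struct mini_label, fix_secpercyl() and sector_to_cyl() are hypothetical
names made up for illustration, and only the d_secpercyl field name and
the 63 * 16 fallback come from the fix above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct disklabel; only the
   d_secpercyl field name is borrowed from the real structure. */
struct mini_label {
        uint32_t d_secpercyl;   /* sectors per cylinder; 0 if trashed */
};

/* Hypothetical helper mirroring the four-line fix: if d_secpercyl is
   zero, substitute 63 * 16.  Returns 1 if the field was patched and
   0 if it was already nonzero. */
static int
fix_secpercyl(struct mini_label *lp)
{
        if (lp->d_secpercyl != 0)
                return 0;
        lp->d_secpercyl = 63 * 16;
        return 1;
}

/* The kind of division that traps when d_secpercyl is zero.  The
   assert documents the precondition the guard is meant to establish. */
static uint32_t
sector_to_cyl(const struct mini_label *lp, uint32_t sector)
{
        assert(lp->d_secpercyl != 0);
        return sector / lp->d_secpercyl;
}
```

Calling fix_secpercyl() before any sector_to_cyl() call keeps the
division safe.  The return value also gives a caller the option of
treating a patched label as a failure instead, which is the alternative
the XXX comment asks about.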

begin 644 bugimage1.bz2
M0EIH.3%!62936?4'MB2`?CEWP/6EP`%`H4(``8`W!0<`4```&&`.GSY40J%%
M)%!4*4JH$B@JA$(2^VB5*$H1C3$81I@``#&F(PC3```&-,1A&F```$E-0```
M:!H":E51@`````I*E3$R,$PC`C:GKT7'].;T[.GET6047+44MD"MHJ6U%L6R
M)1M*4-@;"1;*$MJJ%;06R*-HJ;(K9439*JV@&T5;05LJ+8J&RHFQ)LB5L0;)
M-M96S83:K:&Q-HFU4;"BVILK8-A38+:3:!-I2;0;%;%&T1M5;(ELBVE#:)+:
ME;!);$;%%L$M@K:I1LI5L`;4I;2D6PBMD2V1LE4MB4VHEL2MD%;2FRBV4C8"
MME6R$V%%;$38I5L@VJ*MD2M@";*J3:*&P2;*H6PMATZJAQJBMD4FT$&T4&T"
M394BVD)L4DV0%M*IL5"VDHV"DV5+:DHMBI*V`BV"B-DHV5$V1;5%-A`;2D;)
M5*;2I2VH0;`JC:J);2(+8(MD-J%2V$)5WZ@JYAY^/#=^XE"#\Z2I;`4;1!;(
M2ME-JH#:52V**-A2;$JK8)6TK:5(VE#92V4-E-9M(MK9)L+:6R3:1LJV&R&P
M6TE;2"V4MB5&R(VB;(-H0V@#:2FR*VB%;"FU"K8*MHV4&U2&Q1L`VD*VHBME
M538V`VJ+:")L2*;(ILA3:"K:)6TI%M04VBJVH";2%-H*VH0VBE;`HV*JV"&R
M"6RH4V21M)%M*#9!;4`;%4FRBMB"2VE4FQ-J4*MHD3:I56RA;"1L))LJ!L`M
MJJJC:5-BDEL)2VJ*6PI2V$MDH39*I4YZ!E!U=O!]LD(X,"-B*MI&T"2V)&RI
M&P%;2IM$V4HFTE&Q26R0FU20V*BV@JV!)LJE;*)L$K:H;5-DDMD-HH;!;"EL
MBVMELK:4;4EM*MBMB6T5;0FR5L%M$M@6Q-J(FS:J6P4VD6P5M2JV4-DI;$);
M*4;(IM2;0I;%*;";0V(EM(FT2;%6R#:%3:H+95L"+80;20V4#:4%;)5;%"-A
M);4*FPDM@EM0IL4HV)3:B-B4V51M*%LALA6U%;(*K9$%M4J&T0K94@MBFU))
M-J%$VA2M@*;20C:J1LFR@V`EL52FT2&T&Q)1L*EM);*E&R%&U`&R2-BI&Q%)
MM!%;`2VE3:IL550XBK(1'&A&TI%M(6T06U!6TD)LA;4J@VI5;!)+9))L4&U(
M;1&P5L16T5;4J-E;`K8)M6TMI&TH;16P;&UM!;"V0VK8BV(MJJ;1)L6R%LB;
M$MJV16U$K8H;%;)&Q38FQ)M%);$EM1M(39%;!5M0MH`V*-E*MB1L2V*6P&U)
MLDVJ2FTA-I2V*39%+916RC9*6U0K8(K8*;238BDV)5-DJ#:";%56PI-J`;%#
M82&U52MI46T2&P(;15;1#:A6Q2)M`FT4FQ0FU5!M4AL2FR1+9*5L*IL0V;*@
MFU12;`4ME25L*1;%"JVD2;2HELD2VJBFT"VHDVH#8(AL2EL2V"HVJ%&U1(MH
MA;15M%)M)$V%&Q)5+M_\0)A"21AHP8@X^O8SB90<:NX'C/?YU"[GR#T^N7XZ
MNN<F41/7)1L*JC94D;*(M@JE;!53:*MJ`MJ%3:FR-B5&T$VE!LE;1$V"FT+:
M$FQ1LMFT6TK9;4AM15LC:$VE5L%6PK8JK:I;*V1;)1;1"V`FR%;)1M0394-D
MDV&R+:%;2)L25;"E-E);!%M1&U*6TJ-H;2DV1&T;"ELH39)-JDVJ5;(*VH39
M0V2&S8A5M)-@EL2BFP5LDI;!+:"3:*ILI)L42VE0;"2ME";$%M$3:24VI!-J
M@V%&U5$6RJ+:D2;`JV1#:H;$1;0;*@VH0-DD39$K925;%46PEL)2VD380+:%
M+:+9;!6PMI(6T44MJA6P&U1$VI1;"5M0MI*)M*V)5&R%+:@2/^+N2*<*$AZ@
#]L2`
`
end

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B