Port-xen archive


Re: Instability issues with NetBSD-9, xen-4.11 and the xbdb backend driver



	Hello.  Thanks for the feedback.  One issue I have with your
supposition is that these same NetBSD 5.2 domu VMs run without a flaw on
Xen-4.12 with FreeBSD as the dom0 host.  I definitely think there is some ring
corruption going on, but if the domu were the problem, it should
affect every version of Xen after a certain release.  I'll note that these
same VMs run for many hundreds of days on xen-3.3.2.  Under
NetBSD/xen-4.8, things run fine until there is a significant load on the
domu in question.  That suggests some sort of race condition.
	And, yes, the number of VCPUs configured for these domu VMs is 1 per
VM.
	I'm not yet done testing with additional hardware and, based on other
information, it may be that I'm running into a hardware defect.
If that turns out to be the case, I'll post here.
-thanks
-Brian

On Nov 8,  1:13am, Jaromír Doleček wrote:
} Subject: Re: Instability issues with NetBSD-9, xen-4.11 and the xbdb backe
} 
} On Thu, 7 Nov 2019 at 18:49, Brian Buhrow <buhrow%nfbcal.org@localhost> wrote:
} 
} >         hello.  In pursuit of a solution to this issue, I tried installing the
} > xenkernel48 and xentools48 packages to see if the problem exists there.
} > The problem is less pronounced, but still exists.  Again, here's what
} > happens:
} >
} > On the dom0 side we get errors like:
} >
} > [   354.670121] xbd backend domain 1 handle 0x1 (1) using event channel 20, protocol x86_64-abi
} > [  2705.480121] xbdb1i1: unknown operation 21
} > [  2705.480121] xbd IO domain 1: error -1
} > [  2706.690055] xbd backend: detach device xendisks/viadev for domain 1
} > [  2706.750055] xvif1i0: disconnecting
} >
} >         On the domu side, we get:
} > panic: biodone2 already
} >
} >
} Do you have more than one VCPU assigned to domu by chance? If yes, please
} try to check if the problem happens also when you give just one VCPU to
} domu.
} 
} The X in 'unknown operation X' is just the operation code. Nothing over 6
} seems to be defined in the Xen sources, so that is already wrong. Maybe
} something disagrees on the size of the ring structure.
} 
} Besides this - NetBSD 5.2 guests are REALLY old. There were xbd fixes for
} domu since then for sure. Can you check with newer NetBSD version for the
} guests?
} 
} Jaromir
} 
>-- End of excerpt from Jaromír Doleček
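
For illustration only: the quoted reply suggests that the two ends might
disagree on the size of the ring structure.  The sketch below shows how such
a disagreement could surface as a bogus operation code.  The two layouts,
the field order, and the names PACKED_32, PADDED_64, build_ring and read_ops
are assumptions made for this example and are not copied from the Xen or
NetBSD sources.  The request id 21 is chosen on purpose so that the misread
byte matches the number in the log; this is not a diagnosis of the problem
reported in this thread.

```python
import struct

# Simplified, hypothetical model of one block-ring request:
# operation, nr_segments, handle, id, sector_number, then 11 segment
# descriptors of 8 bytes each.  Neither layout is taken from real headers.
SEGMENTS = 11
# Assumed layout when 64-bit fields are only 4-byte aligned.
PACKED_32 = struct.Struct("<BBHQQ" + "IBB2x" * SEGMENTS)
# Assumed layout when 64-bit fields are 8-byte aligned (4 pad bytes added).
PADDED_64 = struct.Struct("<BBH4xQQ" + "IBB2x" * SEGMENTS)

# The quoted reply says nothing above 6 appears to be defined.
VALID_OPS = range(0, 7)


def build_ring(layout, requests):
    """Pack (op, handle, req_id, sector) tuples back to back with `layout`."""
    empty_segments = [0, 0, 0] * SEGMENTS  # gref, first_sect, last_sect
    return b"".join(
        layout.pack(op, 0, handle, req_id, sector, *empty_segments)
        for op, handle, req_id, sector in requests
    )


def read_ops(layout, ring, count):
    """Return the first byte of each slot, stepping by the reader's slot size."""
    return [ring[i * layout.size] for i in range(count)]


if __name__ == "__main__":
    # Producer writes two requests: a read (op 0) and a write (op 1).
    ring = build_ring(PACKED_32, [(0, 1, 20, 0), (1, 1, 21, 8)])
    for name, layout in (("same layout", PACKED_32), ("other layout", PADDED_64)):
        for slot, op in enumerate(read_ops(layout, ring, 2)):
            verdict = "ok" if op in VALID_OPS else "unknown operation"
            print(f"{name}: slot {slot}: op {op} ({verdict})")
```

In this model the slots are 108 and 112 bytes long, so a reader using the
larger stride lands 4 bytes into the second request and interprets the low
byte of its id field as the operation.  The first slot still decodes
correctly, which is one reason a mismatch of this kind would not necessarily
fail on every request.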



