NetBSD-Bugs archive


Re: kern/56050 (xhci suspend/resume is unimplemented)



The following reply was made to PR kern/56050; it has been noted by GNATS.

From: Andrius V <vezhlys%gmail.com@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, 
	nia%pkgsrc.org@localhost, Taylor R Campbell <riastradh%netbsd.org@localhost>
Subject: Re: kern/56050 (xhci suspend/resume is unimplemented)
Date: Thu, 27 May 2021 01:07:55 +0300

 --0000000000008cf4f505c342e006
 Content-Type: text/plain; charset="UTF-8"
 
 My main suspicion fell on the while loop condition changes starting at
 line 3198:
     while (sc->sc_command_addr != 0 &&
         sc->sc_suspender != NULL &&
         sc->sc_suspender != curlwp)
 
 And it now seems to be the actual culprit: the loop apparently never
 terminates and likely locks the device. Because of that, usbd_delay_ms()
 times out without "getting" a new address, and the assertion fails at
 line 2901 (as per my backtrace). I didn't have time today to check the
 actual values, but removing the newly added sc->sc_suspender checks makes
 the system boot successfully again.
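
For illustration only, the condition quoted above can be evaluated in isolation. This is a minimal userland sketch, not the driver code: struct wait_state, its field names, and must_wait_quoted() are hypothetical stand-ins for the three values the loop reads (sc->sc_command_addr, sc->sc_suspender, and curlwp). It shows only what the predicate evaluates to for a given state; it does not establish which state the failing system was actually in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the values the quoted loop condition reads.
 * None of these names come from xhci.c.
 */
struct wait_state {
	uint64_t command_addr;	/* nonzero: a command is in flight */
	const void *suspender;	/* NULL: no suspend in progress */
	const void *self;	/* stands in for curlwp */
};

/*
 * The loop condition exactly as quoted in the message: the caller keeps
 * waiting only while all three clauses are true at once.
 */
static bool
must_wait_quoted(const struct wait_state *s)
{
	return s->command_addr != 0 &&
	    s->suspender != NULL &&
	    s->suspender != s->self;
}
```

With this form, a state where no suspend is in progress (suspender is NULL) never waits, whether or not command_addr is nonzero. The only state that waits is one where a command is in flight and some other thread is suspending.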
 
 
 On Wed, May 26, 2021 at 1:59 PM Andrius V <vezhlys%gmail.com@localhost> wrote:
 >
 > No, it happens consistently on every boot right after that commit. I
 > tested kernels commit by commit, and all previous ones boot without
 > issues. Yes, I don't suspend/resume; I'm not sure why it affects the
 > boot process.
 >
 > On Wed, May 26, 2021 at 1:45 PM <maya%netbsd.org@localhost> wrote:
 > >
 > > The following reply was made to PR kern/56050; it has been noted by GNATS.
 > >
 > > From: maya%NetBSD.org@localhost
 > > To: gnats-bugs%netbsd.org@localhost
 > > Cc: netbsd-bugs%netbsd.org@localhost, nia%pkgsrc.org@localhost
 > > Subject: Re: kern/56050 (xhci suspend/resume is unimplemented)
 > > Date: Wed, 26 May 2021 10:41:47 +0000
 > >
 > >  On Wed, May 26, 2021 at 09:15:02AM +0000, Andrius V wrote:
 > >  > The following reply was made to PR kern/56050; it has been noted by GNATS.
 > >  >
 > >  > From: Andrius V <vezhlys%gmail.com@localhost>
 > >  > To: gnats-bugs%netbsd.org@localhost
 > >  > Cc: kern-bug-people%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
 > >  >      nia%netbsd.org@localhost, Taylor R Campbell <riastradh%netbsd.org@localhost>
 > >  > Subject: Re: kern/56050 (xhci suspend/resume is unimplemented)
 > >  > Date: Wed, 26 May 2021 12:09:45 +0300
 > >  >
 > >  >  The latest xhci.c rev 1.140 changes cause a panic on my system if
 > >  >  any xhci device (or at least my USB stick and/or external SSD) is
 > >  >  connected. The previous commit still works (suspend/resume draft).
 > >  >  Interestingly enough, and probably irrelevant, the kernel does not
 > >  >  crash with serial console boot at 115200 speed (it still fails at the
 > >  >  default 9600 speed, though).
 > >  >
 > >
 > >  xhci.c 1.140 only touches suspend/resume code, and you don't seem to be
 > >  suspending in this dmesg.
 > >
 > >  I suspect this is a spurious panic that happens sometimes.
 > >
 
 --0000000000008cf4f505c342e006--
 

