NetBSD-Bugs archive


Re: port-i386/55113: Vortex86EX2 based board crashes on reboot



Hi,

I just want to report that a more recent kernel now fails on boot, not
only on shutdown/reboot (which seems logical to me). The cause is still
the same: an IRQ conflict between a statically assigned serial (com*)
port and either USB (which uses IRQ 5) or, sometimes, some miniPCIe
plug-in cards (IRQ 10 specifically was the issue). Recompiling the
kernel with the correct com IRQs, or removing the serial port with the
conflicting IRQ altogether, makes the system bootable and stable. The
current message after the crash:

kernel: supervisor trap page fault, code=0
Stopped in pid 0.0 (system) at  netbsd:intr_establish_xname+0x3e1:      movl 104c(%edx),%eax

Again, if a crash is expected with an IRQ conflict, the bug can be
closed, since I have a workaround and it is a system-specific issue.
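
To make the conflict concrete, here is a minimal sketch in C of the
trigger-type check that rejects the share. It is a simplified model, not
the kernel code: struct pin_source and pin_attach() are hypothetical
names, the enum values are assumed to match the "can't share type 3 with
2" message quoted further down in this thread (where the pin held by USB
is reported as "type level"), and the fall-through from the edge/level
cases into the pulse case is an assumption about the lines preceding the
quoted snippet.

```c
/*
 * Interrupt trigger types. The numeric values are an assumption chosen
 * to match the "can't share type 3 with 2" message in this PR.
 */
enum ist {
	IST_NONE  = 0,	/* pin not configured yet */
	IST_PULSE = 1,	/* pulsed: never shareable */
	IST_EDGE  = 2,	/* edge-triggered (statically configured com port) */
	IST_LEVEL = 3	/* level-triggered (PCI devices such as ohci/ehci) */
};

/* Hypothetical, simplified stand-in for the kernel's per-pin state. */
struct pin_source {
	enum ist is_type;	/* trigger type currently set on the pin */
	int nhandlers;		/* number of handlers attached so far */
};

/*
 * Try to attach one more handler of trigger type `type` to `src`.
 * Returns 0 on success, or -1 if the pin cannot be shared with that
 * type (the case where the real function prints the message and
 * returns NULL).
 */
int
pin_attach(struct pin_source *src, enum ist type)
{
	switch (src->is_type) {
	case IST_NONE:
		src->is_type = type;	/* first user decides the type */
		break;
	case IST_EDGE:
	case IST_LEVEL:
		if (src->is_type == type)
			break;		/* same type: sharing is allowed */
		/* FALLTHROUGH */
	case IST_PULSE:
		if (type != IST_NONE)
			return -1;	/* "can't share type %d with %d" */
		break;
	}
	src->nhandlers++;
	return 0;
}
```

In this model, attaching two level-type handlers (as ohci0 and ehci0 do
on IRQ 5) succeeds, and a later edge-type attach on the same pin (as a
statically configured com port would request) is rejected.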

Unrelated to this, I once found my system in ddb after connecting to
it through minicom. It had been basically idle for two days (network and
COM cable connected). The backtrace was:

breakpoint(c142b620,3f8,5,7,47278b,c2d9b1f4,c2d9b16c,c2b4e000,800,c63f7f6c)
at netbsd:breakpoint+0x4
comintr(c2d9b040,da337f30,0,0,0,0,0,0,0,0) at netbsd:comintr+0x8c0
--- switch to interrupt stack ---
Xintr_legacy4() at netbsd:Xintr_legacy4+0xda
--- interrupt ---
x86_stihlt(c2aeb040,0,c2aeb040,c01020f3,c2aeb040,c2aeb040,c0c4e120,c2aeb040,0,c0102011)
at netbsd:x86_stihlt+0x5
idle_loop(c2aeb040,1705000,1710000,0,c01005a8,0,0,0,0,0) at
netbsd:idle_loop+0x122

I haven't reproduced it a second time yet, so I will refrain from
filing a new bug, but I am posting it in case any clues can be
determined from it.

Regards,
Andrius V

On Tue, May 5, 2020 at 11:15 PM Andrius V <vezhlys%gmail.com@localhost> wrote:
>
> The following reply was made to PR port-i386/55113; it has been noted by GNATS.
>
> From: Andrius V <vezhlys%gmail.com@localhost>
> To: port-i386-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
>         netbsd-bugs%netbsd.org@localhost, gnats-bugs%netbsd.org@localhost
> Cc:
> Subject: Re: port-i386/55113: Vortex86EX2 based board crashes on reboot
> Date: Tue, 5 May 2020 23:12:35 +0300
>
>  As a "bonus", this patch can be applied to add the GENESYS GL850G USB 2.0
>  hub controller to the known USB devices, since this is the chip that is
>  soldered on the board (the patch didn't actually change anything in the
>  dmesg messages, but as I understand the usbdevs comments, the file is
>  mostly "informational"):
>
>  diff --git a/sys/dev/usb/usbdevs b/sys/dev/usb/usbdevs
>  index 3248aca89fd..9c712628680 100644
>  --- a/sys/dev/usb/usbdevs
>  +++ b/sys/dev/usb/usbdevs
>  @@ -1668,6 +1668,7 @@ product GENERALINSTMNTS SB5100    0x5100  SURFboard SB5100 Cable modem
>   /* Genesys Logic products */
>   product GENESYS GENELINK       0x05e3  GeneLink Host-Host Bridge
>   product GENESYS GL650          0x0604  GL650 Hub
>  +product GENESYS GL850G         0x0608  GL850G USB 2.0 STT Hub Controller
>   product GENESYS GL641USB       0x0700  GL641USB CompactFlash Card Reader
>   product GENESYS GL641USB2IDE_2 0x0701  GL641USB USB-IDE Bridge
>   product GENESYS GL641USB2IDE   0x0702  GL641USB USB-IDE Bridge
>
>  dmesg:
>  [     3.503792] uhub2 at uhub1 port 1: vendor 05e3 (0x05e3) USB2.0 Hub
>  (0x0608), class 9/0, rev 2.00/60.70, addr 2
>  [     3.544976] uhub2: single transaction translator
>  [     3.570092] uhub2: 4 ports with 4 removable, self powered
>
>  On Tue, May 5, 2020 at 10:45 PM Andrius V <vezhlys%gmail.com@localhost> wrote:
>  >
>  > The following reply was made to PR port-i386/55113; it has been noted by GNATS.
>  >
>  > From: Andrius V <vezhlys%gmail.com@localhost>
>  > To: port-i386-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
>  >         netbsd-bugs%netbsd.org@localhost, gnats-bugs%netbsd.org@localhost
>  > Cc:
>  > Subject: Re: port-i386/55113: Vortex86EX2 based board crashes on reboot
>  > Date: Tue, 5 May 2020 22:42:21 +0300
>  >
>  >  The crash on reboot and the failing USB were caused by an IRQ conflict.
>  >  The GENERIC kernel is configured to assign IRQ 5 to com2 (this board
>  >  has 5 COM ports), but on this board IRQ 5 is used only for USB.
>  >  According to the board manual, the other COM ports use IRQs 3, 4, 10
>  >  and 11. So I tested a custom kernel with properly assigned IRQ values
>  >  on the com ports: USB now works and the system no longer crashes on
>  >  reboot. Unless something still needs to be done to avoid the crash in
>  >  this specific situation, this PR can be considered solved.
>  >
>  >  Regards,
>  >  Andrius V
>  >
>  >  On Thu, Apr 2, 2020 at 12:30 AM Andrius V <vezhlys%gmail.com@localhost> wrote:
>  >  >
>  >  > The following reply was made to PR port-i386/55113; it has been noted by GNATS.
>  >  >
>  >  > From: Andrius V <vezhlys%gmail.com@localhost>
>  >  > To: port-i386-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
>  >  >         netbsd-bugs%netbsd.org@localhost, gnats-bugs%netbsd.org@localhost,
>  >  >         Martin Husemann <martin%duskware.de@localhost>
>  >  > Cc:
>  >  > Subject: Re: port-i386/55113: Vortex86EX2 based board crashes on reboot
>  >  > Date: Thu, 2 Apr 2020 00:26:19 +0300
>  >  >
>  >  >  Hi,
>  >  >
>  >  >  I probably found a clue when booting a DEBUG-enabled kernel with the
>  >  >  -vx options. There's a message printed for pin 5:
>  >  >  "intr_establish_xname: pic pic0 pin 5: can't share type 3 with 2".
>  >  >  It seems that pin 5 is related to ohci/ehci, since their attachment
>  >  >  follows right after its allocation.
>  >  >
>  >  >  Looking at the code, intr_establish_xname returns NULL right after the printf:
>  >  >  https://github.com/NetBSD/src/blob/cd62c3173cefc33a9a9d183c302f2df02ad15ec4/sys/arch/x86/x86/intr.c#L913
>  >  >
>  >  >  case IST_PULSE:
>  >  >      if (type != IST_NONE) {
>  >  >          intr_source_free(ci, slot, pic, idt_vec);
>  >  >          intr_free_io_intrsource_direct(chained);
>  >  >          mutex_exit(&cpu_lock);
>  >  >          kmem_free(ih, sizeof(*ih));
>  >  >          printf("%s: pic %s pin %d: can't share "
>  >  >              "type %d with %d\n",
>  >  >              __func__, pic->pic_name, pin,
>  >  >              source->is_type, type);
>  >  >          return NULL;
>  >  >      }
>  >  >      break;
>  >  >
>  >  >  The subsequent panic happens at the NULL check in intr_disestablish_xcall:
>  >  >  https://github.com/NetBSD/src/blob/cd62c3173cefc33a9a9d183c302f2df02ad15ec4/sys/arch/x86/x86/intr.c#L1181
>  >  >
>  >  >  if (q == NULL) {
>  >  >      x86_write_psl(psl);
>  >  >      panic("%s: handler not registered", __func__);
>  >  >      /* NOTREACHED */
>  >  >  }
>  >  >
>  >  >  I'm not sure whether the "q" variable is really the return value from
>  >  >  establish, but I would guess it is related. It's probably more of a
>  >  >  consequence than a cause, but maybe it gives some information on why
>  >  >  this happened, especially on what may have caused the
>  >  >  intr_establish_xname failure.
>  >  >
>  >  >  relevant dmesg parts:
>  >  >  ....
>  >  >  ohci0 at pci0 dev 10 function 0: RDC Semiconductor R6060 USB OHCI (rev. 0x15)
>  >  >  csr: 02000006
>  >  >  allocated pic pic0 type level pin 5 level 6 to cpu0 slot 5 idt entry 37
>  >  >  ohci0: interrupting at irq 5
>  >  >  ohci0: OHCI version 1.0, legacy support
>  >  >  usb0 at ohci0: USB revision 1.0
>  >  >  ehci0 at pci0 dev 10 function 1: RDC Semiconductor R6061 USB EHCI (rev. 0x09)
>  >  >  allocated pic pic0 type level pin 5 level 6 to cpu0 slot 5 idt entry 37
>  >  >  ehci0: interrupting at irq 5
>  >  >  ehci0: EHCI version 1.0
>  >  >  ehci0: 1 companion controller, 2 ports: ohci0
>  >  >  usb1 at ehci0: USB revision 2.0
>  >  >  ...
>  >  >  intr_establish_xname: pic pic0 pin 5: can't share type 3 with 2
>  >  >  ...
>  >  >  uhub0 at usb0: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev
>  >  >  1.00/1.00, addr 1
>  >  >  uhub0: 2 ports with 2 removable, self powered
>  >  >  uhub1 at usb1: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev
>  >  >  2.00/1.00, addr 1
>  >  >  uhub1: 2 ports with 2 removable, self powered
>  >  >  ...
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ehci_sync_hc: timed out
>  >  >  ...
>  >  >  On Sat, Mar 28, 2020 at 1:45 AM Andrius V <vezhlys%gmail.com@localhost> wrote:
>  >  >  >
>  >  >  > The following reply was made to PR port-i386/55113; it has been noted by GNATS.
>  >  >  >
>  >  >  > From: Andrius V <vezhlys%gmail.com@localhost>
>  >  >  > To: gnats-bugs%netbsd.org@localhost, Martin Husemann <martin%duskware.de@localhost>
>  >  >  > Cc: port-i386-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
>  >  >  >         netbsd-bugs%netbsd.org@localhost
>  >  >  > Subject: Re: port-i386/55113: Vortex86EX2 based board crashes on reboot
>  >  >  > Date: Sat, 28 Mar 2020 01:42:32 +0200
>  >  >  >
>  >  >  >  I tried rebooting the Vortex86DX3 in ACPI mode and it didn't crash,
>  >  >  >  so the problems on the two SoCs are very likely unrelated. The EHCI
>  >  >  >  timeout on the Vortex86EX2 occurs even without anything attached
>  >  >  >  (unless the board has something hidden, but I don't see any
>  >  >  >  candidate). I also realized from the dmesg messages that the
>  >  >  >  Vortex86EX probably doesn't have ACPI enabled (or ACPI support at
>  >  >  >  all), so those ACPI errors are coming from that. So the error
>  >  >  >  messages are possibly related to the eventual crash on reboot. I will
>  >  >  >  try to enable some debugging to see if anything more comes up.
>  >  >  >
>  >  >  >  On a side note, are there any ideas on the best way to change
>  >  >  >  cpu_probe_vortex86 to include 0x38504d44 as EX2? Thanks.
>  >  >  >
>  >  >  >
>  >  >  >  On Fri, Mar 27, 2020 at 12:00 PM Andrius V <vezhlys%gmail.com@localhost> wrote:
>  >  >  >  >
>  >  >  >  > Actually, I included the full dmesg in the bug report, unless
>  >  >  >  > something more may come up with different build options? Except for
>  >  >  >  > a number of "ehci_sync_hc: timed out" messages, everything else
>  >  >  >  > seems to be as "usual". This message appears only if something
>  >  >  >  > (other than a keyboard) is physically attached to USB (and it fails
>  >  >  >  > to attach). These messages are probably caused by the same issue as
>  >  >  >  > PR 53894 on the Vortex86DX3 (as far as I could investigate last
>  >  >  >  > year, the USB transfer was timing out, which in turn calls
>  >  >  >  > ehci_sync_hc, which times out as well). Possibly the reboot issue is
>  >  >  >  > related and this bug may be a duplicate. I should try to reproduce
>  >  >  >  > this on the Vortex86DX3 in ACPI mode too (unfortunately, in ACPI mode
>  >  >  >  > I've never done a proper reboot/shutdown, since neither network nor
>  >  >  >  > USB works and I can't interact with the system in any way, and the
>  >  >  >  > system powers on automatically and has no power button, but I can
>  >  >  >  > probably schedule a cron job).
>  >  >  >  >
>  >  >  >  > Parts from dmesg:
>  >  >  >  >
>  >  >  >  > ohci0 at pci0 dev 10 function 0: RDC Semiconductor R6060 USB OHCI (rev. 0x15)
>  >  >  >  > ohci0: interrupting at irq 5
>  >  >  >  > ohci0: OHCI version 1.0, legacy support
>  >  >  >  > usb0 at ohci0: USB revision 1.0
>  >  >  >  > ehci0 at pci0 dev 10 function 1: RDC Semiconductor R6061 USB EHCI (rev. 0x09)
>  >  >  >  > ehci0: interrupting at irq 5
>  >  >  >  > ehci0: EHCI version 1.0
>  >  >  >  > ehci0: 1 companion controller, 2 ports: ohci0
>  >  >  >  > usb1 at ehci0: USB revision 2.0
>  >  >  >  > ....
>  >  >  >  > uhub0 at usb0: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev
>  >  >  >  > 1.00/1.00, addr 1
>  >  >  >  > uhub0: 2 ports with 2 removable, self powered
>  >  >  >  > uhub1 at usb1: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev
>  >  >  >  > 2.00/1.00, addr 1
>  >  >  >  > uhub1: 2 ports with 2 removable, self powered
>  >  >  >  > ...
>  >  >  >  > ehci_sync_hc: timed out
>  >  >  >  > ehci_sync_hc: timed out
>  >  >  >  > ehci_sync_hc: timed out
>  >  >  >  > ...
>  >  >  >  >
>  >  >  >  >
>  >  >  >  > On Fri, Mar 27, 2020 at 9:05 AM Martin Husemann <martin%duskware.de@localhost> wrote:
>  >  >  >  > >
>  >  >  >  > > The following reply was made to PR port-i386/55113; it has been noted by GNATS.
>  >  >  >  > >
>  >  >  >  > > From: Martin Husemann <martin%duskware.de@localhost>
>  >  >  >  > > To: gnats-bugs%netbsd.org@localhost
>  >  >  >  > > Cc:
>  >  >  >  > > Subject: Re: port-i386/55113: Vortex86EX2 based board crashes on reboot
>  >  >  >  > > Date: Fri, 27 Mar 2020 08:01:52 +0100
>  >  >  >  > >
>  >  >  >  > >  Can you show the dmesg part(s) about ehci attaching? I guess there is an
>  >  >  >  > >  error message in there.
>  >  >  >  > >
>  >  >  >  > >  Martin
>  >  >  >  > >
>  >  >  >
>  >  >
>  >
>

