Re: Problems with FTDI adapter and evbarm devices

To: Greg Troxel <gdt%lexort.com@localhost>, port-arm%NetBSD.org@localhost
Subject: Re: Problems with FTDI adapter and evbarm devices
From: Jason Mitchell <jmitchel%bigjar.com@localhost>
Date: Sun, 31 May 2020 16:55:49 -0400

On 5/31/20 6:49 AM, Greg Troxel wrote:

Jason Mitchell <jmitchel%bigjar.com@localhost> writes:

There is something that causes frequent crashes when using an FTDI USB to
serial converter with a Raspberry Pi 3B (and maybe other evbarm
devices). The easiest way to reproduce this bug is to:

1) Insert an FTDI cable
2) Run minicom and connect to the ucom port associated with the FTDI adapter
3) Remove the FTDI cable

I could NOT make this happen with an aarch64 machine (Libre LePotato),
but I did not extensively test.

I have seen crashes on NetBSD-8 when disconnecting at least one kind of
USB/serial adaptor while it was opened by a program.  So I am not at all
sure that this is an arm problem, but I am of course not sure that it isn't.

One of the things I forgot to mention is that I've seen crashes when
using the FTDI adapter normally -- that is, not unplugging a device held
open by a running app. It would usually go: connect, use, disconnect,
connect, then crash. And it doesn't happen with a Libre Le Potato board
running aarch64, at least at first glance. I'll test more with that to
see if it matters.


The Le Potato board is running:

Kyle:~# uname -a

NetBSD Kyle 9.0_STABLE NetBSD 9.0_STABLE (GENERIC64) #0: Mon May 18 19:07:35 UTC 2020 mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/evbarm/compile/GENERIC64 evbarm

The console output is below. I caused the crash twice to make sure it
was reproducible.

Could someone let me know what the next steps are to troubleshoot this?

Speculating from experience, what happens as devices are removed is
various bits of state are deallocated, and this can result in dangling
pointers from other state if it is not done exactly right.  This is very
tricky to get right.  It is necessary to find out what went wrong and
then it's usually fairly easy to fix.

Sometimes devices set a variable to indicate that they are being torn
down and all other code is supposed to check that and return an error
rather than access anything else.  I would suggest reading the ucom and
uftdi driver code if you are up for it.
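
To make the "being torn down" pattern above concrete, here is a minimal
sketch in plain userland C. It is not the real ucom or uftdi code: the
struct, the field names (dying, refcnt, hw_state) and the functions
demo_open() and demo_detach() are all hypothetical, and a real driver
would also need locking and reference draining around these steps.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; NOT the real ucom softc layout. */
struct demo_softc {
	bool dying;     /* set as soon as detach begins */
	int refcnt;     /* number of successful opens */
	int *hw_state;  /* stands in for state that detach invalidates */
};

/*
 * Open entry point: check the teardown flag before touching any other
 * state, and return an error instead of dereferencing a stale pointer.
 */
int
demo_open(struct demo_softc *sc)
{
	assert(sc != NULL);

	if (sc->dying)
		return EIO;

	*sc->hw_state = 1;	/* safe only because dying was false */
	sc->refcnt++;
	return 0;
}

/*
 * Detach: set the flag first, then invalidate the state.  In a real
 * kernel this ordering must be protected by a lock; that is omitted.
 */
void
demo_detach(struct demo_softc *sc)
{
	assert(sc != NULL);

	sc->dying = true;
	sc->hw_state = NULL;
}
```

In this sketch an open before demo_detach() succeeds, and an open
afterwards returns EIO rather than following the now-NULL pointer. A
crash like the one reported would be consistent with some path that
skips or races such a check, but that is only a guess until the
faulting line is identified.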

You are dropping into ddb.  Rather than c for continue, which you have
established doesn't work :-), do bt for backtrace.  See ddb(4) for more
instructions.  What you are trying to do is find out the very first
instruction that faulted, and then find what source line that
corresponds to.

So do this again, run bt, and see what the address is of the last frame
before the trap.  Or just post the backtrace.   Also explain what kernel
version you are running and where you got it (downloaded from X, built
yourself, date of sources, branch, etc.).

I am somewhat hesitant to believe the traceback after continue, but it
seems that continue after a fault prints the backtrace and reboots.  It
looks like ucomopen+0x58.   I wonder if your terminal program is doing
close/open when it gets an error, and it is catching the device in a
half-closed state.

I would try kermit or cu or something different and see if you can
provoke the crash there.  Or disable any auto close/open behavior, if you
can figure out how, and see if that makes it not crash.  (I don't mean
that there isn't a bug; just the more narrowly we can characterize it,
the easier it is to find.)

[ 185922.2110825] 0x812afc24: netbsd:address_exception_entry+0x5c
[ 185922.2210826] 0x812afc94: netbsd:ucomopen+0x58
[ 185922.2210826] 0x812afd0c: netbsd:spec_open+0x20c
[ 185922.2310842] 0x812afd34: netbsd:VOP_OPEN+0x44
[ 185922.2310842] 0x812afe0c: netbsd:vn_open+0x200
[ 185922.2410844] 0x812afe8c: netbsd:do_open+0xac
[ 185922.2410844] 0x812afec4: netbsd:do_sys_openat+0xa4
[ 185922.2510843] 0x812afeec: netbsd:sys_open+0x38
[ 185922.2510843] 0x812affac: netbsd:syscall+0x12c

So this looks like your minicom program did the open syscall which got
passed down to ucomopen, which faulted at the instruction at offset 0x58.
This needs to be associated with a C line, which needs your specific
kernel information and maybe debug info from it.

Also, if you have a different chipset serial adaptor, it would be useful
to see if that faults, or not.  (Or if anyone else on the list has FTDI
and something else.)

I'll look for a different serial adapter, thanks. And the board that has
the crashes is using:

uname -a: NetBSD xxxx.bigjar.com 9.0_STABLE NetBSD 9.0_STABLE (RPI2) #0: Thu May 21 10:53:23 UTC 2020 mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/evbarm/compile/RPI2 evbarm

I hope that the uname above is enough to identify it, as I can't seem to
find the image file I used. The files on disk are dated 2020/4/21. My
history shows me going to
http://nyftp.netbsd.org/pub/NetBSD-daily/netbsd-9/latest/evbarm-earmv6hf/binary/kernel/
on 2020/5/23.


--

Thanks!

*Jason Mitchell*
bigjar systems
5305 Village Center Drive | Suite 127 | Columbia, MD 21044

www.bigjar.com <http://www.bigjar.com>
p: 443.430.9739 | f: 443-583-0289 | c: 410-921-0272

"THINK GREEN...and print this email only if necessary. Doing the Right
Things to Make a Difference".
