NetBSD-Bugs archive
Re: kern/57580: evbarm/earmv7hf RPI2 scheduler/cpu stall
The following reply was made to PR kern/57580; it has been noted by GNATS.
From: Frank Kardel <kardel%netbsd.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: kern/57580: evbarm/earmv7hf RPI2 scheduler/cpu stall
Date: Thu, 17 Aug 2023 07:45:58 +0200
So I had partial success with DDB.
Stopped in pid 0.5 (system) at netbsd:cpu_Debugger+0x4: bx r14
db{0}> mach cpu 1
kdb_trap: switching to cpu1
[ 85689.2095237] Mutex error: mutex_vector_enter,516: assertion failed: !cpu_intr_p()
[ 85689.2095237] lock address : 913d3940
[ 85689.2095237] current cpu : 1
[ 85689.2095237] current lwp : 0x00000000914c3400
[ 85689.2095237] owner field : 000000000000000000 wait/spin: 0/0
[ 85689.2095237] panic: lock error: Mutex: mutex_vector_enter,516: assertion failed: !cpu_intr_p(): lock 0x913d3940 cpu 1 lwp 0x914c3400
[ 85689.2095237] cpu1: Begin traceback...
[ 85689.2095237] 0x80cb3b44: netbsd:db_panic+0x14
[ 85689.2095237] 0x80cb3b64: netbsd:vpanic+0x114
[ 85689.2095237] 0x80cb3b7c: netbsd:panic+0x24
[ 85689.2095237] 0x80cb3c44: netbsd:lockdebug_abort+0xe8
[ 85689.2095237] 0x80cb3c5c: netbsd:mutex_abort+0x30
[ 85689.2095237] 0x80cb3cc4: netbsd:mutex_enter+0x48c
[ 85689.2095237] 0x80cb3cdc: netbsd:usbd_set_polling+0x34
[ 85689.2095237] 0x80cb3cfc: netbsd:ukbd_cnpollc+0x5c
[ 85689.2095237] 0x80cb3d14: netbsd:wsdisplay_pollc+0x60
[ 85689.2095237] 0x80cb3d2c: netbsd:cnpollc+0x4c
[ 85689.2095237] 0x80cb3dbc: netbsd:kdb_trap+0x19c
[ 85689.2095237] 0x80cb3dcc: netbsd:pic_ipi_ddb+0x18
[ 85689.2095237] 0x80cb3df4: netbsd:bcm2836mp_ipi_handler+0x11c
[ 85689.2095237] 0x80cb3e44: netbsd:pic_dispatch+0x54
[ 85689.2095237] 0x80cb3ecc: netbsd:pic_do_pending_ints+0x434
[ 85689.2095237] 0x80cb3f34: netbsd:irq_idle_entry+0x38
[ 85689.2095237] 0x80cb3f94: netbsd:idle_loop+0x1b8
[ 85689.2095237] cpu1: End traceback...
[ 85689.2095237] dump to dev 92,1 not possible
[ 85689.2095237] rebooting...
Additionally, an attempt to get the stacks from crash(8) was unlucky and got stuck.
The currently runnable processes from ps are:
 UID   PID PPID    CPU PRI NI    VSZ   RSS WCHAN  STAT TTY     TIME COMMAND
   0     0    0      0 222  0      0  8724 -      RKl  ?   32:48.28 [system]
1002   585 3387      0  85  0 164508  3084 -      Rs   ?    0:00.14 postgres: logical replication launcher
1002   588 3387      0  85  0 164508  3268 -      Rs   ?    0:02.57 postgres: autovacuum launcher
1005   594    1      0  85  0 100024 74656 -      R    ?   37:17.08 /usr/bin/perl /usr/pkg/fhem/fhem.pl /usr/pkg/fhem/fhem.cfg
   0   736    1      0  85  0  15780  4604 -      Rs   ?    0:13.95 /usr/pkg/bin/perl -wT /usr/pkg/sbin/munin-node
   0   969    1      0  85  0  13716  2336 -      Rs   ?    0:01.12 /usr/libexec/postfix/master -w
   0  1249    1      0  85  0   6396  1444 -      Rs   ?    0:01.19 /usr/sbin/cron
   0  1297 1296      0  85  0   7156  1896 select Is   ?    0:00.13 SCREEN -R (screen)
   0  1454    1      0  85  0   9744  1684 -      Rs   ?    1:13.53 /usr/sbin/syslogd -s
   0  1646    1   7048  85  0 112428 10004 -      Rsl  ?    2:08.08 /usr/sbin/named
1002  1739 3387      0  85  0 163884  2564 -      Rs   ?    0:01.31 postgres: walwriter
1001  1882    1  42491  85  0  44568  3800 -      Rsl  ?    0:00.63 /usr/pkg/sbin/zebra -P 0 -d
1008  3288    1 102890  85  0 187872 39556 -      Rsl  ?    1:19.25 /usr/pkg/bin/node /usr/pkg/zigbee2mqtt/index.js
1002  3387    1      0  85  0 163600  3988 -      Rs   ?    0:00.57 /usr/pkg/bin/postgres -D /usr/pkg/pgsql/data
   0 11982 8414      0   0  0  11476  3564 -      Rs   ?    0:07.43 (perl)
As HEARTBEAT does not seem to be supported in NetBSD-10, I will try to
run a -current kernel and see how far I get.
On 08/13/23 14:00, Taylor R Campbell wrote:
> Can you enter ddb in this state (not crash(8) -- use C-A-ESC in wskbd
> or break at serial console or (set and) type the hw.cnmagic sequence),
> and do `mach cpu 1', and then `bt'? Once you get output, you can do
> `continue' to return from ddb.
>
> If not, can you try enabling `options HEARTBEAT' and `options
> HEARTBEAT_MAX_PERIOD=15' in our kernel config and see if you get any
> diagnostics out of that?