NetBSD-Bugs archive


port-evbarm/57614: Kernel panic rebooting a Pine64 RockPro64 with netbsd10.0_beta; successful after panic reboot.



>Number:         57614
>Category:       port-evbarm
>Synopsis:       Kernel panic rebooting a Pine64 RockPro64 with netbsd10.0_beta; successful after panic reboot.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-evbarm-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Sep 10 22:05:00 +0000 2023
>Originator:     Joel P.
>Release:        NetBSD 10.0_BETA
>Organization:
(individual)
>Environment:
rock64# uname -a
NetBSD rock64 10.0_BETA NetBSD 10.0_BETA (ROCKPRO64) #0: Sat Sep  2 11:44:00 PDT 2023  joelp%cow.labdatum.net@localhost:/home/joelp/src-netbsd10/obj/sys/arch/evbarm/compile/ROCKPRO64 evbarm
>Description:
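A cold power-on or a requested reboot results in a kernel panic. Once the panic has rebooted the machine (or after a cont command in ddb), the boot completes after fsck. I have a serial console attached.

There are two different panics: #1 faults in usb_syncmem(), #2 faults in usb_allocmem(). Both are reads of address 0 (far=0) through a zero base register (x5 in #1, x2 in #2), and both happen in usb_event_thread() while a new USB device is being explored.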

#1
Starting devpubd.
[   4.5020523] panic: Trap: Data Abort (EL1): Translation Fault L0 with read access for 0000000000000000: pc ffffc0000018c6b4: ldp x0, x1, [x5]

[   4.5020523] cpu3: Begin traceback...
[   4.5020523] trace fp ffffc001557776c0
[   4.5020523] fp ffffc001557776f0 vpanic() at ffffc00000590398 netbsd:vpanic+0x178
[   4.5020523] fp ffffc00155777750 panic() at ffffc000005904a4 netbsd:panic+0x44
[   4.5120594] fp ffffc001557777e0 data_abort_handler() at ffffc000000b599c netbsd:data_abort_handler+0x1ec
[   4.5120594] tf ffffc00155777850 el1_trap() at ffffc000000b6f84 netbsd:el1_vectors+0x784
[   4.5120594] ---- Data Abort (EL1): trapframe 0xffffc00155777850 (304 bytes) ----
[   4.5120594]     pc=ffffc0000018c6b4,   spsr=0000000060000005
[   4.5120594]    esr=0000000096000004,    far=0000000000000000
[   4.5120594]     x0=ffffc0016665f7c0,     x1=0000000000000000
[   4.5120594]     x2=0000000000000000,     x3=0000000000000010
[   4.5120594]     x4=0000000000000005,     x5=0000000000000000
[   4.5120594]     x6=0000000000002000,     x7=ffff00000408f700
[   4.5220548]     x8=0000000000001f80,     x9=0000000000000008
[   4.5220548]    x10=ffffc000000abbd4,    x11=000000000322bf00
[   4.5220548]    x12=fffffc00000b4b0c,    x13=fffffc00000b4b11
[   4.5220548]    x14=0000000000000020,    x15=ffff0000030f15f8
[   4.5220548]    x16=ffffc000000abbd4,    x17=0000f51dd8e2c374
[   4.5220548]    x18=0000000000001000,    x19=ffffc0004029c000
[   4.5220548]    x20=ffff000002d2c2e0,    x21=ffffc0016665f790
[   4.5220548]    x22=ffffc0016665f6b0,    x23=ffff000004442c80
[   4.5220548]    x24=ffffc0016665f720,    x25=ffff0000004aff40
[   4.5220548]    x26=ffffc0016665f640,    x27=00000000ffffff80
[   4.5220548]    x28=ffff000004442d28, fp=x29=ffffc00155777b80
[   4.5220548] lr=x30=ffffc0000022cfe4,     sp=ffffc00155777b80
[   4.5220548] ------------------------------------------------
[   4.5220548] fp ffffc00155777b80 usb_syncmem() at ffffc0000018c6b4 netbsd:usb_syncmem+0x20
[   4.5320546] fp ffffc00155777c00 usbd_transfer() at ffffc00000184ef8 netbsd:usbd_transfer+0x218
[   4.5320546] fp ffffc00155777c40 usbd_do_request_len() at ffffc00000186978 netbsd:usbd_do_request_len+0xd8
[   4.5320546] fp ffffc00155777cb0 usbd_get_initial_ddesc() at ffffc00000187494 netbsd:usbd_get_initial_ddesc+0x34
[   4.5320546] fp ffffc00155777d20 usbd_new_device() at ffffc0000018b284 netbsd:usbd_new_device+0x214
[   4.5420541] fp ffffc00155777dc0 uhub_explore() at ffffc0000018f6dc netbsd:uhub_explore+0x4fc
[   4.5420541] fp ffffc00155777e70 usb_discover() at ffffc0000018260c netbsd:usb_discover+0x5c
[   4.5520554] fp ffffc00155777ea0 usb_event_thread() at ffffc00000182a4c netbsd:usb_event_thread+0x7c
[   4.5520554] tf ffffc00155777ed0 el0_trap() at ffffc000000b6ff0 netbsd:el1_trap_exit+0x68
[   4.5520554] cpu3: End traceback...
Stopped in pid 0.210 (system) at        netbsd:cpu_Debugger+0xc:        ldp x29, x30, [sp],#16
db{3}>
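
The esr value in the trap frame above (0x96000004) can be decoded by hand to check it against the panic message. The following is a minimal sketch, assuming the standard ARMv8-A ESR_EL1 field layout (EC in bits 31:26, IL in bit 25, and for data aborts WnR in bit 6 and DFSC in bits 5:0). The function name decode_esr and the lookup tables are illustrative only and are not taken from the NetBSD sources.

```python
# Decode an AArch64 ESR_EL1 value for a data abort.
# Assumption: standard ARMv8-A field layout; only the codes relevant
# to this report are named, everything else is reported as raw hex.

EC_NAMES = {
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

DFSC_NAMES = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
}


def decode_esr(esr):
    """Split an ESR value into the fields that matter for a data abort."""
    ec = (esr >> 26) & 0x3F      # exception class
    il = (esr >> 25) & 0x1       # 1 = 32-bit instruction
    iss = esr & 0x1FFFFFF        # instruction specific syndrome
    wnr = (iss >> 6) & 0x1       # 1 = write, 0 = read
    dfsc = iss & 0x3F            # data fault status code
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, hex(ec)),
        "il": il,
        "access": "write" if wnr else "read",
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, hex(dfsc)),
    }


if __name__ == "__main__":
    # esr= value copied from the trap frame in this report
    for key, value in decode_esr(0x96000004).items():
        print(f"{key:10} {value}")
```

For 0x96000004 this yields EC 0x25, a read access, and DFSC 0x04 (translation fault, level 0), which agrees with the "Data Abort (EL1): Translation Fault L0 with read access" text printed by the kernel.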

#2
IPv6 mode: host
Configuring network interfaces:.
Adding interface aliases:.
Waiting for duplicate address detection to finish...
Starting dhcpcd.
[   4.8321866] panic: Trap: Data Abort (EL1): Translation Fault L0 with read access for 0000000000000000: pc ffffc0000018c1b0: ldr x5, [x2]

[   4.8321866] cpu5: Begin traceback...
[   4.8321866] trace fp ffffc001557e9750
[   4.8321866] fp ffffc001557e9780 vpanic() at ffffc00000590398 netbsd:vpanic+0x178
[   4.8321866] fp ffffc001557e97e0 panic() at ffffc000005904a4 netbsd:panic+0x44
[   4.8321866] fp ffffc001557e9870 data_abort_handler() at ffffc000000b599c netbsd:data_abort_handler+0x1ec
[   4.8321866] tf ffffc001557e98e0 el1_trap() at ffffc000000b6f84 netbsd:el1_vectors+0x784
[   4.8321866] ---- Data Abort (EL1): trapframe 0xffffc001557e98e0 (304 bytes) ----
[   4.8321866]     pc=ffffc0000018c1b0,   spsr=0000000020000005
[   4.8421934]    esr=0000000096000004,    far=0000000000000000
[   4.8421934]     x0=0000000000000000,     x1=ffffc00166678f00
[   4.8421934]     x2=0000000000000000,     x3=0000000000000000
[   4.8421934]     x4=ffff000003705d28,     x5=ffff0000f4b73b80
[   4.8421934]     x6=0000000000000000,     x7=ffffffffffffffff
[   4.8421934]     x8=00000000000010c0,     x9=0000000000000010
[   4.8421934]    x10=ffffc000000abbd4,    x11=0000000007f444c8
[   4.8421934]    x12=0000000007f29f58,    x13=0000000007f29f48
[   4.8421934]    x14=0000000007f29f50,    x15=0000000007f29f28
[   4.8421934]    x16=ffffc000000abbd4,    x17=0000fd9d4266c244
[   4.8421934]    x18=0000000007f29000,    x19=ffffc00000f0ee20
[   4.8421934]    x20=ffff0000f4b73b80,    x21=0000000000000008
[   4.8421934]    x22=ffffc00001315b80,    x23=ffff000003705d28
[   4.8421934]    x24=0000000000000002,    x25=0000000000000000
[   4.8421934]    x26=0000000000000002,    x27=0000000000031128
[   4.8421934]    x28=0000000000000000, fp=x29=ffffc001557e9c10
[   4.8421934] lr=x30=ffffc0000018c198,     sp=ffffc001557e9c10
[   4.8421934] ------------------------------------------------
[   4.8421934] fp ffffc001557e9c10 usb_allocmem() at ffffc0000018c1b0 netbsd:usb_allocmem+0x7c
[   4.8521885] fp ffffc001557e9c70 ohci_open() at ffffc0000022c418 netbsd:ohci_open+0x198
[   4.8521885] fp ffffc001557e9cd0 usbd_setup_pipe_flags() at ffffc0000018a9e0 netbsd:usbd_setup_pipe_flags+0xe0
[   4.8521885] fp ffffc001557e9d20 usbd_new_device() at ffffc0000018b3ac netbsd:usbd_new_device+0x33c
[   4.8521885] fp ffffc001557e9dc0 uhub_explore() at ffffc0000018f6dc netbsd:uhub_explore+0x4fc
[   4.8521885] fp ffffc001557e9e70 usb_discover() at ffffc0000018260c netbsd:usb_discover+0x5c
[   4.8621872] fp ffffc001557e9ea0 usb_event_thread() at ffffc00000182a4c netbsd:usb_event_thread+0x7c
[   4.8621872] tf ffffc001557e9ed0 el0_trap() at ffffc000000b6ff0 netbsd:el1_trap_exit+0x68
[   4.8621872] cpu5: End traceback...
[   4.8621872] rebooting...
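
To see where the two panics share a code path, the traceback lines can be parsed and compared. The following is a rough sketch, assuming the frame line format shown in the two traces above. The helper names parse_frames and common_callers are hypothetical, and the sample data is the set of post-trap frames copied from this report.

```python
import re

# Matches lines such as:
# "fp ffffc00155777b80 usb_syncmem() at ffffc0000018c6b4 netbsd:usb_syncmem+0x20"
FRAME_RE = re.compile(
    r"\b(?:fp|tf) [0-9a-f]+ \S+ at ([0-9a-f]+) netbsd:(\w+)\+0x([0-9a-f]+)"
)

TRACE1 = """\
[   4.5220548] fp ffffc00155777b80 usb_syncmem() at ffffc0000018c6b4 netbsd:usb_syncmem+0x20
[   4.5320546] fp ffffc00155777c00 usbd_transfer() at ffffc00000184ef8 netbsd:usbd_transfer+0x218
[   4.5320546] fp ffffc00155777c40 usbd_do_request_len() at ffffc00000186978 netbsd:usbd_do_request_len+0xd8
[   4.5320546] fp ffffc00155777cb0 usbd_get_initial_ddesc() at ffffc00000187494 netbsd:usbd_get_initial_ddesc+0x34
[   4.5320546] fp ffffc00155777d20 usbd_new_device() at ffffc0000018b284 netbsd:usbd_new_device+0x214
[   4.5420541] fp ffffc00155777dc0 uhub_explore() at ffffc0000018f6dc netbsd:uhub_explore+0x4fc
[   4.5420541] fp ffffc00155777e70 usb_discover() at ffffc0000018260c netbsd:usb_discover+0x5c
[   4.5520554] fp ffffc00155777ea0 usb_event_thread() at ffffc00000182a4c netbsd:usb_event_thread+0x7c
"""

TRACE2 = """\
[   4.8421934] fp ffffc001557e9c10 usb_allocmem() at ffffc0000018c1b0 netbsd:usb_allocmem+0x7c
[   4.8521885] fp ffffc001557e9c70 ohci_open() at ffffc0000022c418 netbsd:ohci_open+0x198
[   4.8521885] fp ffffc001557e9cd0 usbd_setup_pipe_flags() at ffffc0000018a9e0 netbsd:usbd_setup_pipe_flags+0xe0
[   4.8521885] fp ffffc001557e9d20 usbd_new_device() at ffffc0000018b3ac netbsd:usbd_new_device+0x33c
[   4.8521885] fp ffffc001557e9dc0 uhub_explore() at ffffc0000018f6dc netbsd:uhub_explore+0x4fc
[   4.8521885] fp ffffc001557e9e70 usb_discover() at ffffc0000018260c netbsd:usb_discover+0x5c
[   4.8621872] fp ffffc001557e9ea0 usb_event_thread() at ffffc00000182a4c netbsd:usb_event_thread+0x7c
"""


def parse_frames(text):
    """Return (symbol, offset, symbol_start) for each frame, innermost first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            pc = int(m.group(1), 16)
            offset = int(m.group(3), 16)
            frames.append((m.group(2), offset, pc - offset))
    return frames


def common_callers(frames_a, frames_b):
    """Return the outermost run of symbols shared by both call chains."""
    names_a = [f[0] for f in frames_a]
    names_b = [f[0] for f in frames_b]
    shared = []
    while names_a and names_b and names_a[-1] == names_b[-1]:
        shared.append(names_a.pop())
        names_b.pop()
    return shared[::-1]


if __name__ == "__main__":
    print(common_callers(parse_frames(TRACE1), parse_frames(TRACE2)))
```

On the frames above, the shared callers are usbd_new_device, uhub_explore, usb_discover and usb_event_thread. The two panics diverge inside usbd_new_device() (offsets +0x214 and +0x33c).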


First few lines of dmesg:
[     1.000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
[     1.000000]     2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
[     1.000000]     2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023
[     1.000000]     The NetBSD Foundation, Inc.  All rights reserved.
[     1.000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[     1.000000]     The Regents of the University of California.  All rights reserved.

[     1.000000] NetBSD 10.0_BETA (ROCKPRO64) #0: Sat Sep  2 11:44:00 PDT 2023
[     1.000000]         joelp@cow:/home/joelp/src-netbsd10/obj/sys/arch/evbarm/compile/ROCKPRO64
[     1.000000] total memory = 3914 MB
[     1.000000] avail memory = 3772 MB
>How-To-Repeat:
Use a Pine64 RockPro64 (4GB) running a recent netbsd-10 build (mine is from Sep 02).

With a serial console or an HDMI monitor attached, you can see the kernel panic on every other boot. It happens both on a cold power-on (from full power off) and on a requested reboot, for example shutdown -r now.

After the panic, the machine either reboots automatically or stops in the debugger. After the automatic reboot, or after a cont command in the debugger, the kernel boots successfully and the system runs correctly.

I only noticed the panic recently (boots just seemed to take longer), so unfortunately I can't say which build first showed it.

Thanks - Joel
>Fix:
No known fix with recent kernel builds as of Sep 10.

Is it possible that the boot following the panic pauses long enough during fsck to avoid a race condition that causes the panic?


