Re: CURRENT broken on Raspberry Pi 2?

To: Nick Hudson <nick.hudson%gmx.co.uk@localhost>
Subject: Re: CURRENT broken on Raspberry Pi 2?
From: Michael van Elst <mlelstv%serpens.de@localhost>
Date: Mon, 11 Feb 2019 20:41:24 +0100

On Mon, Feb 11, 2019 at 05:20:30PM +0000, Nick Hudson wrote:

> > But I got some strange crashes that I tracked to invalid cache handling. The
> > problem might even be older but might have been exposed by the recent changes
> > to the cpu startup.
> 
> What invalid cache handling?

I got crashes in the undefined instruction handler when running
netsurf-gtk (which probes for various instruction sets).

First I saw that the neon handler was installed 4 times because vfp_attach
(in vfp_init.c) is called by every core and each one installs the handler.
This can be fixed with:

Index: vfp/vfp_init.c
===================================================================
RCS file: /cvsroot/src/sys/arch/arm/vfp/vfp_init.c,v
retrieving revision 1.60
diff -p -u -r1.60 vfp_init.c
--- vfp/vfp_init.c      27 Jan 2019 02:08:37 -0000      1.60
+++ vfp/vfp_init.c      11 Feb 2019 19:38:39 -0000
@@ -394,7 +394,7 @@ vfp_attach(struct cpu_info *ci)
        install_coproc_handler(VFP_COPROC, vfp_handler);
        install_coproc_handler(VFP_COPROC2, vfp_handler);
 #ifdef CPU_CORTEX
-       if (cpu_neon_present)
+       if (cpu_neon_present && CPU_IS_PRIMARY(ci))
                install_coproc_handler(CORE_UNKNOWN_HANDLER, neon_handler);
 #endif
 }

But this is unrelated to the crash.
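
As a rough illustration of why the guard matters, here is a toy model of the
attach path. It is not the kernel code: every toy_* name is made up, and
"core 0 is the primary" is an assumption standing in for CPU_IS_PRIMARY(ci).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in NetBSD. */
#define TOY_NCPU 4

struct toy_cpu {
	int index;		/* assumption: index 0 is the primary core */
};

static int toy_neon_installs;	/* how often the handler was installed */

static void
toy_install_neon_handler(void)
{
	toy_neon_installs++;
}

/* Every core runs this; only_primary selects the guarded variant. */
static void
toy_vfp_attach(const struct toy_cpu *ci, bool only_primary)
{
	assert(ci != NULL);
	if (!only_primary || ci->index == 0)
		toy_install_neon_handler();
}

/* Attach all cores and return the number of handler installs. */
static int
toy_attach_all(bool only_primary)
{
	struct toy_cpu cpus[TOY_NCPU];

	toy_neon_installs = 0;
	for (int i = 0; i < TOY_NCPU; i++) {
		cpus[i].index = i;
		toy_vfp_attach(&cpus[i], only_primary);
	}
	return toy_neon_installs;
}
```

Calling toy_attach_all(false) counts one install per core, while
toy_attach_all(true) counts a single install from the primary core.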

The fifth undefined instruction handler was the ddb trap handler db_uh.
It is initialized very early, but at some point its uh_handler member
contains NULL while the predecessor pointer is correct.

I first thought something was trashing the db_uh structure, but the real
cause is that the cache hasn't been written back to memory and is then
invalidated somewhere before the neon handler is added (which writes
the predecessor pointer by adding a second handler).

Flushing the cache after installing the ddb trap handler helps:

Index: db_interface.c
===================================================================
RCS file: /cvsroot/src/sys/arch/arm/arm32/db_interface.c,v
retrieving revision 1.58
diff -p -u -r1.58 db_interface.c
--- db_interface.c      28 May 2018 21:05:00 -0000      1.58
+++ db_interface.c      11 Feb 2019 19:34:56 -0000
@@ -334,6 +334,7 @@ db_machine_init(void)
         */
        db_uh.uh_handler = db_trapper;
        install_coproc_handler_static(CORE_UNKNOWN_HANDLER, &db_uh);
+       cpu_dcache_wbinv_all();
 }
 #endif

That of course does not solve the real problem, which is that the cache
is invalidated later, but I haven't found where this is done.
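
The sequence described above can be replayed in a small model of a single
write-back cache line. This is only a sketch under assumptions: the toy_*
names are hypothetical, ints stand in for the pointers, and the "stray
invalidate" step is the part whose real location is still unknown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the handler record; ints replace the pointers. */
struct toy_uh {
	int handler;	/* 0 plays the role of NULL */
	int prev;	/* predecessor link */
};

/* One write-back cache line in front of one record in "RAM". */
struct toy_cache {
	struct toy_uh ram;	/* what main memory holds */
	struct toy_uh line;	/* cached copy */
	bool valid;
	bool dirty;
};

/* Load the line from RAM if it is not currently cached. */
static void
toy_fill(struct toy_cache *c)
{
	if (!c->valid) {
		c->line = c->ram;
		c->valid = true;
		c->dirty = false;
	}
}

static void
toy_write_handler(struct toy_cache *c, int v)
{
	toy_fill(c);
	c->line.handler = v;
	c->dirty = true;
}

static void
toy_write_prev(struct toy_cache *c, int v)
{
	toy_fill(c);
	c->line.prev = v;
	c->dirty = true;
}

/* Invalidate without write-back: dirty data is silently dropped. */
static void
toy_inv(struct toy_cache *c)
{
	c->valid = false;
	c->dirty = false;
}

/* Write back dirty data, then invalidate. */
static void
toy_wbinv(struct toy_cache *c)
{
	if (c->valid && c->dirty)
		c->ram = c->line;
	toy_inv(c);
}

/* Replay the sequence; flush_early models the added flush. */
static struct toy_uh
toy_replay(bool flush_early)
{
	struct toy_cache c = { .ram = { 0, 0 }, .valid = false, .dirty = false };

	assert(!c.valid);
	toy_write_handler(&c, 1);	/* handler set, still only in cache */
	if (flush_early)
		toy_wbinv(&c);		/* the workaround */
	toy_inv(&c);			/* assumed stray invalidate */
	toy_write_prev(&c, 2);		/* second handler links itself in */
	toy_fill(&c);
	return c.line;
}
```

Without the early flush the record ends up with handler 0 and a valid prev
link, which matches the symptom; with the flush the handler value survives.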

One peculiarity of this issue is that I could only trigger the crash
when a specific 'hat' (SPI TFT display) was installed. The only effect
of the hat should be that the firmware augments the FDT for the GPIO usage.
I have no idea why this influences the cache write-back.

Greetings,
-- 
                                Michael van Elst
Internet: mlelstv%serpens.de@localhost
                                "A potential Snark may lurk in every tree."
