Port-mips archive


Re: Anyone working on Octeon SMP?



On Monday, 13 April 2026 12:58:34 EDT Kevin Bowling wrote:
> On Fri, Apr 10, 2026 at 11:45 PM Nick Hudson <nick.hudson%gmx.co.uk@localhost> wrote:
> > On 10/04/2026 11:18, Kevin Bowling wrote:
> > > On Mon, Apr 6, 2026 at 6:42 AM Nick Hudson <nick.hudson%gmx.co.uk@localhost> wrote:
> > >> On 06/04/2026 10:49, Kevin Bowling wrote:
> > >>> On Sat, Mar 28, 2026 at 8:28 AM Andrew Parker <andrew%pmk1.net@localhost> wrote:
> > >>>> On Thursday, 19 March 2026 23:31:36 EDT Kevin Bowling wrote:
> > >>>>> On Sun, Feb 2, 2025 at 6:52 AM Andrew Parker <andrew%pmk1.net@localhost> wrote:
> > >>>>>> On 1/29/25 06:23, Nick Hudson wrote:
> > >>>>>>> On 20/12/2024 22:01, Andrew Parker wrote:
> > >>>>>>>> Hi, I've been working towards getting a MULTIPROCESSOR build more
> > >>>>>>>> stable on my ER4 and would like to know if anyone else may be
> > >>>>>>>> working on the same?
> > >>>>>>>> 
> > >>>>>>>> I've found a few areas that were causing instability on my machine
> > >>>>>>>> and have some patches that help (but mostly just for debugging at
> > >>>>>>>> this point). If there's any interest in teaming up and exchanging
> > >>>>>>>> ideas or patches for SMP on Octeon please let me know.
> > >>>>>>> 
> > >>>>>>> Sure.
> > >>>>>>> 
> > >>>>>>> I've dropped the ball on this and said I had a couple of fixes in
> > >>>>>>> mind, but done nothing to share them. Hopefully we can make it
> > >>>>>>> stable.
> > >>>>>>> 
> > >>>>>>> Nick
> > >>>>>> 
> > >>>>>> Great!  I'm curious about what you have in mind for improvement.
> > >>>>>> I've mostly been looking around TLB invalidation and perhaps a
> > >>>>>> missing memory barrier.
> > >>>>>> 
> > >>>>>> Anyway, I'll work on getting some stuff cleaned up and contact you
> > >>>>>> directly if that works for you.
> > >>>>> 
> > >>>>> I'm interested in this as well, is there anything to share around
> > >>>>> current status or issues as well as if anything is pending out of
> > >>>>> tree?
> > >>>> 
> > >>>> I hoped to spend more time on this over the winter but ended up
> > >>>> moving, and some of my networking gear is still packed up.
> > >>>> 
> > >>>> It's been a slow process getting my test environment set up again but
> > >>>> it would be great to pick this back up...especially if there's
> > >>>> continued interest in it.
> > >>>> 
> > >>>> Give me a week or two to see what I can dig up and I'll be in touch.
> > >>> 
> > >>> I've made some progress but it turned into a much deeper hole than I
> > >>> was anticipating.  I can increase SMP stability with some changes in
> > >>> pmap_tlb.c to add some icache syncs
> > >> 
> > >> I have a change I'll commit soon to handle EXECness (more) correctly.
> > 
> > This is committed
> > 
> > https://mail-index.netbsd.org/source-changes/2026/04/10/msg161522.html
> >
> > >> Am I right in thinking at least some Octeon processors' icaches are
> > >> VIVT?  I've forgotten most of the mips stuff I knew... If so there
> > >> will be more flushing required for VIVT.
> > > 
> > > VIPT.  But it has an assortment of fun issues.
> > 
> > fun issues?
> 
> It seems to require pretty deliberate management that is different
> than other MIPS and archs.
> 
> I was able to get SMP stabilized to my own satisfaction last night,
> insofar as it can hold up under cnmac driver modifications, which is
> what I originally started with.
> 
> There are a few categories of mandatory fixes:  membar_release is
> missing a SYNC_PLUNGER (second syncw), INT_MASKs are critically missing
> in octeon_intr.c, and there are a variety of locore fixups.  Beyond that
> I ended up redoing octeon_intr.c to distribute interrupts and fix my
> previous Octeon III patch.  The changes I am least confident about are
> primarily in the PMAP_TLB_NEED_SHOOTDOWN path.  I'll try to organize
> everything into a more deliberate patch series; right now it is pretty
> messy with attempts and debugging.

I was finally able to boot up my ER-4 last night.  I don't have anything to add 
(yet) to the pmap discussion here, except that I'm guessing I'm in a similar 
place to you with it. Things are 'stable' mostly through a bunch of extra TLB 
flushes.

The other change that resulted in a more stable userland for me addresses what 
appears to be a missing memory barrier around cpu_lwp_setprivate().  The one I 
added in sys_lwp.c doesn't seem 100% correct, but this does resolve instability 
that's easily reproduced in unbound (using multiple threads) and occasionally 
in sshd:

diff --git a/sys/arch/mips/mips/cpu_subr.c b/sys/arch/mips/mips/cpu_subr.c
index a80304908774..df854cae4254 100644
--- a/sys/arch/mips/mips/cpu_subr.c
+++ b/sys/arch/mips/mips/cpu_subr.c
@@ -1051,11 +1051,11 @@ cpu_vmspace_exec(lwp_t *l, vaddr_t start, vaddr_t end)
 int
 cpu_lwp_setprivate(lwp_t *l, void *v)
 {
-
 #if (MIPS32R2 + MIPS64R2) > 0
        if (l == curlwp && MIPS_HAS_USERLOCAL) {
                mipsNN_cp0_userlocal_write(v);
        }
+       membar_sync();
 #endif
        return 0;
 }
diff --git a/sys/kern/sys_lwp.c b/sys/kern/sys_lwp.c
index 7c4e4f27ad23..24cc3315f3e4 100644
--- a/sys/kern/sys_lwp.c
+++ b/sys/kern/sys_lwp.c
@@ -187,7 +187,7 @@ sys__lwp_self(struct lwp *l, const void *v, register_t *retval)
 int
 sys__lwp_getprivate(struct lwp *l, const void *v, register_t *retval)
 {
-
+        membar_sync();
        *retval = (uintptr_t)l->l_private;
        return 0;
 }
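
For anyone following along who wants to see the ordering that diff is 
reaching for, here is a minimal user-space sketch in portable C11.  It is not 
NetBSD code: struct demo_lwp, demo_setprivate() and demo_getprivate() are 
made-up names, and atomic_thread_fence(memory_order_seq_cst) is only assumed 
to play roughly the role membar_sync() plays in the kernel.  Whether a full 
barrier is really the right fix here is still an open question.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the lwp private pointer; not the kernel struct. */
struct demo_lwp {
	_Atomic(void *) l_private;
};

/* Writer side: store the pointer, then issue a full fence, mirroring the
 * placement of membar_sync() after the write in the cpu_subr.c hunk. */
static void
demo_setprivate(struct demo_lwp *l, void *v)
{
	atomic_store_explicit(&l->l_private, v, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/* Reader side: full fence first, then load, mirroring the placement of
 * membar_sync() before the read in the sys_lwp.c hunk. */
static void *
demo_getprivate(struct demo_lwp *l)
{
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&l->l_private, memory_order_relaxed);
}
```

In a real test the two functions would run on different CPUs; called back to 
back on one thread they simply round-trip the pointer, which is enough to 
check that the sketch compiles and behaves.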

Now that I have my testing environment mostly ready I'm happy to help test any 
patches.

> > >>> and guarding around an assert that
> > >>> is easy to trigger
> > >>> @@ -735,7 +767,9 @@ pmap_tlb_shootdown_bystanders(pmap_t pm)
> > >>> 
> > >>>                         * And best of all, we avoid an IPI.
> > >>>                         */
> > >>>                        
> > >>>                        KASSERT(!kernel_p);
> > >>> 
> > >>> -                       pmap_tlb_pai_reset(ti, pai, pm);
> > >>> +                       if (pai->pai_asid > KERNEL_PID) {
> > >>> +                               pmap_tlb_pai_reset(ti, pai, pm);
> > >>> +                       }
> > >> 
> > >> you mean this KASSERT in pmap_tlb_pai_reset?
> > > 
> > > Yeah.  But let me think harder about this.
> > > 
> > >>    /*
> > >>     * We must have an ASID but it must not be onproc (on a processor).
> > >>     */
> > >>    KASSERT(pai->pai_asid > KERNEL_PID);
> > >>> 
> > >>> I think there are a variety of pmap changes needed.  But I wonder if
> > >>> any MIPS SMP or other PMAP_TLB_NEED_SHOOTDOWN has been heavily
> > >>> exercised?
> > >> 
> > >> Almost certainly not.
> > > 
> > > Good to know, I was a bit shy about looking at the MI pmap at first.
> > > 
> > >>> IPI and TLB stuff is probably "simpler" on ARM.  On
> > >>> FreeBSD (13), OpenBSD, Linux the MIPS TLB shootdowns are synchronous.
> > >> 
> > >> No other NetBSD architecture (not sure of powerpc booke status
> > >> actually) uses the PMAP_TLB_NEED_SHOOTDOWN stuff.
> > >> 
> > >> I'm about to switch aarch64 to sys/uvm/pmap which uses architecture
> > >> defined broadcast TLB operations. RISC-V uses SBI remote fence
> > >> operations.
> > > 
> > > Arch defined broadcast sounds very suitable for what we'll need to do
> > > here.  Is that out of tree?
> > 
> > Arm architecture defined the broadcast TLB operations long ago. Others
> > have caught up...
> > 
> > MI PMAP uses tlb_* functions
> > https://nxr.netbsd.org/xref/src/sys/arch/aarch64/aarch64/aarch64_tlb.c#84
> > https://nxr.netbsd.org/xref/src/sys/arch/aarch64/aarch64/cpufunc_asm_armv8.S#221


