Subject: Re: SIR Reset with todays sources
To: Tobias Nygren <tnn+nbsd@nygren.pp.se>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: port-sparc64
Date: 03/25/2007 18:09:45
On Sat, Mar 24, 2007 at 10:28:15PM +0100, Tobias Nygren wrote:
> Juergen Hannken-Illjes wrote:
> 
> [...]
> >At least `curcpu() == CPUINFO_VA->ci_self' is bogus.
> >0x13f2d20 is at pmap::ctx_free
> >   
> Hi everyone,
> 
> I suspect these SIRs are caused by an as-of-yet unknown concurrency
> problem in the pmap module. This patch is mindboggling and might be
> special magic that only works on my setup, but I haven't had any SIR
> resets so far. Without this I don't even get past fsck.
> 
> Index: pmap.c
> ===================================================================
> RCS file: /cvsroot/src/sys/arch/sparc64/sparc64/pmap.c,v
> retrieving revision 1.187
> diff -u -r1.187 pmap.c
> --- pmap.c	12 Mar 2007 18:18:28 -0000	1.187
> +++ pmap.c	24 Mar 2007 21:19:04 -0000
> @@ -1846,6 +1846,7 @@
> 	if (pm == pmap_kernel()) {
> 		return;
> 	}
> +	DELAY(20000);
> 	stxa(CTX_SECONDARY, ASI_DMMU, 0);
> 	pm->pm_refs = 0;
> 	ctx_free(pm);

Same here.  With this delay the machine comes up to multi-user.

Digging around, I found this note in Sun document 802-7220-02,
"UltraSPARC User's Manual UltraSPARC-I UltraSPARC-II":

    6.9.3 Context Registers
    ...
    Note: A STXA to the context registers requires either a MEMBAR #Sync,
    FLUSH, DONE, or RETRY before the point that the effect must be visible
    to data accesses. Either a FLUSH, DONE, or RETRY is needed before the
    point that the effect must be visible to instruction accesses:
    MEMBAR #Sync is not sufficient. In either case, one of these instructions
    must be executed before the next translating or bypass store or load of
    any type. This is necessary to avoid corrupting data. 

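Does it apply here?  In the patched code above, the stxa to
CTX_SECONDARY is followed by the store to pm->pm_refs, with none of
these instructions in between as far as I can see.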
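
To make the ordering from the manual's note concrete, here is a minimal,
untested sketch that puts a barrier between the context register write
and the next store, in place of the DELAY().  It is a portable stand-in
that only records the order of operations and is not kernel code.
Everything prefixed sketch_ or SKETCH_ is hypothetical, and the register
and ASI values are placeholders, not taken from the headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operations the sketch can record, in program order. */
enum op { OP_STXA, OP_MEMBAR_SYNC, OP_STORE, OP_CTX_FREE };

#define TRACE_MAX 8
static enum op trace[TRACE_MAX];
static size_t trace_len;

static void
record(enum op o)
{
	assert(trace_len < TRACE_MAX);	/* the sketch records only a few ops */
	trace[trace_len++] = o;
}

/* Placeholder values, NOT copied from the sparc64 headers. */
#define SKETCH_CTX_SECONDARY	0x10
#define SKETCH_ASI_DMMU		0x58

/* Hypothetical stand-in for the kernel's stxa(). */
static void
sketch_stxa(uint64_t reg, int asi, uint64_t val)
{
	(void)reg; (void)asi; (void)val;
	record(OP_STXA);
}

/* Hypothetical stand-in for executing "membar #Sync". */
static void
sketch_membar_sync(void)
{
	record(OP_MEMBAR_SYNC);
}

/* Reduced pmap with only the field the sketch touches. */
struct sketch_pmap {
	int pm_refs;
};

/* Hypothetical stand-in for ctx_free(). */
static void
sketch_ctx_free(struct sketch_pmap *pm)
{
	(void)pm;
	record(OP_CTX_FREE);
}

/* Same sequence as in the patch, with a barrier instead of DELAY(). */
static void
sketch_release_context(struct sketch_pmap *pm)
{
	sketch_stxa(SKETCH_CTX_SECONDARY, SKETCH_ASI_DMMU, 0);
	sketch_membar_sync();	/* must come before the next load or store */
	pm->pm_refs = 0;
	record(OP_STORE);
	sketch_ctx_free(pm);
}
```

Whether MEMBAR #Sync is enough at this point, or whether a FLUSH is
needed because instruction accesses are affected too, is something I
have not verified.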
-- 
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)