tech-kern archive
Re: TLB tiredown by ASID bump
On Jan 5, 2011, at 9:36 PM, Toru Nishimura wrote:
> Matt Thomas made a comment:
>
>> The ASID generational stuff has a downside in that valid entries will be
>> thrown away. For mips (and booke) I use
>> a different algorithm which eliminates the overhead of
>> discarding all the TLB entries when you run out of ASIDs.
>
> It's a good move to pursue efficient ASID management
> schemes, since this is the key area for runtime VM/TLB activity.
>
> Matt points out that losing valid entries is a problem when the
> ASID generation gets bumped. I think, however, the tbia()
> operation (discard all entries except global or locked ones) can
> be forgiven for discarding "live entries", since TLB sizes are
> still small enough. Some CPU architectures do it with a single
> special instruction; others loop at most 64 times to discard the
> entries one by one. The TLB is a cache for VA->PA translation,
> and its management scheme always provokes "efficiency vs.
> correctness" arguments. It's a matter of implementation
> tradeoff, I believe.
>
> BTW, how do you approach implementing a remote TLB
> shootdown?
It depends on the reason for the shootdown. Might as well include
the comments from pmap_tlb.c here:
/*
* Manages address spaces in a TLB.
*
* Normally there is a 1:1 mapping between a TLB and a CPU. However, some
* implementations may share a TLB between multiple CPUs (really CPU thread
* contexts). This requires the TLB abstraction to be separated from the
* CPU abstraction. It also requires that the TLB be locked while doing
* TLB activities.
*
* For each TLB, we track the ASIDs in use in a bitmap and a list of pmaps
* that have a valid ASID.
*
* We allocate ASIDs in increasing order until we have exhausted the supply,
* then reinitialize the ASID space, and start allocating again at 1. When
* allocating from the ASID bitmap, we skip any ASID whose bit is already set
* in the bitmap. Eventually this causes the ASID bitmap to fill
* and, when completely filled, a reinitialization of the ASID space.
*
* To reinitialize the ASID space, the ASID bitmap is reset and then the ASIDs
* of non-kernel TLB entries get recorded in the ASID bitmap. If the entries
* in TLB consume more than half of the ASID space, all ASIDs are invalidated,
* the ASID bitmap is cleared again, and the list of pmaps is emptied. Otherwise
* (the normal case), any ASID present in the TLB (even those which are no
* longer used by a pmap) will remain active (allocated) and all other ASIDs
* will be freed. If the size of the TLB is much smaller than the ASID space,
* this algorithm completely avoids TLB invalidation.
*
* For multiprocessors, we also have to deal with TLB invalidation requests
* from other CPUs, some of which are handled by reinitializing the ASID
* space. Whereas above we keep the ASIDs of those pmaps which have active
* TLB entries, this type of reinitialization preserves the ASIDs of any
* "onproc" user pmap and all other ASIDs will be freed. We must do this
* since we can't change the current ASID.
*
* Each pmap has two bitmaps: pm_active and pm_onproc. Each bit in pm_active
* indicates whether that pmap has an allocated ASID for a CPU. Each bit in
* pm_onproc indicates that pmap's ASID is active (equal to the ASID in COP 0
* register EntryHi) on a CPU. The bit number comes from the CPU's cpu_index().
* Even though these bitmaps contain the bits for all CPUs, the bits that
* correspond to the bits belonging to the CPUs sharing a TLB can only be
* manipulated while holding that TLB's lock. Atomic ops must be used to
* update them, since multiple CPUs may be changing different sets of bits at
* the same time, but these sets never overlap.
*
* When a change to the local TLB may require a change in the TLB's of other
* CPUs, we try to avoid sending an IPI if at all possible. For instance, if
* we are updating a PTE and that PTE previously was invalid and therefore
* couldn't support an active mapping, there's no need for an IPI since there
* can be no TLB entry to invalidate. The other case is when we change a PTE to
* be modified; we just update the local TLB. If another TLB has a stale entry,
* a TLB MOD exception will be raised and that will cause the local TLB to be
* updated.
*
* We never need to update a non-local TLB if the pmap doesn't have a valid
* ASID for that TLB. If it does have a valid ASID but isn't currently "onproc",
* we simply reset its ASID for that TLB and at the time it goes "onproc" it
* will allocate a new ASID and any existing TLB entries will be orphaned.
* Only in the case that pmap has an "onproc" ASID do we actually have to send
* an IPI.
*
* Once we have determined that we must send an IPI to shoot down a TLB, we
* need to send it to one of the CPUs that share that TLB. We choose the
* lowest-numbered CPU that has one of the pmap's ASIDs "onproc". In reality,
* any CPU sharing that
* TLB would do, but interrupting an active CPU seems best.
*
* A TLB might have multiple shootdowns active concurrently. The shootdown
* logic compresses these into a few cases:
* 0) nobody needs to have its TLB entries invalidated
* 1) one ASID needs to have its TLB entries invalidated
* 2) more than one ASID needs to have its TLB entries invalidated
* 3) the kernel needs to have its TLB entries invalidated
* 4) the kernel and one or more ASIDs need their TLB entries invalidated.
*
* And for each case we do:
* 0) nothing,
* 1) if that ASID is still "onproc", we invalidate the TLB entries for
* that single ASID. If not, we just reset the pmap's ASID to invalid
* and let it allocate a new one the next time it goes "onproc",
* 2) we reinitialize the ASID space (preserving any "onproc" ASIDs) and
* invalidate all non-wired non-global TLB entries,
* 3) we invalidate all of the non-wired global TLB entries,
* 4) we reinitialize the ASID space (again preserving any "onproc" ASIDs)
* and invalidate all non-wired TLB entries.
*
* As you can see, shootdowns are not concerned with addresses, just address
* spaces. Since the number of TLB entries is usually quite small, tracking
* individual addresses would add a lot of overhead for very little gain.
*/
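To make some of this concrete, here are a few simplified sketches.
None of this is the actual pmap_tlb.c code; every struct layout and
helper name below (tlb_info, asid_alloc, and so on) is made up for
illustration, and the pieces are meant to be read as one continued
example. First, allocating ASIDs in increasing order from a bitmap,
skipping any whose bit is already set, and resetting the space when
it runs out:

#include <stdint.h>
#include <string.h>

#define TLB_ASID_MAX	255			/* e.g. an 8-bit ASID field */
#define ASID_WORDS	((TLB_ASID_MAX + 32) / 32)

struct tlb_info {				/* hypothetical, reduced */
	uint32_t ti_asid_bitmap[ASID_WORDS];	/* 1 bit per ASID */
	uint32_t ti_asid_hint;			/* next ASID to try */
	uint32_t ti_asids_free;			/* how many remain */
};

static inline int
asid_used(const struct tlb_info *ti, uint32_t a)
{
	return (ti->ti_asid_bitmap[a / 32] >> (a % 32)) & 1;
}

static inline void
asid_mark(struct tlb_info *ti, uint32_t a)
{
	ti->ti_asid_bitmap[a / 32] |= 1u << (a % 32);
}

uint32_t
asid_alloc(struct tlb_info *ti)
{
	if (ti->ti_asids_free == 0) {
		/* Exhausted: reset the space and start again at 1.
		 * (The real code also re-records live TLB entries
		 * here; see the next sketch.) */
		memset(ti->ti_asid_bitmap, 0, sizeof(ti->ti_asid_bitmap));
		asid_mark(ti, 0);		/* ASID 0 is reserved */
		ti->ti_asid_hint = 1;
		ti->ti_asids_free = TLB_ASID_MAX;
	}
	/* Skip any ASID whose bit is already set in the bitmap. */
	while (asid_used(ti, ti->ti_asid_hint)) {
		if (++ti->ti_asid_hint > TLB_ASID_MAX)
			ti->ti_asid_hint = 1;
	}
	asid_mark(ti, ti->ti_asid_hint);
	ti->ti_asids_free--;
	return ti->ti_asid_hint;
}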
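The reinitialization step, including the "more than half" escape
hatch, might look like the following; tlb_record_asids() and
tlb_invalidate_all_nonwired() stand in for machine-dependent hooks
(the real interfaces differ). For the multiprocessor flavor
described above, the recording step would instead preserve only the
"onproc" ASIDs, since the ASIDs currently loaded on other CPUs
can't be changed from here.

/* Hypothetical MD hooks: mark the ASID of each non-kernel TLB entry
 * in the bitmap and return how many were marked; flush all non-wired
 * entries. */
extern uint32_t	tlb_record_asids(struct tlb_info *);
extern void	tlb_invalidate_all_nonwired(void);

void
asid_reinitialize(struct tlb_info *ti)
{
	uint32_t live;

	memset(ti->ti_asid_bitmap, 0, sizeof(ti->ti_asid_bitmap));
	asid_mark(ti, 0);
	ti->ti_asid_hint = 1;

	/* Keep any ASID still present in the TLB... */
	live = tlb_record_asids(ti);

	if (live > TLB_ASID_MAX / 2) {
		/* ...unless they consume more than half of the space:
		 * then flush everything and free all ASIDs (the real
		 * code also empties this TLB's list of pmaps). */
		tlb_invalidate_all_nonwired();
		memset(ti->ti_asid_bitmap, 0, sizeof(ti->ti_asid_bitmap));
		asid_mark(ti, 0);
		live = 0;
	}
	ti->ti_asids_free = TLB_ASID_MAX - live;
}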
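The pm_active/pm_onproc bookkeeping reduces to two per-pmap bitmaps
updated with atomic ops. In this sketch, C11 atomics stand in for
the kernel's atomic primitives and a single 32-bit word stands in
for a full CPU set; as in the comment, the bits for CPUs sharing a
TLB would only be manipulated with that TLB's lock held.

#include <stdatomic.h>

struct pmap {				/* reduced to the two bitmaps */
	_Atomic uint32_t pm_active;	/* CPUs holding an ASID for us */
	_Atomic uint32_t pm_onproc;	/* CPUs with that ASID loaded */
};

/* Context switch-in on CPU i: our ASID is now in the hardware. */
static inline void
pmap_switch_in(struct pmap *pm, unsigned int i)
{
	atomic_fetch_or(&pm->pm_active, 1u << i);
	atomic_fetch_or(&pm->pm_onproc, 1u << i);
}

/* Context switch-out on CPU i: the ASID stays allocated (pm_active
 * keeps the bit set) but is no longer live in the hardware. */
static inline void
pmap_switch_out(struct pmap *pm, unsigned int i)
{
	atomic_fetch_and(&pm->pm_onproc, ~(1u << i));
}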
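Whether a shootdown needs an IPI at all then falls out of those two
bitmaps, per the "we never need to update a non-local TLB..."
paragraph. Here tlb_cpus is the cpu_index() mask of the CPUs
sharing the target TLB, and pmap_tlb_asid_release() and
send_tlb_shootdown_ipi() are again invented names:

#include <strings.h>			/* ffs() */

extern void pmap_tlb_asid_release(struct pmap *);
extern void send_tlb_shootdown_ipi(int cpu, struct pmap *);

void
pmap_tlb_shootdown(struct pmap *pm, uint32_t tlb_cpus)
{
	uint32_t active = atomic_load(&pm->pm_active) & tlb_cpus;
	uint32_t onproc = atomic_load(&pm->pm_onproc) & tlb_cpus;

	if (active == 0)
		return;		/* no valid ASID there: nothing stale */

	if (onproc == 0) {
		/* Valid ASID but not running: drop the ASID, orphaning
		 * the old entries; a new ASID is allocated the next
		 * time the pmap goes "onproc". */
		pmap_tlb_asid_release(pm);
		return;
	}

	/* Running somewhere: IPI the lowest-numbered "onproc" CPU. */
	send_tlb_shootdown_ipi(ffs((int)onproc) - 1, pm);
}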
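Finally, the five compressed cases map naturally onto a switch over
a per-TLB "what needs invalidating" summary. The enum names and
invalidation hooks below are invented for the sketch, not the
kernel's:

extern void	tlb_invalidate_asid(uint32_t);	/* one ASID's entries */
extern void	tlb_invalidate_user(void);	/* all non-wired non-global */
extern void	tlb_invalidate_globals(void);	/* all non-wired global */
extern uint32_t	pmap_asid(struct pmap *);

enum shootdown_what {
	SD_NOBODY,	/* 0) nothing needs invalidating */
	SD_ONE_ASID,	/* 1) a single ASID */
	SD_MANY_ASIDS,	/* 2) more than one ASID */
	SD_KERNEL,	/* 3) kernel entries only */
	SD_ALL,		/* 4) kernel plus one or more ASIDs */
};

void
pmap_tlb_shootdown_process(struct tlb_info *ti, struct pmap *pm,
    enum shootdown_what what, unsigned int mycpu)
{
	switch (what) {
	case SD_NOBODY:
		break;
	case SD_ONE_ASID:
		if (atomic_load(&pm->pm_onproc) & (1u << mycpu))
			tlb_invalidate_asid(pmap_asid(pm));
		else
			pmap_tlb_asid_release(pm);
		break;
	case SD_MANY_ASIDS:
		asid_reinitialize(ti);	/* MP flavor keeps "onproc" ASIDs */
		tlb_invalidate_user();
		break;
	case SD_KERNEL:
		tlb_invalidate_globals();
		break;
	case SD_ALL:
		asid_reinitialize(ti);
		tlb_invalidate_all_nonwired();
		break;
	}
}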