tech-kern archive

Re: Silly question - a further one : TLB flush

On Sun, Sep 07, 2008 at 12:55:14PM +0200, Jean-Yves Migeon wrote:
> Vincent wrote:
> >According to a recent article published on, Unixes' 
> >i386 kernels, although they can address the entirety of the 4 GiB 
> >memory space, both in user and kernel mode, suffer from a TLB cache 
> >penalty each time a system call (trap) or interrupt is entered, 
> >because the OS has to switch the MMU between user and kernel VM 
> >space, and thus flush the TLB before reloading it. This penalty is not 
> >incurred by 64-bit kernels, because the whole VM space can safely be 
> >divided between user and kernel space without any overlap, and thus 
> >the TLB can hold "global" VM page translation info (what 
> >32-bit Windows versions actually do, limiting user memory space to 2 or 
> >3 GiB at most).
> Could you provide me with a link to this article please?
> What follows should be reviewed by gurus; but that's what I understood 
> while browsing through NetBSD's code. It may contain mistakes or 
> misinterpretations.
> For i386, NetBSD uses a 3GB/1GB memory split and a flat address space 
> (the descriptor table is loaded with a segment starting at address 0 and 
> ending at the 4GB boundary). As a consequence, the MMU contains both 
> user and kernel mappings, which do not require a TLB flush when 
> switching contexts.
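
A minimal sketch of the 3GB/1GB split described in the quote, assuming the
kernel occupies the top 1 GiB starting at 0xc0000000. The names KERNEL_BASE,
is_kernel_va and user_va_size are made up for illustration and are not taken
from the NetBSD source. Since both ranges sit in the same page tables,
telling user from kernel addresses is only a comparison, not an
address-space switch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed split point: user space below, kernel space at and above. */
#define KERNEL_BASE 0xc0000000u

/* True if a 32-bit virtual address falls in the kernel's 1 GiB region. */
static bool is_kernel_va(uint32_t va)
{
    return va >= KERNEL_BASE;
}

/* Bytes of virtual address space left for user mappings (3 GiB here). */
static uint32_t user_va_size(void)
{
    return KERNEL_BASE;
}
```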

The kernel will run in the context of the last process to run in user-mode
until a different process returns to user-mode (or maybe until a copyin/out
has to be done - not sure).
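
Roughly, the lazy behaviour described above could be modelled like this. It
is a toy simulation, not kernel code: struct addr_space, struct cpu,
load_root, return_to_user and enter_kernel are hypothetical names, and the
copyin/out case mentioned above is deliberately left out since it is
uncertain.

```c
#include <stddef.h>

/* Hypothetical per-process address space; root stands in for the
 * physical address of the page directory. */
struct addr_space {
    unsigned long root;
};

/* Hypothetical per-CPU state. */
struct cpu {
    const struct addr_space *loaded; /* address space currently in the MMU */
    unsigned tlb_flushes;            /* count of simulated reloads */
};

/* Simulates loading a new page-table root, which is what discards the
 * non-global TLB entries on real hardware. */
static void load_root(struct cpu *c, const struct addr_space *as)
{
    c->loaded = as;
    c->tlb_flushes++;
}

/* On the way back to user mode: reload only if the process changed. */
static void return_to_user(struct cpu *c, const struct addr_space *as)
{
    if (c->loaded != as)
        load_root(c, as);
}

/* Entering the kernel (system call or interrupt) leaves the MMU alone;
 * the kernel keeps running on the last user process's page tables. */
static void enter_kernel(struct cpu *c)
{
    (void)c;
}
```

In this model a process that makes many system calls in a row pays for one
reload at most; the counter only goes up when a different process is about
to run in user mode.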

If you run an i386 program under an amd64 kernel the full 4G is available
for mapping user addresses.

> If you want to see what the memory layout is for a specific port, you 
> should read the comments in pmap. They explain it all. IIRC, 
> Windows uses a 2GB/2GB memory split.
> What you are describing happens for amd64 OSs running under Xen though. 
> x86_64 removed the concept of segmentation, and left two rings, a 
> privileged and unprivileged one, while i386 provides 4 ring levels. 
> Since the hypervisor typically runs in privileged ring (aka ring 0), the 
> guest OS is put in the unprivileged one (ring 3), both user and kernel 
> space. So, you may have to update protections between user and kernel, 
> which requires local TLB flushes.

No sane OS uses more than 2 privilege levels.

> Note that increasing the size and numbers of registers does have 
> penalties, as you have to store them somewhere (stack) when context 
> switching. This penalizes microkernel-based OSes, since you are 
> frequently switching, compared to bigger, monolithic ones.

More have to be saved on interrupt entry, but a context switch or system
call only has to save the callee-saved registers (the rest are already
saved / don't care). This isn't that significant for most CPUs - provided
the hardware allows the registers to be saved and restored [1].
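
To put rough numbers on that for i386, here is a sketch of the two save
areas. It assumes the usual i386 C calling convention (eax, ecx and edx are
scratch; ebx, esi, edi and ebp must be preserved by a called function). The
struct names and layouts are made up for illustration and leave out segment
registers and FPU state:

```c
#include <stdint.h>

/* Hypothetical frame for interrupt entry: the interrupted code did not
 * expect to be called, so every general register has to be saved. */
struct intr_frame {
    uint32_t eax, ecx, edx;      /* scratch registers in the C ABI */
    uint32_t ebx, esi, edi, ebp; /* registers a called function preserves */
    uint32_t eip, cs, eflags;    /* pushed by the CPU itself */
};

/* Hypothetical frame for a switch made through an ordinary C function
 * call: the compiler has already dealt with the scratch registers at
 * the call site, so only the preserved ones need saving. */
struct switch_frame {
    uint32_t ebx, esi, edi, ebp;
    uint32_t eip;                /* return address of the switch call */
};
```

With these layouts the switch frame is 20 bytes against 40 for the interrupt
frame.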


[1] The starcore DSP is a PITA in this regard, it has a lot of registers
that every interrupt (that doesn't just execute a small code stub) has
to save, some status bits are 'sticky' and the only way to restore them
is to conditionally execute an instruction that sets the flag.  This,
together with the mandatory cycle delays after touching the status
register doesn't lead to quick interrupt entry/exit!

David Laight:
