Port-i386 archive


[PAE support] patch review

Hi lists,

You will find attached the patch for PAE support [1]; the diff is against
the latest -current. This one is for a last review before commit.

The inspiration and the initial PAE support patch come from Jeremy Morse,
whom I thank for his work. I took responsibility for updating it to
-current, merged it with some code I had, made some improvements, and
brought it closer to port-xen PAE (they now share most of the code, except
the kernel shadow PD, due to Xen's limitation).

FWIW, the benchmarks [2] I ran earlier were all done with this patch; so
far, I have never experienced a crash or panic, but, as always, YMMV.
There are some limitations regarding PAE though, see below.

===== Limitations =====

- I could not test the patch on a system with more than 4GiB of RAM.

- I cannot promise anything with modules. The biggest concern is the
paddr_t size change: mixing !PAE modules with a PAE kernel may lead to
unpredictable results. So far I have been lucky, but I would recommend
building a MONOLITHIC kernel, or rebuilding modules with PAE enabled.
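
To illustrate why the paddr_t size change matters for modules and for
format strings, here is a minimal sketch. It is not taken from the patch:
the typedefs, struct names and helper functions below are hypothetical
stand-ins, assuming a 32-bit physical address without PAE and a 64-bit one
with PAE.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins, NOT the real kernel typedefs. */
typedef uint32_t sk_paddr_nopae_t;	/* physical address, !PAE build */
typedef uint64_t sk_paddr_pae_t;	/* physical address, PAE build */

/* The same source-level struct, as seen by a !PAE and a PAE compile. */
struct sk_desc_nopae { sk_paddr_nopae_t pa; uint32_t len; };
struct sk_desc_pae   { sk_paddr_pae_t   pa; uint32_t len; };

/* Offset of "len" as a !PAE module would compute it. */
size_t
sk_len_offset_nopae(void)
{
	return offsetof(struct sk_desc_nopae, len);
}

/* Offset of "len" as a PAE kernel would compute it. */
size_t
sk_len_offset_pae(void)
{
	return offsetof(struct sk_desc_pae, len);
}

/*
 * Width-independent formatting: widen to 64 bits and use PRIx64 rather
 * than "%lx", which is only correct while the address fits a long.
 */
int
sk_format_paddr(char *buf, size_t n, uint64_t pa)
{
	assert(buf != NULL);
	return snprintf(buf, n, "0x%" PRIx64, pa);
}
```

The two offsets differ, so a module and a kernel built with different
settings would disagree on where "len" lives in any shared structure.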

===== Comments regarding the patch =====

- reworked locore.S.

- introduced i386_cpu_switch_pmap(), used to switch pmap for the curcpu.
Due to the different handling of pmap mappings with PAE vs !PAE, Xen vs
native, I hid the details within this function. This makes it easier to
call from assembly, as some features, like BIOS calls, switch to
pmap_kernel before mapping trampoline code in low memory.

WARNING: this function currently *breaks* the amd64 build (obviously,
there is no i386_cpu_switch_pmap() for amd64).

==> this part will be re-written with a cleaner abstraction, in sync
with rmind and his uvmplock branch. But I would like you to review the
rest of the patch.

- some changes in bioscall and kvm86_call, to reflect the above.

- bootinfo_source struct is modified to cope with the paddr_t size change
with PAE (it is not correct to assume that bs_addr is a paddr_t when
compiled with PAE). bs_addrs is now a void * array (in the bootloader's
code under i386/stand/, bs_addrs is a physaddr_t, which is an unsigned
long).

- the L3 is "pinned" per-CPU, and is only manipulated by a reduced set of
functions within pmap. To track the L3, I added two elements to struct
cpu_info, namely ci_l3_pdirpa (PA of the L3) and ci_l3_pdir (VA of the
L3). The rest of the code assumes that it runs "just like" a normal i386,
except that the L2 is 4 pages long (PTP_LEVELS is still 2).

- fixes in multiboot code (same reason as bootinfo): paddr_t size
change. I used Elf32_* types, used RELOC() where necessary, and moved the
memcpy() calls out of the if/else if (I don't think that sym and str
tables overlap with ELF?).

- 64-bit atomic functions for pmap.

- all pmap_pdirpa accesses are now done through the pmap_pdirpa macro. It
hides the L3/L2 stuff from PAE, as well as the pm_pdirpa change in
struct pmap (it now becomes a PDP_SIZE array, with or without PAE).

- manipulation of recursive mappings ( PDIR_SLOT_{,A}PTEs ) is done via
loops on PDP_SIZE.
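
For reviewers not familiar with the "L2 is 4 pages long" view, here is a
rough sketch of the idea. It is not code from the patch: the SK_* constants
and sk_* functions are hypothetical, and assume the usual PAE layout (4 L3
entries, 512 8-byte entries per 4KiB page, 2MiB covered by each L2 entry).
The four L2 pages are treated as one flat array, and the recursive mapping
takes one slot per L2 page, hence the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for this sketch, not the kernel's definitions. */
#define SK_PDP_SIZE	4u	/* L2 pages with PAE (would be 1 without) */
#define SK_NPDPG	512u	/* 8-byte entries per 4KiB page */
#define SK_L2_SHIFT	21u	/* each L2 entry covers 2MiB */
#define SK_L3_SHIFT	30u	/* each L3 entry covers 1GiB */

/* Hardware view: which of the 4 L3 entries a VA goes through. */
static inline unsigned
sk_l3_index(uint32_t va)
{
	return va >> SK_L3_SHIFT;
}

/* Hardware view: index within the L2 page selected by the L3 entry. */
static inline unsigned
sk_l2_index_in_page(uint32_t va)
{
	return (va >> SK_L2_SHIFT) & (SK_NPDPG - 1);
}

/* "Normal i386" view: index into one flat, 4-page long L2 (0..2047). */
static inline unsigned
sk_l2_index(uint32_t va)
{
	return va >> SK_L2_SHIFT;
}

/*
 * Install a recursive mapping: one consecutive slot per L2 page, each
 * pointing at the physical address of the corresponding L2 page.
 */
void
sk_set_recursive(uint64_t *l2, unsigned slot,
    const uint64_t pdirpa[SK_PDP_SIZE], uint64_t flags)
{
	assert(slot + SK_PDP_SIZE <= SK_PDP_SIZE * SK_NPDPG);
	for (unsigned i = 0; i < SK_PDP_SIZE; i++)
		l2[slot + i] = pdirpa[i] | flags;
}
```

The flat index is simply the L3 index times 512 plus the in-page index,
which is why most of the code can ignore the L3 altogether.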


Two questions:
- is it ok for everyone to add PAE to the ALL kernel? This can raise
paddr_t issues within drivers, as well as some format string errors.

- should I add a "#options PAE" within GENERIC config file, or provide


[1] http://www.netbsd.org/~jym/pae.diff

[2] http://wiki.netbsd.org/users/jym/benchmarks/

Jean-Yves Migeon
