Port-amd64 archive


[PAE support] Initial patch review



Dear all,

Here is a patch [1] that "ports" PAE support from Xen to GENERIC.

Currently, the patch triggers a double fault some time after boot with dom0 PAE under Xen. As such, it is not yet ready for commit, but I consider it mature enough to ask for an initial review.

It diverges quite a bit from Jeremy's original patch. The reason is to avoid too many #ifdefs between the Xen and non-Xen pmap (all the delightful details are in the comments inside the patch; in short, Xen tracks reference counts to the L3 kernel page, and we cannot easily use recursive mappings). It consumes 4kB for each CPU attached, instead of for each process created.

Here is a summary of what the patch does; any advice on it will be appreciated. I tested it under QEMU; unfortunately, I can't stress-test the MP code, as I am not confident in what the -smp flag can exercise on a single-core host.

- principle: PTP_LEVELS remains at 2. The L3 page is allocated per CPU (below the 4GB boundary, due to the size limitation of %cr3), and the kernel keeps track of the allocation through 2 additional struct cpu_info members:
  - ci_l3_pdir for the virtual address of the PD,
  - ci_l3_pdirpa for its PA counterpart

- a context switch with PAE is a matter of editing the 4 L3 entries, rather than changing the %cr3 value (which is what happens in the non-PAE case).

- to reflect this, the pm_pdirpa member of struct pmap becomes an array of PDP_SIZE entries. PDP_SIZE is 1 for non-PAE, and 4 for PAE. I did this to avoid too many #ifdefs in pmap; instead, entering/modifying mappings is a for loop over PDP_SIZE entries. I suppose that any decent compiler will unroll it when PDP_SIZE is 1.

- PDPpaddr is the PA of the lwp0/proc0 PD page. For amd64, this is the L4; for non-PAE i386, this is the L2. For PAE, it still represents the L2, but as a PD of 4 contiguous pages. See its comment in the patch.

- some code that used PDPpaddr for %cr3 is replaced with references to pcb_cr3 of lwp0, to make it "compatible" with PAE and non-PAE (kvm86_call and bioscall).

- refactor multiboot code (no need to mix "- KERNBASE" and "RELOC"). bs_addrs of boot_info becomes a (void*), as paddr_t is 64 bits with PAE.
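
To illustrate the per-CPU L3 page and the PDP_SIZE loop described above, here is a minimal user-space sketch in C. It is not taken from the patch: the *_sketch type names, pmap_init_pdirpa(), pmap_load_sketch(), the PG_V value and the fixed PDP_SIZE of 4 are assumptions made for illustration only; ci_l3_pdir, ci_l3_pdirpa and pm_pdirpa merely reuse the names mentioned in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in the patch. */
#define PDP_SIZE   4        /* PAE case: 4 L3 entries (would be 1 without PAE) */
#define PAGE_SIZE  4096u
#define PG_V       0x1ull   /* assumed "valid" bit of an L3 entry */

typedef uint64_t paddr_t;     /* 64 bits with PAE */
typedef uint64_t pd_entry_t;

/* Hypothetical reduced pmap: one physical address per L2 page. */
struct pmap_sketch {
    paddr_t pm_pdirpa[PDP_SIZE];
};

/* Hypothetical reduced cpu_info: the per-CPU L3 page. */
struct cpu_info_sketch {
    pd_entry_t *ci_l3_pdir;   /* virtual address of the L3 page */
    paddr_t     ci_l3_pdirpa; /* its physical address, below 4GB */
};

/* Record the PDP_SIZE contiguous L2 pages that start at 'base'. */
static void
pmap_init_pdirpa(struct pmap_sketch *pm, paddr_t base)
{
    for (int i = 0; i < PDP_SIZE; i++)
        pm->pm_pdirpa[i] = base + (paddr_t)i * PAGE_SIZE;
}

/*
 * "Context switch": rewrite the L3 entries of this CPU so that they
 * point to the L2 pages of the new pmap. The L3 page itself, and thus
 * the value that would sit in %cr3, does not change.
 */
static void
pmap_load_sketch(struct cpu_info_sketch *ci, const struct pmap_sketch *pm)
{
    /* The L3 page must be addressable through %cr3. */
    assert(ci->ci_l3_pdirpa < (1ull << 32));

    for (int i = 0; i < PDP_SIZE; i++)
        ci->ci_l3_pdir[i] = pm->pm_pdirpa[i] | PG_V;

    /* A real kernel would have to flush the TLB at this point. */
}
```

With a 4-entry pd_entry_t array standing in for the L3 page, calling pmap_init_pdirpa() on a pmap and then pmap_load_sketch() fills the four entries with consecutive page addresses, each with the valid bit set. With PDP_SIZE set to 1, both loops collapse to a single iteration.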

XXX: support for ephemeral mappings is missing. I am not sure how to implement it correctly, so pmap_load() uses tlbflush(). The same goes for port-xen. I can revisit this later, after merging this patch and xen-suspend.

Any comments on the modifications there? I would prefer to have all important changes "approved" before starting regression testing.

Cheers,

[1] http://www.NetBSD.org/~jym/pae.diff

--
Jean-Yves Migeon
jeanyves.migeon%free.fr@localhost

