Current-Users archive


Re: [PAE] minor virtual memory i386 code rework

On Tue, 7 Sep 2010 18:43:46 +0100, David Laight wrote:
> On Mon, Sep 06, 2010 at 10:24:45PM +0200, Jean-Yves Migeon wrote:
>> Hi list,
>> I am currently in the process of adding PAE support within kvm(3). As
>> such, I am proposing a patch (see attached) to the "review before
>> commit" process:
>> - it makes paddr_t a 64-bit entity for i386 userland, whether PAE
>> support was compiled in or not (only affects kvm_i386).
> I thought you needed to solve the problem of loadable kernel modules?

That will get fixed when the kernel moves to a 64-bit paddr_t/bus_addr_t.
Either that, or we will have to provide a separate set of modules for PAE.
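
To illustrate why modules are affected, here is a minimal sketch. The two
structs are hypothetical (they are not real NetBSD structures); they only
show that the offset of the field following the address changes with the
width of the address type, so a module built against one layout cannot
safely read the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor as a non-PAE build would see it
 * (32-bit physical address). Illustration only. */
struct ex_seg_nonpae {
	uint32_t addr;	/* physical address, 32 bits */
	uint32_t len;	/* segment length */
};

/* The same descriptor as a PAE build would see it
 * (64-bit physical address). Illustration only. */
struct ex_seg_pae {
	uint64_t addr;	/* physical address, 64 bits */
	uint32_t len;	/* segment length */
};

/* The field after the address moves, which is the ABI mismatch. */
static inline void
ex_check_layout(void)
{
	assert(offsetof(struct ex_seg_nonpae, len) !=
	    offsetof(struct ex_seg_pae, len));
}
```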

> I think that means that any kernel structures that might be referenced
> by a loadable module must contain space for a 64bit paddr_t.
> This would include kernels that are built without PAE support.

Currently, I am not too keen on submitting patches that bluntly set paddr_t
to 64 bits for non-PAE i386. This may introduce performance regressions I
cannot afford to trace presently, and it may cripple entire systems if not
done cautiously (pool_cache(9), which is used pretty much everywhere now,
depends on paddr_t).

> Internally a non-PAE kernel will probably want to use 32bit arithmetic
> for any calculations.
> This might mean using a structure for paddr_t !

paddr_t is used by "public" interfaces (bus_dma/space, pool_cache, ...).
I think it is a bad idea to use a struct for paddr_t.

While the type becomes opaque through the typedef, the 32-bit arithmetic
optimization should remain hidden inside the kernel, and more precisely
only in sensitive areas like the pmap(9) and bus_dma(9)/bus_space(9) code.
From the "outside" (e.g. for modules, or userland), bus_addr_t and paddr_t
should stay 64-bit "unsigned long long". IMHO, this is less error-prone,
and modules will be able to manipulate addresses without having to rely on
dirty macro quirks just because "oh, I just tried PAGE_SHIFTing paddr_t,
but it does not seem to work as expected because it's a struct".
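
A rough sketch of the difference, assuming 4 KiB pages. All names here
(the ex_ prefix, EX_PAGE_SHIFT, both typedefs) are made up for the example
and are not kernel definitions. With an integer typedef the shift is plain
C; with a struct the same operation needs an accessor, since
"pa >> EX_PAGE_SHIFT" would not compile.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT	12	/* assumed: 4 KiB pages, as on i386 */

/* Option A: physical address as a plain 64-bit integer. */
typedef uint64_t ex_paddr_int_t;

/* Option B: physical address wrapped in a struct (two 32-bit halves). */
typedef struct {
	uint32_t lo;
	uint32_t hi;
} ex_paddr_struct_t;

/* Option A: the shift works directly on the value. */
static inline uint64_t
ex_pfn_int(ex_paddr_int_t pa)
{
	return pa >> EX_PAGE_SHIFT;
}

/* Option B: every caller has to go through a helper like this one. */
static inline uint64_t
ex_pfn_struct(ex_paddr_struct_t pa)
{
	uint64_t v = ((uint64_t)pa.hi << 32) | pa.lo;

	return v >> EX_PAGE_SHIFT;
}

/* Both representations must yield the same page frame number. */
static inline void
ex_check_pfn(void)
{
	ex_paddr_struct_t s = { 0x23456000u, 0x1u };

	assert(ex_pfn_int(0x123456000ULL) == ex_pfn_struct(s));
}
```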

The struct is one possibility; however, I'd like to try another solution
first, similar to the one used for ELF32/64, where the same code is
compiled twice with different options. I suppose it is easier for the
compiler to figure out optimizations that way, and it may even allow
unloading the parts of pmap/bus_dma that are not used by the current
mode (pie in the sky, this will need MD work...).
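
To give an idea of what I mean, here is a small sketch. It is not the
actual ELF32/64 mechanism (which builds the same source twice with a
different size setting); to keep the example in one file, a macro stamps
out the same function bodies once per address width. The names
(EX_DEFINE_ADDR_OPS, ex_trunc_page_*, ex_atop_*, EX_PAGE_SHIFT) are
invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT	12	/* assumed: 4 KiB pages */

/*
 * One body, instantiated per address width. The 32-bit variant only
 * ever performs 32-bit arithmetic; the 64-bit one is what a PAE
 * kernel would use.
 */
#define EX_DEFINE_ADDR_OPS(suffix, addr_t)				\
	static inline addr_t						\
	ex_trunc_page_##suffix(addr_t pa)				\
	{								\
		return pa & ~(((addr_t)1 << EX_PAGE_SHIFT) - 1);	\
	}								\
	static inline addr_t						\
	ex_atop_##suffix(addr_t pa)					\
	{								\
		return pa >> EX_PAGE_SHIFT;				\
	}

EX_DEFINE_ADDR_OPS(32, uint32_t)	/* non-PAE flavour */
EX_DEFINE_ADDR_OPS(64, uint64_t)	/* PAE flavour */

/* Both flavours agree on addresses that fit in 32 bits. */
static inline void
ex_check_flavours(void)
{
	assert(ex_trunc_page_32(0x12345678u) ==
	    ex_trunc_page_64(0x12345678ULL));
	assert(ex_atop_32(0x12345000u) == ex_atop_64(0x12345000ULL));
}
```

Only the flavour matching the running mode would be needed at run time,
which is where the "unload the unused part" idea comes from.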

Jean-Yves Migeon
