tech-kern archive


Re: Improvements in amd64



On 15/05/2016 15:42, Joerg Sonnenberger wrote:
On Fri, May 13, 2016 at 05:42:55PM +0200, Maxime Villard wrote:
On 13/05/2016 16:48, Martin Husemann wrote:
On Fri, May 13, 2016 at 12:53:54PM +0200, Maxime Villard wrote:
  - I took rodata out of the text+rodata chunk, and put it in the data+bss+
    PRELOADED_MODULES+BOOTSTRAP_TABLES chunk [3].

Why?

You are probably assuming something obvious to you, but for folks
not too deep into x86 MMU handling (like me), this sounds like a very
strange thing to do.

Martin


What I wanted to achieve, from the beginning to the end, was mapping
text with RX, rodata with R, and data+bss with RW, and optimize them
with large pages.

That still doesn't make sense to me. I see no real advantage in wasting
memory by forcefully splitting text and rodata. I still regularly deploy
virtual machines with 256MB or 512MB of memory, and wasting a decent chunk
of a large page just for the sake of purity doesn't help. What should be
used is:

- text is mapped with RX and large pages
- rodata starts in the text segment and does not require flushing the
final large page of text
- any pages of rodata and other read-only content (like .eh_frame) after
the final large page of text are mapped read-only, using normal pages as
necessary.

It might be useful to mark firmware images and the like to put them into
.rodata-late or something, to make it more likely that they go into the
less TLB friendly small page part, but that's a secondary optimisation.

I'm fine with having .data aligned to a 2MB boundary, but there should
be only one such alignment in the kernel. In short, I find mapping
read-only data as writeable to be much more harmful than mapping
read-only data as executable.

Joerg


It looks like you didn't understand anything of what I've been talking
about so far.

First, only one 2MB alignment is performed, you just have to look at
amd64/conf/kern.ldscript. My intention was indeed to apply three 2MB
alignments, but I was (and still am) stuck in another area because of
another problem. We apply large pages to rodata and data+bss even if they
are not 2MB-aligned; we do so by using large pages only in the contiguous
2MB-long 2MB-aligned areas within each segment, the rest being mapped
with normal pages.
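
The splitting described above can be sketched in C as follows. This is
only an illustration of the idea, not the actual kernel code: the names
split_segment and struct seg_split are hypothetical, and the sketch only
computes the three sub-ranges without touching any page tables.

```c
#include <assert.h>
#include <stdint.h>

#define LARGE_PAGE ((uint64_t)2 * 1024 * 1024)	/* 2MB large page */

/* Hypothetical result type: the three sub-ranges of one segment. */
struct seg_split {
	uint64_t head_start, head_end;	 /* normal pages before the large area */
	uint64_t large_start, large_end; /* 2MB-aligned, multiple of 2MB long */
	uint64_t tail_start, tail_end;	 /* normal pages after the large area */
};

static uint64_t
round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

static uint64_t
round_down(uint64_t x, uint64_t align)
{
	return x / align * align;
}

/* Split [start, end) into head / large-page area / tail. */
struct seg_split
split_segment(uint64_t start, uint64_t end)
{
	struct seg_split s;
	uint64_t lo, hi;

	assert(start <= end);
	lo = round_up(start, LARGE_PAGE);
	hi = round_down(end, LARGE_PAGE);

	if (lo >= hi) {
		/* No full 2MB-aligned area fits: normal pages only. */
		s.head_start = start;
		s.head_end = end;
		s.large_start = s.large_end = end;
		s.tail_start = s.tail_end = end;
		return s;
	}

	s.head_start = start;
	s.head_end = lo;
	s.large_start = lo;
	s.large_end = hi;
	s.tail_start = hi;
	s.tail_end = end;
	return s;
}
```

For a segment spanning 3MB to 9MB, this gives normal pages for 3MB-4MB,
large pages for 4MB-8MB, and normal pages again for 8MB-9MB.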

This "forceful" split already existed before, between text+rodata and
data+bss+etc. We are not wasting "more" memory; in fact, the kernel used
to necessarily lose 1MB because of the previous, bizarre alignment,
whether or not that was needed. Now, with the 2MB alignment, it may
actually be losing less memory.

For example, let's say the text segment is 1.5MB long. Before, rodata
would have started at 1+2.5=3.5MB. Now, rodata starts at 1+roundup(1.5,2)
=3MB. We thus lose less memory.
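
The same arithmetic can be written as a small C sketch. It reflects one
reading of the numbers above and makes two assumptions: that the leading
"1" is a 1MB offset at which text begins, and that the old layout always
added a fixed 1MB after the end of text. The function names are
hypothetical and do not come from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

#define KB ((uint64_t)1024)
#define MB (1024 * KB)

static uint64_t
round_up(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

/* Assumed old layout: a fixed 1MB is always lost after text. */
uint64_t
rodata_start_old(uint64_t base, uint64_t text_size)
{
	return base + text_size + 1 * MB;
}

/* Assumed new layout: the text size is rounded up to a multiple of 2MB. */
uint64_t
rodata_start_new(uint64_t base, uint64_t text_size)
{
	return base + round_up(text_size, 2 * MB);
}
```

With base = 1MB and text_size = 1.5MB, rodata_start_old() gives 3.5MB and
rodata_start_new() gives 3MB, matching the figures in the example.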

It's not for the "sake of purity", it's just common sense. And we
obviously are interested in making sure there's no bug in the kernel
that could make it jump to a rodata area.

Finally, rodata is not mapped with RW, it is mapped with R, so I don't
even see what your final point is.

I'll send you a drawing in private that sums up everything.

