Re: kern/54994: Critical bug in uarea_poolpage_alloc() for archs with __HAVE_CPU_UAREA_ROUTINES
The following reply was made to PR kern/54994; it has been noted by GNATS.
From: Rin Okuyama <rokuyama.rk%gmail.com@localhost>
To: Jason Thorpe <thorpej%me.com@localhost>
Cc: Nick Hudson <nick.hudson%gmx.co.uk@localhost>, kern-bug-people%netbsd.org@localhost,
gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, gnats-bugs%netbsd.org@localhost
Subject: Re: kern/54994: Critical bug in uarea_poolpage_alloc() for archs with
Date: Mon, 2 Mar 2020 09:03:39 +0900
On 2020/02/27 7:13, Jason Thorpe wrote:
>> On Feb 26, 2020, at 7:13 AM, Rin Okuyama <rokuyama.rk%gmail.com@localhost> wrote:
>> Certainly. Then, what should we do?
>> Until now, we've learned:
>> (1) uarea_poolpage_alloc() can fall back into uvm_km_alloc():
>> This does not work if low-level routines need physically
>> contiguous (i.e., direct-mapped) pages for u-area.
>> (2) However, all ports with __HAVE_CPU_UAREA_ROUTINES actually do
>> *not* need contiguous u-area anymore, as far as we can see.
> AFAIK, they *never* did. Certainly, Alpha does not require a physically-contiguous u-area, neither does x86. Heck, neither does MIPS, assuming wired TLB entries are used to keep the kernel stack mapped. A physically contiguous u-area is ONLY required if you are using a direct-mapped segment to provide the address of the u-area to the CPU.
>> (3) Unfortunately, (2) does not mean that fallback of (1) is safe.
>> If some ports, that need direct-mapped u-area, bump USPACE from
>> 1 to 2 (or more), fallback of uvm_km_alloc() results in memory
>> corruption. This is what we observed on powerpc/ibm4xx.
>> So, we have some options to do:
>> (a) Add MD flag to forbid fallback of uvm_km_alloc().
>> Or if this seems too much,
>> (b) Leave some comments in uarea_poolpage_alloc().
> We need to understand why the fallback fails on the platforms where it does fail. The following statements should all be true:
> 1- If physically-contiguous pages for the u-area can be allocated and mapped with a direct-mapped segment, we should be able to use that.
> 2- If physically-contiguous pages for the u-area cannot be allocated, then the system should be able to use a u-area that is virtually mapped but not physically contiguous.
> (2) used to be the way the system always worked for UPAGES > 1.
As far as I can see, all archs except for powerpc/ibm4xx satisfy both
(1) and (2). (More precisely, they do not seem to require direct-mapped
memory for the u-area.)
For ibm4xx, the external interrupt handler uses kernel stack before
enabling translation by MMU:
I managed to enable the MMU before the stack manipulation, but this
causes a kernel panic due to a TLB miss in the interrupt handler (see
details below).
With the MMU enabled before the interrupt handler uses the kernel stack,
a kernel panic occurs when UPAGES == 2 and __HAVE_FAST_SOFTINTS is
defined. This is due to a TLB miss in the 2nd page of the u-area.
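As a rough illustration of why a u-area that is only virtually
contiguous is unsafe when the stack is touched with translation
disabled, here is a small userland model (not kernel code). All names
(sim_uarea, sim_translated_pa, sim_untranslated_pa) and the 4 KiB page
size are assumptions made for this sketch. It only compares the address
reached through a per-page mapping with the address computed as "base
physical address plus offset":

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PAGE_SIZE 4096u	/* illustrative page size (assumption) */
#define SIM_UPAGES    2u	/* u-area spans two pages, as in the failing case */

/* Hypothetical model: physical frame number backing each u-area page. */
struct sim_uarea {
	uint32_t pfn[SIM_UPAGES];
};

/* Address reached with translation enabled: per-page lookup. */
static uint64_t
sim_translated_pa(const struct sim_uarea *ua, uint32_t off)
{
	return (uint64_t)ua->pfn[off / SIM_PAGE_SIZE] * SIM_PAGE_SIZE +
	    off % SIM_PAGE_SIZE;
}

/* Address computed with translation disabled: base PA plus offset. */
static uint64_t
sim_untranslated_pa(const struct sim_uarea *ua, uint32_t off)
{
	return (uint64_t)ua->pfn[0] * SIM_PAGE_SIZE + off;
}

/*
 * Returns nonzero if both views agree for the whole u-area.
 * Checking the first byte of each page is enough, because offsets
 * within a page are added identically in both views.
 */
static int
sim_uarea_safe_untranslated(const struct sim_uarea *ua)
{
	uint32_t off;

	for (off = 0; off < SIM_UPAGES * SIM_PAGE_SIZE; off += SIM_PAGE_SIZE) {
		if (sim_translated_pa(ua, off) != sim_untranslated_pa(ua, off))
			return 0;
	}
	return 1;
}
```

In this model, frames {5, 6} are safe, while frames {5, 9} are not: an
untranslated access to the 2nd page lands in frame 6, which is not part
of the u-area.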
However, I do not understand the situation yet: (a) why such a TLB miss
does not result in a kernel panic for other exception handlers, which
already enable the MMU in a similar manner:
and (b) why it does not occur without __HAVE_FAST_SOFTINTS.
This seems to be a problem in the interaction between
__HAVE_FAST_SOFTINTS and powerpc/ibm4xx.
This is not a very urgent matter, because there is no problem with
UPAGES == 1 for ibm4xx. But it should get fixed, of course.
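For reference, the decision that option (a) from the quoted list would
add can be modelled as below. This is only a sketch of the policy, not
the actual uarea_poolpage_alloc() code. The enum, the function name, and
the needs_direct_map parameter (standing in for a machine-dependent
flag) are all hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of a u-area allocation attempt. */
enum sim_uarea_kind {
	SIM_UAREA_NONE,		/* allocation refused */
	SIM_UAREA_DIRECT,	/* physically contiguous, direct-mapped */
	SIM_UAREA_VIRTUAL	/* virtually mapped, maybe not contiguous */
};

/*
 * direct_available: whether contiguous, direct-mappable pages were found.
 * needs_direct_map: stand-in for an MD flag that forbids the fallback.
 */
static enum sim_uarea_kind
sim_uarea_alloc(bool direct_available, bool needs_direct_map)
{
	if (direct_available)
		return SIM_UAREA_DIRECT;
	if (needs_direct_map)
		return SIM_UAREA_NONE;	/* fail rather than risk corruption */
	return SIM_UAREA_VIRTUAL;	/* fallback is allowed on this port */
}
```

A port that sets the flag never receives a virtually mapped u-area; the
allocation fails instead, and the caller has to cope with that.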