NetBSD-Bugs archive
Re: kern/54994: Critical bug in uarea_poolpage_alloc() for archs with __HAVE_CPU_UAREA_ROUTINES
The following reply was made to PR kern/54994; it has been noted by GNATS.
From: Rin Okuyama <rokuyama.rk%gmail.com@localhost>
To: Jason Thorpe <thorpej%me.com@localhost>
Cc: Nick Hudson <nick.hudson%gmx.co.uk@localhost>, kern-bug-people%netbsd.org@localhost,
gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, gnats-bugs%netbsd.org@localhost
Subject: Re: kern/54994: Critical bug in uarea_poolpage_alloc() for archs with
__HAVE_CPU_UAREA_ROUTINES
Date: Mon, 2 Mar 2020 09:03:39 +0900
On 2020/02/27 7:13, Jason Thorpe wrote:
>> On Feb 26, 2020, at 7:13 AM, Rin Okuyama <rokuyama.rk%gmail.com@localhost> wrote:
>>
>> Certainly. Then, what should we do?
>>
>> Until now, we've learned:
>>
>> (1) uarea_poolpage_alloc() can fall back into uvm_km_alloc():
>>
>> https://nxr.netbsd.org/xref/src/sys/uvm/uvm_glue.c#269
>>
>> This does not work if low-level routines need physically
>> contiguous (i.e., direct-mapped) pages for u-area.
>>
>> (2) However, all ports with __HAVE_CPU_UAREA_ROUTINES actually do
>> *not* need contiguous u-area anymore, as far as we can see.
>
> AFAIK, they *never* did. Certainly, Alpha does not require a physically-contiguous u-area, neither does x86. Heck, neither does MIPS, assuming wired TLB entries are used to keep the kernel stack mapped. A physically contiguous u-area is ONLY required if you are using a direct-mapped segment to provide the address of the u-area to the CPU.
OK
>> (3) Unfortunately, (2) does not mean that fallback of (1) is safe.
>> If some ports, that need direct-mapped u-area, bump USPACE from
>> 1 to 2 (or more), fallback of uvm_km_alloc() results in memory
>> corruption. This is what we observed on powerpc/ibm4xx.
>>
>> So, we have some options to do:
>>
>> (a) Add MD flag to forbid fallback of uvm_km_alloc().
>>
>> Or if this seems too much,
>>
>> (b) Leave some comments in uarea_poolpage_alloc().
>>
>> Thoughts?
>
> We need to understand why the fallback fails on the platforms where it does fail. The following statements should all be true:
>
> 1- If physically-contiguous pages for the u-area can be allocated and mapped with a direct-mapped segment, we should be able to use that.
>
> 2- If physically-contiguous pages for the u-area cannot be allocated, then the system should be able to use a u-area that is virtually mapped but not physically contiguous.
>
> (2) used to be the way the system always worked for UPAGES > 1.
As far as I can see, all archs except for powerpc/ibm4xx satisfy both
(1) and (2). (More precisely, they do not seem to require direct-mapped
memory for the u-area.)
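To make statements (1) and (2) and option (a) concrete, here is a minimal
user-space sketch of the allocation policy under discussion. It is not the
code in uvm_glue.c: every name (toy_uarea_alloc, md_needs_direct_map, the
8-page toy memory) is hypothetical, and the flag only models what an MD
"forbid fallback" switch might do. It assumes upages >= 1.

```c
#include <stdbool.h>

#define TOY_NPAGES 8            /* toy physical memory size, in pages */

enum toy_uarea_kind {
	TOY_UAREA_NONE,         /* allocation failed */
	TOY_UAREA_DIRECT,       /* physically contiguous, direct-mappable */
	TOY_UAREA_VIRTUAL       /* scattered pages, needs a virtual mapping */
};

struct toy_uarea {
	enum toy_uarea_kind kind;
	int first_page;         /* first physical page if DIRECT, else -1 */
};

/* Return the index of the first run of n free pages, or -1 if none. */
static int
toy_find_contig(const bool *in_use, int n)
{
	for (int i = 0; i + n <= TOY_NPAGES; i++) {
		int run = 0;
		while (run < n && !in_use[i + run])
			run++;
		if (run == n)
			return i;
	}
	return -1;
}

/* Count the free pages in the toy memory. */
static int
toy_count_free(const bool *in_use)
{
	int nfree = 0;
	for (int i = 0; i < TOY_NPAGES; i++)
		if (!in_use[i])
			nfree++;
	return nfree;
}

/*
 * Try a contiguous allocation first (statement 1).  If that fails,
 * fall back to scattered pages (statement 2) unless the hypothetical
 * MD flag says the port cannot cope with them (option a).
 */
struct toy_uarea
toy_uarea_alloc(bool *in_use, int upages, bool md_needs_direct_map)
{
	struct toy_uarea ua = { TOY_UAREA_NONE, -1 };
	int first = toy_find_contig(in_use, upages);

	if (first >= 0) {
		for (int i = 0; i < upages; i++)
			in_use[first + i] = true;
		ua.kind = TOY_UAREA_DIRECT;
		ua.first_page = first;
		return ua;
	}
	/* Fail without touching in_use if fallback is forbidden or impossible. */
	if (md_needs_direct_map || toy_count_free(in_use) < upages)
		return ua;
	for (int i = 0, taken = 0; i < TOY_NPAGES && taken < upages; i++) {
		if (!in_use[i]) {
			in_use[i] = true;
			taken++;
		}
	}
	ua.kind = TOY_UAREA_VIRTUAL;
	return ua;
}
```

With only every other page free and upages == 2, the call fails when
md_needs_direct_map is true and returns a TOY_UAREA_VIRTUAL u-area when it
is false, which is the distinction option (a) would introduce.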
For ibm4xx, the external interrupt handler uses the kernel stack before
enabling address translation by the MMU:
https://nxr.netbsd.org/xref/src/sys/arch/powerpc/ibm4xx/trap_subr.S#INTR_SAVE
I managed to enable the MMU before the stack manipulation, but this causes
a kernel panic due to a TLB miss in the interrupt handler (see details below).
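As an illustration of why a non-contiguous u-area is dangerous while
translation is off, here is a toy model in C. It is a sketch under an
assumption, not a description of trap_subr.S: it assumes that code running
untranslated effectively addresses the stack as "physical base + offset".
All names (toy_mapping, toy_translate, toy_real_mode_addr) are hypothetical,
and the model covers a two-page u-area only.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Physical address backing each of the two virtual pages of a u-area. */
struct toy_mapping {
	uint32_t pa[2];
};

/*
 * With translation on, each page is looked up separately, so the
 * pages may live anywhere.  off must be < 2 * TOY_PAGE_SIZE.
 */
uint32_t
toy_translate(const struct toy_mapping *m, uint32_t off)
{
	return m->pa[off / TOY_PAGE_SIZE] + off % TOY_PAGE_SIZE;
}

/*
 * Assumed behaviour with translation off: the address is computed
 * from the first page's physical base only, so it is correct for
 * the second page only if the pages are physically contiguous.
 */
uint32_t
toy_real_mode_addr(const struct toy_mapping *m, uint32_t off)
{
	return m->pa[0] + off;
}
```

For pages at 0x10000 and 0x11000 both functions agree for every offset. For
pages at 0x10000 and 0x5000, an access at offset 0x1800 should land at
0x5800, but the untranslated computation yields 0x11800, which belongs to
some other page. That matches the kind of corruption seen once the u-area
spans more than one page.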
Thanks,
rin
Details:
If the MMU is enabled before the kernel stack is used in the interrupt
handler, the kernel panics when UPAGES == 2 and __HAVE_FAST_SOFTINTS is
defined. This is due to a TLB miss in the second page of the u-area.
However, I do not understand the situation yet: (a) why such a TLB miss
does not result in a kernel panic in the other exception handlers, which
already enable the MMU in a similar manner:
https://nxr.netbsd.org/xref/src/sys/arch/powerpc/ibm4xx/trap_subr.S#FRAME_SETUP
and (b) why the panic does not occur without __HAVE_FAST_SOFTINTS.
It seems to be a problem in how __HAVE_FAST_SOFTINTS interacts with
powerpc/ibm4xx.
This is not a very urgent matter, because there is no problem with
UPAGES == 1 for ibm4xx. But it should get fixed, of course.