Re: faults on copy operations

To: Maxime Villard <max%m00nbsd.net@localhost>
Subject: Re: faults on copy operations
From: Chuck Silvers <chuq%chuq.com@localhost>
Date: Tue, 28 Apr 2020 19:44:08 -0700

On Sun, Apr 26, 2020 at 09:45:04PM +0200, Maxime Villard wrote:
> Le 26/04/2020 à 21:26, Chuck Silvers a écrit :
> > On Sat, Apr 25, 2020 at 09:50:26AM +0200, Maxime Villard wrote:
> >> Small tactical mitigation I want to (re)introduce.
> >>
> >> The copyin and copyout functions (plus their ufetch and ustore variants)
> >> register a specific handler, which causes page faults in kernel mode to
> >> be dismissed as user faults. This way, when userland passes an unmapped
> >> page to copyin for example, the kernel will page-fault but will consider
> >> it as a non-fatal user fault.
> >>
> >> The problem I see, is that when discarding such faults, the kernel doesn't
> >> check the VA that faulted, and this hides certain faults that should be
> >> considered as fatal: those occuring on kernel pages, betraying a bug in the
> >> caller of the copy functions.
> >>
> >> Suppose there is a crazy syscall that passes a user-controllable length to
> >> copyout (something I've encountered already in the past). An attacker can
> >> do a gigantic copyout and retrieve kernel memory until a page fault occurs
> >> on an unmapped kernel page; the page fault will actually be dismissed, and
> >> the syscall will just error out, meaning that the system remains alive and
> >> the attacker remains alive too. With simple forms of heap re-modelling, and
> >> by repeating this operation, the attacker could manage to dump strictly all
> >> of the kernel memory.
> >>
> >> I want to apply this change [1], that causes the handler to panic if the
> >> fault came from a kernel page. This isn't a very excellent mitigation, but
> >> at least, it will expose bogus memory accesses and make it less easy for
> >> severe buffer overflows to go unpunished.
> >>
> >> The change is based on KASSERTs I initially added in 2016:
> >> 	https://mail-index.netbsd.org/source-changes/2016/09/16/msg077746.html
> >> But which I quickly reverted without really understanding what was wrong
> >> with them. In 2018 I realized that CR2 was getting accessed in non-page
> >> faults, which was a bug:
> >> 	https://mail-index.netbsd.org/source-changes/2018/02/25/msg092488.html
> >> In retrospect I see that my KASSERTs were firing because of this bug. Now
> >> I want to re-introduce these KASSERTs as panics.
> >>
> >> Maxime
> >>
> >> [1] https://m00nbsd.net/garbage/x86/copy-trap.diff
> > 
> > 
> > Improving the detection of buggy copyin/out()s would be great, however
> > any copyin/out where the both the kernel and user side are pageable
> > can legitimately fail on either side.  The examples of this that I can
> > think of right now are ubc_uiomove() and copyinargs().  Please make sure
> > that we recover from such legitimate fault failures than treat a failure
> > in such a situation as a bug.
> 
> I am aware of that.
> 
> My patch enforces the policy when the fault couldn't be resolved, AND for a
> reason different than ENOMEM. The cases you mentioned therefore do not get
> classified as fatal faults.
> 
> Maxime

Kernel faults can legitimately fail for more reasons than just ENOMEM,
for example they can fail with EIO if a page cannot be read back
from a storage device.  Other error codes might also be legitimate,
I haven't gone through all of them.

-Chuck

Follow-Ups:
- Re: faults on copy operations
  - From: Maxime Villard

References:
- faults on copy operations
  - From: Maxime Villard
- Re: faults on copy operations
  - From: Chuck Silvers
- Re: faults on copy operations
  - From: Maxime Villard

Prev by Date: Re: faults on copy operations
Next by Date: 8.0 booter: chain-boot sd0 if present?
Previous by Thread: Re: faults on copy operations
Next by Thread: Re: faults on copy operations
Indexes:

Home | Main Index | Thread Index | Old Index