tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Proposal: kmem_valloc [was: Re: raspberry pi panic 7.0_BETA after install fs resize]



Le 09/11/2014 09:06, Maxime Villard a écrit :
Le 08/11/2014 23:28, Jean-Yves Migeon a écrit :
My suggestion was to enable the size setters/getters by default, and
possibly #ifdef part of the code that would be deemed unsuitable for
!diagnostic kernels.

Yes, but still, what's the point of giving a size then? If the kernel
knows the size, we can remove kmem_free's size argument, otherwise it
is inconsistent.

You have to preserve the API. There may be consumers of this outside src tree so you can't break it bluntly.

And then, as I said, we lose the memory optimisation: one more kmem page
is allocated to hold the size.

True.

My point is:
  - KMEM_SIZE is only enabled on DIAGNOSTIC.
  - We need a similar way of holding the size by default - and no size
    argument to _free.
  - Problem: if we do it by default in kmem_alloc - which, if I
    understand correctly, is your suggestion - we lose the memory
    optimisation.
  - Solution: let's use malloc!
  - Problem: malloc is deprecated, and dead.
  - Solutions:
      - implement kmem_valloc/kmem_vfree
    or
      - undeadify malloc

Please don't reinstate malloc.

IMHO if you want to keep riding that route, I would:
- enable by default the part of the code that tracks size allocations;
- rename KMEM_SIZE to something like KMEM_SIZE_DIAGNOSTIC and use it to
#ifdef checks that might have performance penalties (I can't see any but
maybe I overlooked something).

You know, I already struggled to get KMEM_SIZE enabled on DIAGNOSTIC.

Then it is. To vanquish without peril is to triumph without glory :)

Having an empty memory region
serves no purpose in kernel and will put pressure on the "kmem-8" cache
for no benefit.

Yes; but remember that kmem_alloc(0) is rare - well, it should be rare -
so there won't be such a huge "pressure" on kmem-8.
 >
Returning NULL would be even worse from a security standpoint and poses
problems for documentation. What does it mean to kmem_valloc(0,
KM_SLEEP)? A successful allocation but with an invalid pointer? Huh.

A successful allocation with a valid pointer, but a pointer that must
not be written to nor read from.

This asks for real troubles with 0. It is way too close to integer
underflow for unsigned values (nobody expects a negative sized buffer).

I understand what you say; but as your example shows, the allocation is
variable-sized, so the use will be variable-sized too.

Verily I would agree with you if I hadn't seen many bugs like:

     int count = SCARG(uap, misc);
...
     /* Make sure userland cannot exhaust kernel memory */
     if ((size_t)count > (size_t)uvmexp.nswapdev)
         count = uvmexp.nswapdev;
...
     ksep_len = sizeof(*ksep) * count;
     ksep = kmem_alloc(ksep_len, KM_SLEEP);

In this example, count could be zero, and it just crashed. But ksep was
used depending on count, it was not touched when count=0.

kmem_alloc(0) can uselessly panic the system while there's no bug at
all. kmem_valloc(0) wouldn't have crashed the system.

Bound checks should be made on both side of the domain for syscalls. Yes it is a programming mistake.

Honestly I prefer a CVE about a local DoS that crashes the kernel than one where a NULL or a zero-sized buf gets exploited (either by arbitrary memory rewrites or double-free exploits). This can change the CVE from a DoS to a kernel compromise.

Technically the pointer returned by a zero allocation is not really "unusable": you have at least 8 bytes allocated that contains the size, but nothing more would fit in. This calls for a number of off-by-one bugs.

And finally, remember that malloc(0) does not panic precisely because
a header is allocated. malloc(0) returns a valid pointer that must not
be touched. And that's why there's never a zero check before
mallocating. Play a bit with NXR and you'll see what I mean.

If you think kmem_valloc(0) is a bad idea, then malloc(0) is a bad idea
too, and we would have to fix them all.

If we go down this route for 0 size buffers, I would prefer to have a page where the 8-bytes size is found at the end of a read-only page and the next VA just after this page be unmapped from virtual memory; so that callers may still use a somewhat valid pointer to a memory region (which is not NULL and found in kernel VA space), but reading or writing to it would lead to fault.

Then return that specific address for all those that attempt to alloc for 0 and log the offending caller in dmesg.

> Anyway, I'll start the malloc cleanup, unless anyone disagrees.

No one will :)

Cheers,

--
Jean-Yves Migeon


Home | Main Index | Thread Index | Old Index