tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Proposal: kmem_valloc [was: Re: raspberry pi panic 7.0_BETA after install fs resize]



Hi,
there's something terribly sad with the kmem API: kmem_free takes
a size argument. It has I think two major drawbacks:

 - the fact that the caller needs to know the allocated size often
   leads to strange hacks in structures, where fields are added
   just to hold allocated sizes. Accumulating hacks sometimes leads
   to great bugs - see NetBSD-SA2014-014 - and omissions - like in
   ffs_unmount(), currently.
 - the fact that the allocator does not "know" the allocated size
   is efficient from a memory pov - it doesn't need more memory
   space to hold the size -, but makes kmem_alloc(0) illegal. That
   is something that could be more or less fixed by re-indexing the
   kmem cache array, but then one more 8-byte cache would be
   necessary, and we probably end up losing the memory optimisation.

We currently have malloc()/free(), which at a first glance would
be the solution when too many hacks take place. Moreover malloc(0)
is not illegal: in all cases a header is allocated. But malloc is
deprecated. Let's give a glance at the man page and the code:

 - malloc_roundup() and malloc_type_setlimit() are documented in
   man while they don't even exist.
 - malloc_type_attach(), malloc_type_detach(), MALLOC_DEFINE_LIMIT(),
   MALLOC_JUSTDEFINE_LIMIT(), MALLOC_DEFINE(), MALLOC_JUSTDEFINE()
   and MALLOC_DECLARE() are dead ops.
 - <sys/mallocvar.h> and <sys/malloc.h> are still included in many
   places where no call is made to either malloc() or free().
 - kern_malloc() allocates memory through kmem_alloc, therefore the
   M_CANFAIL flag is lost, which means that malloc() may behave
   unexpectedly.

You can understand that the malloc API is dead, and that no attempt
has been made to switch correctly to kmem_alloc, or to re-implement
something clean.

That's what motivates my proposal: creating kmem_valloc and kmem_vfree,
to definitely get rid of malloc, and have another set of functions
that can perform allocations/deallocations without needing a size
argument. The 'v' stands for "variable".

The implementation is simple:

 - allocate 8 more bytes to hold this structure:
	struct kmem_header {
		size_t		size;
	} __aligned(KMEM_ALIGN);
   and put the allocated size in the field. It is KMEM_ALIGN-aligned.
   Then return (header + 1).
 - when freeing: do (given_pointer - 1), and call kmem_free() with
   the size in the field.

That's mostly what kern_malloc() does, but it is consistent and
sometimes consumes less memory - kern_malloc may allocate one more
page (PAGE_SIZE).

As some of you may have noticed, some recent Security Advisories
were related to kmem. And now there's this issue in ffs_unmount();
and more bugs will come.

Here is a patch which implements kmem_valloc.

Comments?

Index: kern/subr_kmem.c
===================================================================
RCS file: /cvsroot/src/sys/kern/subr_kmem.c,v
retrieving revision 1.60
diff -u -r1.60 subr_kmem.c
--- kern/subr_kmem.c	22 Jul 2014 07:38:41 -0000	1.60
+++ kern/subr_kmem.c	8 Nov 2014 06:12:44 -0000
@@ -173,6 +173,10 @@
 #define	KMEM_BIG_MAXSIZE	16384
 #define	KMEM_CACHE_BIG_COUNT	(KMEM_BIG_MAXSIZE >> KMEM_BIG_SHIFT)
 
+struct kmem_header {
+	size_t		size;
+} __aligned(KMEM_ALIGN);
+
 static pool_cache_t kmem_cache_big[KMEM_CACHE_BIG_COUNT] __cacheline_aligned;
 static size_t kmem_cache_big_maxidx __read_mostly;
 
@@ -206,9 +210,6 @@
 #endif /* defined(KMEM_REDZONE) */
 
 #if defined(KMEM_SIZE)
-struct kmem_header {
-	size_t		size;
-} __aligned(KMEM_ALIGN);
 #define	SIZE_SIZE	sizeof(struct kmem_header)
 static void kmem_size_set(void *, size_t);
 static void kmem_size_check(void *, size_t);
@@ -406,6 +407,48 @@
 	kmem_intr_free(p, size);
 }
 
+/*
+ * kmem_valloc: allocate wired memory.
+ * => must not be called from interrupt context.
+ */
+
+void *
+kmem_valloc(size_t size, km_flag_t kmflags)
+{
+	struct kmem_header *hd;
+	size_t allocsize;
+	void *p;
+
+	KASSERTMSG((!cpu_intr_p() && !cpu_softintr_p()),
+	    "kmem(9) should not be used from the interrupt context");
+
+	allocsize = size + sizeof(struct kmem_header);
+	if ((p = kmem_intr_alloc(allocsize, kmflags)) == NULL)
+		return NULL;
+	hd = (struct kmem_header *)p;
+	hd->size = allocsize;
+
+	return hd + 1;
+}
+
+/*
+ * kmem_vfree: free wired memory allocated by kmem_valloc.
+ * => must not be called from interrupt context.
+ */
+
+void
+kmem_vfree(void *p)
+{
+	struct kmem_header *hd;
+
+	KASSERT(!cpu_intr_p());
+	KASSERT(!cpu_softintr_p());
+
+	hd = (struct kmem_header *)p;
+	hd -= 1;
+	kmem_intr_free((uint8_t *)hd, hd->size);
+}
+
 static size_t
 kmem_create_caches(const struct kmem_cache_info *array,
     pool_cache_t alloc_table[], size_t maxsize, int shift, int ipl)
Index: sys/kmem.h
===================================================================
RCS file: /cvsroot/src/sys/sys/kmem.h,v
retrieving revision 1.9
diff -u -r1.9 kmem.h
--- sys/kmem.h	5 Feb 2012 03:40:08 -0000	1.9
+++ sys/kmem.h	8 Nov 2014 06:12:44 -0000
@@ -40,6 +40,9 @@
 void *	kmem_zalloc(size_t, km_flag_t);
 void	kmem_free(void *, size_t);
 
+void *	kmem_valloc(size_t, km_flag_t);
+void	kmem_vfree(void *);
+
 void *	kmem_intr_alloc(size_t, km_flag_t);
 void *	kmem_intr_zalloc(size_t, km_flag_t);
 void	kmem_intr_free(void *, size_t);


Home | Main Index | Thread Index | Old Index