tech-kern archive

Re: NULL pointer arithmetic issues



> Date: Mon, 24 Feb 2020 11:42:01 +0100
> From: Kamil Rytarowski <n54%gmx.com@localhost>
> 
> Forbidding NULL pointer arithmetic is not just for C purists trolls. It
> is now in C++ mainstream and already in C2x draft.
> 
> The newer C standard will most likely (already accepted by the
> committee) adopt nullptr on par with nullptr from C++. In C++ we can
> "#define NULL nullptr" and possibly the same will be possible in C.
> 
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2394.pdf
> 
> This will change all arithmetic code operating on NULL into syntax error.

Arithmetic on bare NULL is already a compile-time error under the
options -Wpointer-arith -Werror, which we already use, and arithmetic
on the proposed nullptr will remain one.  This question is not about
that, or about syntax.

The question is whether it is realistic to imagine that a compiler we
will ever use to build the kernel -- particularly with the option
-fno-delete-null-pointer-checks as we already use to build the kernel
with gcc -- will actually meaningfully distinguish the fragments

	char *x = NULL;
	return x;

and

	char *x = NULL;
	return x + 0;

Will two programs that differ only by this fragment actually behave
differently on any serious C implementation we use in NetBSD, ignoring
the pedantry of ubsan?

(The question is the same if you substitute the proposed nullptr for
NULL; it's about the meaning of + on a null pointer, not whether the
program is syntactically written with the letters `NULL' or
`nullptr'.)


The second program technically has undefined behaviour because in,
e.g., C99 6.5.6 `Additive operators', the meaning of + is defined on
pointer/integer operands only when the pointer is to an object in an
array and the sum stays within the array or points one past the end --
in other words, there's nothing in C99 formally defining what x + 0
means when x is a null pointer.
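
For anyone who wants to probe a particular compiler, here is a minimal
sketch that wraps the two fragments in functions and compares the
results.  The function names are hypothetical, made up for illustration.
Note that null_plus_zero() is exactly the formally undefined case from
C99 6.5.6, so the check documents what an implementation happens to do,
not anything the standard promises.

```c
#include <assert.h>
#include <stddef.h>

/* First fragment: return a null pointer unchanged. */
static char *
null_plain(void)
{
	char *x = NULL;
	return x;
}

/* Second fragment: formally undefined under C99 6.5.6. */
static char *
null_plus_zero(void)
{
	char *x = NULL;
	return x + 0;
}

/*
 * Assumption being tested: the implementation does not distinguish
 * the two fragments.  This is not guaranteed by the standard.
 */
static void
check_fragments_agree(void)
{
	assert(null_plain() == null_plus_zero());
	assert(null_plus_zero() == NULL);
}
```

Calling check_fragments_agree() from a test program built with the same
compiler and flags as the code in question is one way to look for a
counterexample.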

Why is the standard written this way?  I surmise that it's because
technically there exist implementations such as Zeta-C where a
`pointer' is not simply a virtual address in a machine register but
actually a pair of a Lisp array and an index into it.

NetBSD does not run on such implementations.  Corners of the standard
that serve _only_ to accommodate such implementations are not relevant
to NetBSD on their own.


The standard is also technically written so that a null pointer is not
necessarily stored as all bits zero in memory, so

	char *x;
	memset(&x, 0, sizeof x);
	return x;

is not guaranteed to return a null pointer.  However, NetBSD only runs
on C implementations where it actually is guaranteed to return a null
pointer, and we rely on this pervasively.  If we make _only_ the
assumptions that the standard formally guarantees, then ubsan would be
right to object that

	char *x;
	memset(&x, 0, sizeof x);
	return x == NULL ? 0 : *(char *)x;

has undefined behaviour.  But in NetBSD this is guaranteed to return 0
and so if ubsan flagged it we would treat that as a useless false
alarm that detracts from the value of ubsan as a tool.
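
The all-bits-zero assumption can be written down as an explicit check.
The following is only a sketch with hypothetical function names; it
tests the assumption in both directions on whatever implementation
compiles it, and proves nothing about implementations it is not run on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Does a pointer whose bytes are all zero compare equal to NULL? */
static int
zeroed_pointer_is_null(void)
{
	char *x;

	memset(&x, 0, sizeof x);
	return x == NULL;
}

/* Is a null pointer stored as all bits zero in memory? */
static int
null_is_all_bits_zero(void)
{
	char *x = NULL;
	unsigned char zero[sizeof x] = { 0 };

	return memcmp(&x, zero, sizeof x) == 0;
}

/* Assumption: both hold on every implementation we care about. */
static void
check_null_representation(void)
{
	assert(zeroed_pointer_is_null());
	assert(null_is_all_bits_zero());
}
```

A port to an implementation where check_null_representation() fails
would have much bigger problems than a sanitizer report.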


If you can present a compelling argument that C implementations which
are _relevant to NetBSD_ -- not merely technically allowed by the
letter of the standard like Zeta-C -- will actually behave differently
from how I described, please present that.  Otherwise please find a
way to suppress the false alarm in the tool so it doesn't waste any
more time.

(And please do the same for memcpy(x,NULL,0)/memcpy(NULL,y,0)!)

