tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Avoid UB in pslist.h (NULL + 0)



> Date: Sun, 22 Mar 2020 03:30:56 +0100
> From: Kamil Rytarowski <n54%gmx.com@localhost>
> 
> On 22.03.2020 01:50, Taylor R Campbell wrote:
> > So far, after several weeks of discussion, nobody has presented a case
> > that there is a credible thread of a compiler actually misbehaving in
> > this scenario.
> 
> There are no public declarations (that was requested on a local ML) but
> according to my IRC talks, there were plans to start optimizing on NULL
> + 0 operations, but is was/is considered as a chicken-egg issue. There
> is a time span now to alert users and

Perhaps it is not a chicken-egg issue, but pointless abuse of leeway
in the standard.  So far the public declarations are that Clang
developers realized they _could_ abuse the leeway this way but they
_chose not to_; that essentially everyone involved sees such
`optimization' as abuse; that there are no public reasons stated for
why the C standard diverges from the C++ standard on this note.

If you have contrary information, please cite it specifically.

> Both languages can be synced here as the incompatibility was revealed
> and discussed. Personally I wouldn't be surprised to see adoption of the
> C behavior as nullptr + 0 is invalid in C++ (and it will be likely
> invalid in future C).

The fragment nullptr + 0 is invalid for an unrelated reason: nullptr
is a pointer to an incomplete type.  We've gone over this already.

It's not helpful to keep bringing it up: at best it's a distraction,
and at worst it gives the impression you might not understand basic
aspects of C and C++ -- an impression which, if true, would make it
all the more important that you _not_ go around tweaking things to
appease sanitizers before discussing and understanding them.

> So far in reply I get 'just noise from the tool' so maybe my messages
> were not clear enough and reaching authorities (from the clang community
> and C committee) not credible enough.

You successfully presented a case that it is technically undefined
behaviour.  This is not in dispute -- a month ago I cited the specific
place where the C standard neglects to define it[*] -- so I don't
understand why you continue to argue that case.  Inline asm is
technically undefined by the standard too but it would be worse than
useless if ubsan warned about every instance of that.

[*] https://mail-index.NetBSD.org/tech-kern/2020/02/25/msg026105.html

> > (b) Change how we invoke ubsan and the compiler by passing
> >     -fno-delete-null-pointer-checks to clang.  joerg objected to this
> >     but I don't recall the details off the top of my head; joerg, can
> >     you expand on your argument against this, and which alternative
> >     you would prefer?
> 
> I tried to enable -fno-delete-null-pointer-check for RUMPKERNEL, but I
> was asked to revert... after first getting acknowledge from you that it
> is a good idea...

This is an inaccurate representation of what happened.  What I said is
(https://mail-index.NetBSD.org/tech-kern/2020/03/08/msg026125.html):

  `Adding -fno-delete-null-pointer-checks for clang too sounds
   sensible to me in general, but please check with joerg first.  It
   remains unclear to me that it's necessary here.'

I asked you to revert because:
(a) the discussion was still ongoing without a conclusion,
(b) it remained unclear whether the change was necessary,
(c) you failed to check with joerg like I asked, and
(d) joerg objected.

I appreciate that sometimes it takes longer than you or I might like
to get a clear explanation out of joerg, but it is also fatiguing to
have to correct misrepresentations of positions in long meandering
threads in order to ensure changes you are itching to make -- or have
already made despite ongoing discussion -- are justified and correct.

> > (c) Change how we invoke ubsan, but just ubsan -- not the compiler.
> 
> UBSan is an integral part of a compiler.

What I meant is:  Change the options we pass for builds with the
sanitizer enabled, but not the options we pass for normal builds.  In
other words, either pass _both_ -fsanitize=undefined and
-fno-delete-null-pointer-checks (if MKSANITIZER=yes), or _neither_ of
them (if MKSANITIZER=no).

> If we introduce code that relies on UB assumptions it is rather opposite
> of comprehensible code. Handling UB imposes handling corner-cases that
> is rather a good style of programming.

The kernel relies on UB assumptions everywhere, because it's a _part_
of the C implementation -- large swaths of the kernel are responsible
for driving a physical machine to implement the C abstract machine.

As the maintainers of the kernel, we have to distinguish the UB that
actually may have bad consequences, not blindly reject all UB because
the letter of the standard doesn't define it in the C abstract
machine.

> > (f) Ditch ubsan.  Does it have enough _true_ alarms, detecting actual
> >     bugs, to make it worth keeping, or are we wasting a lot of time to
> >     appease a tool that isn't actually giving us much value?
> 
> 1. Signed integer overflows (ignoring the signedness bugs; I recall at
> least 3 real bugs and one set of bugs generating garbage results)
> 2. Alignment issues.
> 3. Invalid VLA bound.
> 4. Invalid shifts (base out of valid range)
> 5. Pointer overflows.

Thanks.  Can you point to some of these bugs and fixes so we can
document ubsan's record of utility?

> >     Other issues that I've seen raised by ubsan in practice, from
> >     memory: memcpy(NULL,NULL,0), floating-point division by zero,
> >     alignment issues.  The first two are non-issues in real
> >     implementations; the last suggests that maybe the tests we're
> >     running on, e.g., sparc64 aren't very extensive.
> 
> 1. memcpy(NULL, NULL, 0): GCC is a real implementation.
> 
> http://netbsd.org/~kamil/memcpy-ub.c

We already use -fno-delete-null-pointer-checks with GCC, so this is
not relevant to us.  We've gone over this already.

> 2. We don't use extensively floating-point and definitely not in the
> kernel space. Currently there are no reports for it. I don't know what
> is the expected result for division by zero and why would we call it a
> well defined code.

This came up in another project[1], not in NetBSD.  IEEE 754
floating-point division by zero is well-defined, and the vast majority
of numerical code of the past quarter century relies on it.

Technically it may be undefined behaviour in C, yet IEEE 754
arithmetic semantics is _more reliable_ than signed integer arithmetic
semantics in practice, so ubsan's warnings about it were
counterproductive in the real world; in the end it looks like clang
may have just disabled this misfeature in ubsan because of that[2].

(Yes, here in NetBSD we kinda support VAX floating-point arithmetic,
but I haven't seen any evidence that ubsan has helped us with
maintaining VAX support.)

[1] https://trac.torproject.org/projects/tor/ticket/29527
    https://trac.torproject.org/projects/tor/ticket/29528

[2] https://bugs.llvm.org/show_bug.cgi?id=19535

> >     Maybe someone can present an argument that ubsan is worth the pain
> >     of patching and working around false alarms and merging local
> >     changes into upstream updates.  It's not a priori clear to me --
> >     I'm not proposing that we ditch ubsan; I'm just saying I'm not the
> >     one to argue the case that we should use it as a bug-finding tool
> >     and take on the costs of working around its many false alarms as a
> >     rule.
> 
> Possibly we talk about different things under the same of a 'false alarm'.
> 
> If I deduce your definition correctly as 'usually works', we could stop
> patching e.g. use-after-free our write out of bounds as that code
> usually works.

You did not deduce this correctly.

Nobody seriously disputes that use-after-free or buffer overruns are
bugs in general.  And I'll be the first to say that certain classes of
bugs -- use-after-free and buffer overruns included -- can be treated
as security issues whether or not an exploit is demonstrated.  E.g.:
https://googleprojectzero.blogspot.com/2014/08/the-poisoned-nul-byte-2014-edition.html

But so far I haven't seen any serious reason to treat adding zero to a
null pointer as a bug.


I asked you to present an argument about a set of options (a)-(f)
about how to proceed.  I am not totally sure what your argument was,
but it sounds like you objected to (f) [ditch ubsan] and favoured (b)
[pass -fno-delete-null-pointer-check always] or (d) [patch ubsan]; it
is unclear to me whether you favour, object to, or have no position on
(a) [status quo -- live with the warnings], (c) [pass
-fno-delete-null-pointer-check only when sanitizers are enabled] or
(e) [tweak our code to suppress false alarms].  If that's not
accurate, please clarify.


Home | Main Index | Thread Index | Old Index