Source-Changes-D archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: CVS commit: src/sys



Ryota Ozaki <ozaki-r%netbsd.org@localhost> writes:

> On Sat, May 11, 2019 at 10:49 AM Greg Troxel <gdt%lexort.com@localhost> wrote:
>> >> I'm sorry for delaying this task.  Finally I have benchmarked a revised patch
>> >> (our benchmarking setups have been out of service for a couple of weeks...).
>> >>
>> >> Performance degradation of IP forwarding is 3%.  Is it acceptable as
>> >> DIAGNOSTIC?
>> >
>> > For DIAGNOSTIC should be fine.
>>
>> I think 3% is too much for DIAGNOSTIC.   DEBUG, sure, and a specific
>> option to turn it on seems fine.  DIAGNOSTIC historically has been only
>> for things that check invariants and assert, such that if you don't mind
>> the crashes when detecting things, there is no performance reason to
>> avoid it.
>
> So we have two options:
> 1. Keep the original (enabled on DIAGNOSTIC)
>   - Its performance impact is negligible
> 2. Migrate to use lwp_specificadata and enable the feature on DEBUG
> (and something)

I am not following two things:

1) You said that IP forwarding speed is reduced by 3%, and also that it
is a negligble performance impact.  In my view, 3% is not negligible.
DIAGNOSTIC should be sufficiently small performance impact that
essentially nobody wants to disable it to get better performance.  I
could see 0.3% as negligible.  But everything adds up, so it's not just
this, but the overall system with and without DIAGNOSTIC.

2) Your option 2 seems to involve two things at once:

  - migration to lwp_specificadata
  - using DEBUG instead of DIAGNOSTIC to control the leak check feature

I do not understand why changing the nature of the implementation is
linked to how it is enabled.

Overall, I think checking features (other than simple KASSSERTs) should
have their own ifdef, so people can individually enable them, and then
additionally there would be "#ifdef DEBUG // #define PSREFLEAKDETECT" or
something like that.

> I prefer to 1. because I want the feature to be enabled by many users

Sure, but many users prefer that their systems not be slowed down, too.
If we add 10 more checks that each cause 3% slowdown, then we have a
real problem.  DIAGNOSTIC has always had the sense that it should turn
undefined behavior into a panic, but not slow the system down, so that
the only reason to avoid it is not liking the panics.

> (I assume that users (of -current) tend to enable DIAGNOSTIC and not DEBUG).

On -current, DIAGNOSTIC is on by default.  When we branch 9, I expect
that it will remain on until very close to release.

I agree that most people do not have DEBUG on.

> If the detector is not enabled, a psref leak appears as a form that is
> difficult to know where the leak occurs; a fatal page fault occurs on
> psref_target_destroy that is a completely different context.  If
> enabled, we can at least know a suspect LWP or softint handler and may
> find a cause in luck.

How often do these leaks happen?  If DIAGNOSTIC without the new code
will assert or crash on a leak, but in a non-specific way, we know they
are fairly rare.  People who encounter them can add the define to enable
the more specific detector and continue running to catch the next leak.

It seems that if leaks are common, then it should be easy to find them
with a few people enabling the new detector.  If they are very rare,
then 3% forwarding speed is a large price to pay for finding them more
specifically the first time, on systems where they are not expected to
happen.

Another possibility  (that I don't advocate) is to set this up so that
it is enabled on DIAGNOSTIC with current, but not enabled with
DIAGNOSTIC on release branches.   I am much more concerned with
performance loss on releaes branches.

Another option is to enable it in current, until the bugs are fixed and
we don't see leaks, and then turn it off.

Please don't take this as criticism of your new leak detector.  We are
getting lots of checking code to look for many kinds of problems.  Some
of it is expensive to varying degrees.  Everyone benefits from having
the bugs fixed once found, and it's hard to draw the line about what
should be enabled by default.


Home | Main Index | Thread Index | Old Index