tech-kern archive


Re: add DIAGNOSTIC back to GENERIC/INSTALL



Manuel Bouyer <bouyer%antioche.eu.org@localhost> wrote:
> for the second time (at last) a kernel issue raised the question of
> adding back 'options DIAGNOSTIC' to GENERIC/INSTALL kernels in HEAD.
> Several people agreed that this would be a good thing.
> 
> So here's the formal question: would someone object if I add back
> 'options DIAGNOSTIC' to i386 and amd64 GENERIC and INSTALL kernels,
> with a comment saying this should be disabled on release branch
> (it would be up to releng to comment it out as part of the release
> process) ?

I have a few concerns:

- If we enable DIAGNOSTIC, then we should also enable DEBUG, as it also
  covers many relevant diagnostic checks.

- Alternatively, it should be clearly defined what goes under DEBUG,
  i.e. what is considered a "heavier check".  I think the code has
  diverged to the point where the difference between DEBUG and
  DIAGNOSTIC is small.

- Since performance is degraded, and -current users concerned about it
  will need to compile their own kernels anyway, I believe LOCKDEBUG
  should be enabled as well.  Perhaps LOCKDEBUG should become a part
  of DEBUG - it is at least clearly a "heavier check". :)

- There MUST be a very clear indication to users - a warning in a visible
  place that the kernel has diagnostic options enabled, and performance
  is significantly degraded.

- Obviously, there must be a defined policy and responsibility for
  disabling these options in release kernels.  In fact, if we go this
  way, then the options should be removed from all MD kernel configs
  and managed in MI src/sys/conf/std.
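
To illustrate the "cheap check" vs. "heavier check" split, here is a
minimal userland sketch.  In the kernel, KASSERT() is compiled in under
DIAGNOSTIC and KDASSERT() under DEBUG; everything named below
(OPT_DIAGNOSTIC, OPT_DEBUG, CHEAP_ASSERT, HEAVY_ASSERT, checks_run and
the list helpers) is hypothetical and only stands in for that
mechanism, it is not kernel code.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-ins for kernel build options.  In a real kernel
 * these come from "options DIAGNOSTIC" / "options DEBUG" in the config
 * file, not from #defines in the source.
 */
#define OPT_DIAGNOSTIC 1
/* #define OPT_DEBUG 1 */

int checks_run;		/* counts assertions that were actually evaluated */

void
check_failed(const char *expr)
{
	fprintf(stderr, "assertion failed: %s\n", expr);
	abort();
}

#define CHECK(e)	(checks_run++, (e) ? (void)0 : check_failed(#e))

#ifdef OPT_DIAGNOSTIC
#define CHEAP_ASSERT(e)	CHECK(e)	/* O(1) sanity checks */
#else
#define CHEAP_ASSERT(e)	((void)0)
#endif

#ifdef OPT_DEBUG
#define HEAVY_ASSERT(e)	CHECK(e)	/* expensive consistency checks */
#else
#define HEAVY_ASSERT(e)	((void)0)
#endif

struct node {
	int value;
	struct node *next;
};

/* O(n) walk: the kind of check one would want behind the heavier option. */
int
list_is_sorted(const struct node *n)
{
	for (; n != NULL && n->next != NULL; n = n->next)
		if (n->value > n->next->value)
			return 0;
	return 1;
}

int
list_head_value(const struct node *head)
{
	CHEAP_ASSERT(head != NULL);		/* cheap: pointer test */
	HEAVY_ASSERT(list_is_sorted(head));	/* heavy: full list walk */
	return head->value;
}
```

With only OPT_DIAGNOSTIC defined, the list walk compiles away entirely
and each call pays for the pointer test alone.  The concern above is
that, in practice, checks have not been sorted between the two options
this cleanly.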

> I know that DIAGNOSTIC was commented out so that someone could install
> a HEAD snapshot and run benchmarks out of the box, but as a side effect
> a lot of bugs are left hidden and only show up when someone tries
> to run a Xen kernel (which still has DIAGNOSTIC). See kern/45051 for
> another one.

Many developers do use these options (e.g. I always enable all options
when developing something), but some bugs just occur rarely.  For example,
at least a few developers were running diagnostic kernels, but did not hit
the assert reported by drochner@ (also, many developers simply do not
upgrade their kernels that often).  PR/45051 is also a rare case - I
added that assert to the pool(9) subsystem a year ago, precisely in order
to get this kind of report.  Surely many people have run diagnostic
kernels in that year, but it might need a specific workload to trigger.


P.S. The problem in PR/45051 is that bus_dma(9) uses pmap_enter(9), and
this can occasionally happen from interrupt context.  It is a kernel-only
mapping, i.e. on pmap_kernel().  Is there any reason (apart from
historical) why pmap_kenter_pa(9) is not used?

-- 
Mindaugas

