tech-kern archive


Re: sysmon_envsys(9)



On Monday 25 August 2008 04:45:10 Paul Goyette wrote:
> On Mon, 25 Aug 2008, Quentin Garnier wrote:
> > On Sun, Aug 24, 2008 at 10:16:08PM +0000, Paul Goyette wrote:
> >> Several people have commented on various aspects of the current
> >> sysmon_envsys(9) implementation and their desire to see it re-done for
> >> NetBSD 5.0.  Many of these comments have been along the lines of "it
> >
> > No.  Definitely not for 5.0.  This is 6.0 stuff.
>
> Yeah - typo.  Definitely 6.0.  :)
>
> >> started out good but just grew to be too unwieldy."
> >>
> >> I've been tasked with getting sysmon_envsys(9) back under control.  I'd
> >> appreciate any specific examples of this "bloat", or suggestions about
> >> what's wrong with, or lacking in, the current implementation. 
> >> Hopefully, what I come up with will be more useable without any of the
> >> excess baggage.
> >
> > While the API is probably better documented than the average kernel API,
> > its inner workings are less documented.  There are situations where it
> > is too clever for its own good:  for instance, it can get in the way of
> > a slow-access ACPI embedded controller.  We might have done better since
> > I last dared try it, but on one of my laptops I would get crashes or other
> > issues when running xbattbar because its queries added to the monitoring
> > thread in the kernel would either overwhelm the EC or more mundanely hit
> > locking bugs within ACPI-CA.  One thing this example shows is that
> > device drivers probably don't have enough control over how they can be
> > accessed through the current API.
>
> Thanks for the discrete example.

The envsys infrastructure also lacks handling for hardware error cases.

Some examples of what I would consider hardware errors that should be handled:

- Temperatures in critical ranges
- Fans showing symptoms or typical values that indicate they are about to fail
- Battery low
- Battery health too low
- Battery too hot

How an error should be handled depends on the case. Some ideas:

- Inform user/sysadmin to replace a certain component
- Automatically use the power management infrastructure/features to lower
(critical) temperatures.
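
To make the idea a bit more concrete, here is a rough sketch of how the
error cases and reactions listed above could be mapped onto each other.
This is plain C for illustration only and is not the sysmon_envsys(9) API:
every name in it (hw_condition, hw_action, sensor_kind, struct reading,
classify, choose_action) is made up, and the single per-sensor "limit" is
an assumption about what a driver could supply.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error conditions, one per case listed above. */
enum hw_condition {
	HW_OK,
	HW_TEMP_CRITICAL,
	HW_FAN_FAILING,
	HW_BATT_LOW,
	HW_BATT_HEALTH_LOW,
	HW_BATT_TOO_HOT
};

/* Hypothetical reactions, one per idea listed above. */
enum hw_action {
	ACT_NONE,
	ACT_NOTIFY_ADMIN,	/* tell user/sysadmin to act or replace a part */
	ACT_THROTTLE		/* use power management to lower temperatures */
};

/* Hypothetical sensor kinds; units of "value" depend on the kind. */
enum sensor_kind {
	SENSOR_TEMP,		/* degrees, error when value >= limit */
	SENSOR_FAN,		/* RPM, error when value <= limit */
	SENSOR_BATT_CHARGE,	/* percent, error when value <= limit */
	SENSOR_BATT_HEALTH,	/* percent, error when value <= limit */
	SENSOR_BATT_TEMP	/* degrees, error when value >= limit */
};

struct reading {
	enum sensor_kind kind;
	int value;
	int limit;		/* assumed to be supplied by the driver */
};

/* Turn one raw reading into an error condition (or HW_OK). */
enum hw_condition
classify(const struct reading *r)
{
	assert(r != NULL);

	switch (r->kind) {
	case SENSOR_TEMP:
		return r->value >= r->limit ? HW_TEMP_CRITICAL : HW_OK;
	case SENSOR_FAN:
		return r->value <= r->limit ? HW_FAN_FAILING : HW_OK;
	case SENSOR_BATT_CHARGE:
		return r->value <= r->limit ? HW_BATT_LOW : HW_OK;
	case SENSOR_BATT_HEALTH:
		return r->value <= r->limit ? HW_BATT_HEALTH_LOW : HW_OK;
	case SENSOR_BATT_TEMP:
		return r->value >= r->limit ? HW_BATT_TOO_HOT : HW_OK;
	}
	return HW_OK;
}

/* Pick a reaction: cool down on heat problems, otherwise notify. */
enum hw_action
choose_action(enum hw_condition c)
{
	switch (c) {
	case HW_TEMP_CRITICAL:
	case HW_BATT_TOO_HOT:
		return ACT_THROTTLE;
	case HW_FAN_FAILING:
	case HW_BATT_LOW:
	case HW_BATT_HEALTH_LOW:
		return ACT_NOTIFY_ADMIN;
	case HW_OK:
		break;
	}
	return ACT_NONE;
}
```

With this, a temperature reading of 95 against a limit of 90 would be
classified as HW_TEMP_CRITICAL and mapped to ACT_THROTTLE, while a fan
below its limit would only lead to ACT_NOTIFY_ADMIN. A real design would
of course need more than one threshold per sensor and a way for drivers
to override the policy.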

Christoph

