Subject: Re: Making a common API for cpu frequency drivers
To: Juan RP
From: Quentin Garnier
List: tech-kern
Date: 09/01/2006 12:29:03

On Fri, Sep 01, 2006 at 09:03:56AM +0200, Juan RP wrote:
> Hi,
> Cube suggested that the way to go for drivers changing CPU frequency
> (speedstep, powernow, longrun, etc) is to use a new sysmon object.

Well, I guess I'd rather lay out my points more completely.

We currently have a bunch of drivers that allow manipulation of the
frequency of the CPUs of the system.  However, they have a lousy sysctl-
based interface which is not extensible.  Also, there is no API internal
to the kernel to allow such manipulation from another subsystem.

OTOH, we also have sysmon(4), used by envsys(8), powerd(8) and
wdogctl(8).  I think sysmon(4) is a good entry point for a power
management userland interface.  What comes to mind first is to have a
powerctl(8) tool that would pretty much work like wsconsctl(8) and
other similar tools.  They look like sysctl(8), but they work through a
device node.  (I can already hear David Young screaming about a unified
namespace, but that may come later.)

The sysmon KPI already supports a number of objects:  thermal sensors,
fans, batteries and related objects, watchdogs and power event
generators.  I'd like to add a CPU object to that list.  I'm sure there
is other stuff that we might want to support later.

What can one do with a CPU?  At least 3 things:  change clock frequency,
activate throttling and have the system use certain levels of low power
states when idle.  Those 3 things are covered by the ACPI spec; there
might be equivalents for other power management architectures, and there
also might be more to do with a CPU than just that.

Another reason to have a CPU object managed by sysmon(4) is the notion
of "thermal zone" found in the ACPI spec.  A given ACPI Thermal Zone
object is associated with a number of elements included in that area.
For example, you can have one thermal zone associated with only the
CPU, because it's the CPU temperature sensor.  An ACPI Thermal Zone
object has different properties, one of them being a threshold at
which the OS must perform "passive cooling" on the associated devices.
We currently do nothing when a Thermal Zone makes such a request.  In
the case of a TZ associated with a CPU, however, we should start
manipulating the CPU to cool it down without turning on a fan (that's
mostly the meaning of "passive"), which means turning its clock freq
down and/or enabling throttling.

If sysmon(4) manages both objects, then we can have that link between
the two.
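
To make that link concrete, here is a minimal sketch in C of a thermal
zone driving a CPU object.  Everything in it is hypothetical (struct
cpu_obj, struct thermal_zone, tz_update); none of it is an existing
sysmon(4) interface.  The policy is deliberately naive (one frequency
step per temperature report) and is not the passive cooling formula
from the ACPI spec.  Temperatures are in tenths of Kelvin, the unit
ACPI uses.

```c
#include <stddef.h>

/* Hypothetical CPU object; not an existing sysmon(4) structure. */
struct cpu_obj {
	const unsigned *freqs_mhz;	/* supported frequencies, ascending */
	size_t nfreqs;			/* number of entries, at least 1 */
	size_t cur;			/* index of the current frequency */
};

/* Hypothetical thermal zone associated with a single CPU object. */
struct thermal_zone {
	int passive_dk;			/* passive threshold, 1/10 Kelvin */
	struct cpu_obj *cpu;		/* device to cool passively */
};

/*
 * Handle a new temperature report for the zone.  At or above the
 * passive threshold, step the CPU down one frequency level; below it,
 * step back up one level.  Returns the resulting frequency in MHz.
 */
unsigned
tz_update(struct thermal_zone *tz, int temp_dk)
{
	struct cpu_obj *cpu = tz->cpu;

	if (temp_dk >= tz->passive_dk) {
		if (cpu->cur > 0)
			cpu->cur--;
	} else if (cpu->cur + 1 < cpu->nfreqs) {
		cpu->cur++;
	}
	return cpu->freqs_mhz[cpu->cur];
}
```

The point is only that the zone needs a handle on the CPU object, which
is easy to arrange if one subsystem owns both.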

Now, I think that the sysmon(4) APIs (both kernel and userland) are
not easily extensible.  The ENVSYS ioctls, for example, cannot be
extended to support new objects without tricky defines and such.  So
that's another issue we could tackle while we're at it.

> Before starting the work, I would like to know what do I need exactly.
> I've come up with the following plan:
> We could have more than 1 driver registered (for example speedstep
> and Pentium 4 TCC (feature TM/TM2)).

I'd rather see one processor object that exposes several properties.

> * Each driver will be registered via sysmon_clockfreq_register.
> * Each driver will be unregistered via sysmon_clockfreq_unregister.

I think it's fine to start that way.  Refactoring the sysmon(4) API
can come later.  Although we can start being a bit more generic with
the name, like say "sysmon_cpu_register".
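
As a rough illustration of what such a registration pair could look
like, here is a userland sketch in C.  Only the name
sysmon_cpu_register comes from this mail; struct sysmon_cpu, its
fields, the fixed-size table and the error codes are assumptions made
for the example, and a real kernel version would need locking.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SYSMON_CPU_MAX	8	/* arbitrary limit for this sketch */

/* Hypothetical per-driver descriptor. */
struct sysmon_cpu {
	const char *sc_name;				/* e.g. "est0" */
	int (*sc_getfreq)(void *, unsigned *);		/* read current MHz */
	int (*sc_setfreq)(void *, unsigned);		/* set MHz */
	void *sc_cookie;				/* driver softc */
};

static struct sysmon_cpu *cpu_table[SYSMON_CPU_MAX];

/* Register a CPU object; names must be unique.  Returns 0 or an errno. */
int
sysmon_cpu_register(struct sysmon_cpu *sc)
{
	size_t i, slot = SYSMON_CPU_MAX;

	if (sc == NULL || sc->sc_name == NULL)
		return EINVAL;
	for (i = 0; i < SYSMON_CPU_MAX; i++) {
		if (cpu_table[i] == NULL) {
			if (slot == SYSMON_CPU_MAX)
				slot = i;	/* remember first free slot */
		} else if (strcmp(cpu_table[i]->sc_name, sc->sc_name) == 0)
			return EEXIST;
	}
	if (slot == SYSMON_CPU_MAX)
		return ENOSPC;
	cpu_table[slot] = sc;
	return 0;
}

/* Remove a previously registered CPU object.  Returns 0 or ENOENT. */
int
sysmon_cpu_unregister(struct sysmon_cpu *sc)
{
	size_t i;

	for (i = 0; i < SYSMON_CPU_MAX; i++) {
		if (cpu_table[i] == sc && sc != NULL) {
			cpu_table[i] = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```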

> * Each driver will return to sysmon_clockfreq:
>         - A list of working frequencies via prop_array(3) for drivers
> 	  like powernow, est and longrun. (is it ok to use prop_array?)
>         - If previous list is NULL, a pair name/value will be used
>           instead via prop_dictionary(3) (again I'm not sure what prop
> 	  type is better for this).

At first I thought that using proplib(3) for internal representation of
data wasn't a good idea, but now that I think of it, it would be a way
to make the sysmon(4) API simple and stable while we extend the list of
features it supports.  That way we can start with only the CPU
frequency, not caring about C-states and throttling.  Later we can add
support simply by adding new keys to the dictionary.
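
The extensibility argument can be sketched without proplib(3) at all.
The C below is a portable stand-in for a dictionary, not proplib code:
struct prop, struct propset, propset_get and the key names
"frequency" and "throttling" are all made up for the example.  What it
shows is that a consumer asking for a key keeps working unchanged when
a driver later starts exporting more keys.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for one dictionary entry (hypothetical, not proplib). */
struct prop {
	const char *key;
	long value;
};

/* Stand-in for the dictionary a driver would hand to sysmon. */
struct propset {
	const struct prop *props;
	size_t nprops;
};

/*
 * Look up a key.  Returns true and stores the value in *out if the
 * key exists; returns false and leaves *out untouched otherwise, so
 * callers can treat unknown keys as "feature not supported".
 */
bool
propset_get(const struct propset *ps, const char *key, long *out)
{
	size_t i;

	for (i = 0; i < ps->nprops; i++) {
		if (strcmp(ps->props[i].key, key) == 0) {
			*out = ps->props[i].value;
			return true;
		}
	}
	return false;
}
```

A driver that only knows about frequency exports one entry; a later
one adds a "throttling" entry, and neither the lookup function nor the
structure layout has to change.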

> * Each driver will get its current state frequency via CLKFREQ_GIOMODE
>   ioctl and proplib will be used to pass value between kernel/userland.
> * Each driver will change its current state frequency via CLKFREQ_SIOMODE
>   ioctl and proplib will be used to pass value between kernel/userland.

Having two ioctls is not even necessary with proplib but IMO it makes
things easier to have one for queries and one for control.  Again, I
think we can start right now with a generic name, like maybe

> I think it's enough for now, do you think am I missing something?
> What type of proplib might be used for all these fields?

The question can be asked differently:  is a list of defined frequencies
the best representation for the user?  Wouldn't it be better for the driver
to expose a range, and let it deal with the details?  There might be
some CPUs that allow just about any frequency.  What do other OSes do in
their API?

In any case, a prop_array of prop_number seems fitting to me.
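
For the "expose a range and let the driver deal with the details"
option, a driver with discrete states could map any requested value
onto its table.  A minimal sketch in C, where cpu_pick_freq is a
hypothetical helper and rounding down is just one possible policy:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick the highest supported frequency that does not exceed `want`.
 * If `want` is below the whole table, fall back to the lowest entry.
 * `freqs` must be sorted in ascending order and contain n >= 1 entries.
 */
unsigned
cpu_pick_freq(const unsigned *freqs, size_t n, unsigned want)
{
	size_t i, best = 0;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		if (freqs[i] <= want)
			best = i;
	}
	return freqs[best];
}
```

With a table of 600, 800, 1000 and 1600 MHz, a request for 1200 would
yield 1000, and a request above the table would yield 1600.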

Quentin Garnier
"When I find the controls, I'll go where I like, I'll know where I want
to be, but maybe for now I'll stay right here on a silent sea."
KT Tunstall, Silent Sea, Eye to the Telescope, 2004.
