Subject: Re: Status report: sysmon_cpufreq(9) + powerctl(8)
To: Juan RP <juan@xtrarom.org>
From: Quentin Garnier <cube@cubidou.net>
List: tech-kern
Date: 09/12/2006 22:42:00

On Tue, Sep 12, 2006 at 09:59:44PM +0200, Juan RP wrote:
>
> Hi!
>
> After two weeks of work, I have the sysmon_cpufreq(9) working
> with a simple driver that will return "hardcoded" values.
>
> I made a LKM to test the API:
>
> http://www.xtrarom.org/~juan/sysmon_cpufreq/cpufreq_lkm/
>
> The kernel part:
>
> http://www.xtrarom.org/~juan/sysmon_cpufreq/cpufreq.diff

The thing that you are missing in that part is that the
"smcpufreq_freqlist" and "smcpufreq_currfreq" callbacks are not needed:
the backend driver will set its own properties and all you have to do in
sysmon_cpufreq is to keep references to those properties;  that way,
the dictionary you send to userland will be built naturally.

"smcpufreq_snewfreq" can be kept just for now;  once it's time to deal
with SMP (it is not time), we'll see how that should be done.

The fact that you are able to directly reference the device's properties
means you can build the sysmon_cpufreq dictionary as the CPU objects
register themselves;  nothing needs to be built after that, and
especially not at ioctl() time.
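
To make the idea concrete, here is a rough userland sketch of "keep a
reference at registration time, build nothing at ioctl() time".  Every
name in it (struct prop, prop_retain, cpufreq_register, cpufreq_current,
MAXCPUS) is made up for the illustration;  in the kernel the properties
would be whatever objects the device already carries, not this toy
structure.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted property object. */
struct prop {
	unsigned int refcnt;
	int value;		/* e.g. the current frequency in MHz */
};

static void
prop_retain(struct prop *p)
{
	p->refcnt++;
}

#define MAXCPUS 4		/* arbitrary limit for the sketch */

/* Hypothetical sysmon_cpufreq "dictionary": one slot per registered CPU. */
struct cpufreq_dict {
	struct prop *currfreq[MAXCPUS];
	size_t ncpus;
};

/* Called once, when a CPU driver registers itself. */
static int
cpufreq_register(struct cpufreq_dict *d, struct prop *currfreq)
{
	if (d->ncpus == MAXCPUS)
		return -1;
	prop_retain(currfreq);	/* keep a reference, do not copy the value */
	d->currfreq[d->ncpus++] = currfreq;
	return 0;
}

/* ioctl()-time path: nothing to build, just read through the reference. */
static int
cpufreq_current(const struct cpufreq_dict *d, size_t cpu)
{
	return d->currfreq[cpu]->value;
}
```

The driver keeps updating its own property, and whoever holds the
reference sees the new value without any "currfreq" callback.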

The last thing to note is that SMCPUFREQ_GDICT is meant to transmit the
whole dictionary, because the userland can deal with the details later.
That, and the fact that you cannot tell sysmon_cpufreq_ioctl which
object you want information on (although I believe that Jason is working
on a solution to that issue).

> And powerctl(8), the userland part:
>
> http://www.xtrarom.org/~juan/sysmon_cpufreq/powerctl/
>
> Right now, powerctl can report frequencies and voltages and set a new
> frequency with (-f value).

I suggested the name "powerctl" as something that would support much
more than that in the future.  I see it as something very much like
wsconsctl(8), where the command-line switches select the type of the
considered object, and it then displays its properties.  powerctl(8)
could be the entry point for CPU frequency control, but also for things
like thermal zone management and sleep state control.  It could also
provide a means to set policies so that powerd(8) would work its magic
to make the machine perform just as the user wishes.  This is long term,
though.

[...]
> [juan@nocturno][~]> ./sysmon_cpufreq/powerctl/powerctl
> cpu0
>         current:        800 MHz [998 mV]
>         frequencies:    800 1200 1400 1600 (in MHz)
>         voltages:       998 1004 1120 1190 (in mV)

I honestly don't see the point of ever exposing voltage to the user.
That's purely an aspect internal to the CPU driver;  some technologies
other than SpeedStep and Cool'n'Quiet might expose different pieces of
information.

[...]
> There are some things that I'd like to fix before importing it into the
> tree:
>
> + Currently sysmon_cpufreq will use the first valid driver,
>    it's desirable to have more drivers working together and not
>    only one.
>
> + SMP case. I use ci = curcpu().... but how that does work
>    in multiprocessor systems?

As far as "ci" goes, it's in memory, i.e. available to all CPUs, and
there is a list for it.

The real challenge is to have the code that changes the frequency run on
the correct processor.  That implies having callback functions run on a
given CPU right before leaving kernel land and switching to a user
process.  sysmon_cpufreq would only signal the CPU driver that an action
is to be taken ASAP.
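
A very rough sketch of that split, in portable C11 rather than kernel
code:  one function only records the request, the other is meant to run
on the target CPU itself (for instance on the way back to userland) and
applies it.  All the names (pending_mhz, current_mhz, cpufreq_request,
cpufreq_run_pending, MAXCPUS) are invented for the example, and the
"hardware write" is just an array store.

```c
#include <stdatomic.h>

#define MAXCPUS 4		/* arbitrary limit for the sketch */

/* Hypothetical per-CPU request slot: 0 means "no request pending". */
static atomic_uint pending_mhz[MAXCPUS];
static unsigned int current_mhz[MAXCPUS];

/* May run on any CPU: only record what the target CPU should do. */
static void
cpufreq_request(unsigned int cpu, unsigned int mhz)
{
	atomic_store(&pending_mhz[cpu], mhz);
}

/* Runs on CPU "cpu" itself, e.g. just before returning to userland. */
static void
cpufreq_run_pending(unsigned int cpu)
{
	/* Fetch and clear the request in one step. */
	unsigned int mhz = atomic_exchange(&pending_mhz[cpu], 0);

	if (mhz != 0)
		current_mhz[cpu] = mhz;	/* stand-in for the hardware write */
}
```

The point is only that the requester never touches the hardware;  how
the hook is actually run on the right CPU is the part that still has to
be designed.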

> + powerctl(8) is just a toy, I want to add the following options:
>
> 	* Daemon mode: (a la estd), change freq when the CPU load is
> 	   high or low.

That will be powerd(8)'s job, ultimately.

> 	* Powersave mode: always lowest freq unless it requires full
> 	   power.

This will be part of a more generic framework to define power management
policies.

[...]
> (Sorry, no manpages yet... I want to finish the code first).

The code and the design.

A quick note:

+	for (smcf = LIST_FIRST(&sysmon_cpufreq_list); smcf != NULL;
+	     smcf = LIST_NEXT(smcf, smcf_list)) {
+		if (strcmp(smcf->smcf_name, name) == 0)
+			break;
+	}

You know, that's exactly what LIST_FOREACH is all about...
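
For reference, a minimal sketch of the same lookup written with
LIST_FOREACH from <sys/queue.h>.  Only smcf_name, smcf_list and
sysmon_cpufreq_list come from the diff;  the trimmed-down structure and
the function name smcf_lookup are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, trimmed-down version of the structure from the diff. */
struct sysmon_cpufreq {
	const char *smcf_name;
	LIST_ENTRY(sysmon_cpufreq) smcf_list;
};

static LIST_HEAD(, sysmon_cpufreq) sysmon_cpufreq_list =
    LIST_HEAD_INITIALIZER(sysmon_cpufreq_list);

/* Return the entry called "name", or NULL if there is none. */
static struct sysmon_cpufreq *
smcf_lookup(const char *name)
{
	struct sysmon_cpufreq *smcf;

	LIST_FOREACH(smcf, &sysmon_cpufreq_list, smcf_list) {
		if (strcmp(smcf->smcf_name, name) == 0)
			break;
	}
	/* smcf is NULL here if the loop ran off the end of the list. */
	return smcf;
}
```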

It's a good start, but there's still a lot to do.

-- 
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"You could have made it, spitting out benchmarks
Owe it to yourself not to fail"
Amplifico, Spitting Out Benchmarks, Hometakes Vol. 2, 2005.
