Subject: Re: power management
To: Jachym Holecek <freza@dspfpga.com>
From: Garrett D'Amore <garrett_damore@tadpole.com>
List: tech-kern
Date: 06/23/2006 15:41:50
Jachym Holecek wrote:
> Hello Garrett,
>
> # Garrett D'Amore 2006-06-22:
>   
>> We also need a way for a device driver that wants to access a device to
>> indicate so to the framework.  Solaris has pm-busy-component and
>> pm-raise-power, etc.  Take a close look at it.
>>     
>
> Can you provide links to documentation/code? I didn't bookmark the
> manpages and can't find them now.
>   

pm_busy_component(9F) in Solaris man pages is a good start.  The SEE
ALSO section will lead you to other man pages as well.

Solaris man pages can be found on docs.sun.com.

>   
>> For example, if I'm about to write data to disk, I mark it
>> pm-busy-component, and then pm-raise-power.  This tells the power
>> management framework in the kernel that the device needs to be powered
>> up, and the framework powers up the device if it is not already powered
>> (and maybe, for example, busses or controllers to which the device is
>> attached!)
>>     
>
> The "mark device as needed/about-to-be-busy" part sounds reasonable,
> but I think "raise power" should be implemented separately. Just
> indicate you want the device busy, queue the request (at least ifnet
> drivers are already separated from the network stack by buffers),
> and if someone decides you should have the device powered up, it will
> happen. It may also be reasonable to deny the request, in which case
> you'd start getting/returning errors as soon as buffers fill up/you
> make calls to powered-down device (you can check for yourself if
> device is operational).
>   

The problem is that not everything is well separated this way.  Adding
another level of asynchronicity is probably not desirable here.

However, yes, these requests can fail, and the driver that needs the
device has to react accordingly (usually by returning an errno to the
system call).
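
To make the flow concrete, here is a rough userland sketch of the
busy / raise-power / idle sequence with a synchronous raise that can
fail.  Everything in it (struct pm_dev, pm_busy, pm_raise, pm_idle,
disk_write, the choice of EIO) is a made-up model for illustration; it
is not the Solaris DDI and not an existing NetBSD interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userland model; not the Solaris DDI or any NetBSD API. */
struct pm_dev {
    struct pm_dev *parent;  /* bus or controller we hang off, or NULL */
    int busy;               /* outstanding busy marks */
    int level;              /* 0 = off, higher = more power */
    int max_level;          /* highest level the framework will grant */
};

/* Mark the device as in use so the framework will not power it down. */
static void pm_busy(struct pm_dev *d)
{
    d->busy++;
}

/* Drop one busy mark; the framework may now consider the device idle. */
static void pm_idle(struct pm_dev *d)
{
    assert(d->busy > 0);
    d->busy--;
}

/*
 * Synchronously raise power to at least `level`, powering up parents
 * first.  Returns 0 on success or an errno value if the request is denied.
 */
static int pm_raise(struct pm_dev *d, int level)
{
    if (d->level >= level)
        return 0;               /* already powered enough */
    if (level > d->max_level)
        return EIO;             /* assumed error code for a denied request */
    if (d->parent != NULL) {
        int error = pm_raise(d->parent, 1);
        if (error != 0)
            return error;       /* parent could not be powered up */
    }
    d->level = level;
    return 0;
}

/* Driver write path: busy, raise, do the I/O, idle, report any error. */
static int disk_write(struct pm_dev *d)
{
    int error;

    pm_busy(d);
    error = pm_raise(d, 1);
    if (error == 0) {
        /* the actual I/O would go here */
    }
    pm_idle(d);
    return error;
}
```

The driver never queues the request behind the power transition: it
either gets a powered device back from pm_raise() or an error it can
hand straight to the caller.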

>   
>> This can result in a recursive callback into a different power
>> management entry in the device driver, btw.
>>     
>
> I'd prefer to avoid recursion, if reasonably possible (see my note
> about kernel thread handling requests that can sleep).
>   

Whether or not it runs on a different thread, I don't think you want
these to be truly asynchronous.  That adds a great deal of complexity
for little gain.

>   
>> Then when the write is done, the driver does pm-idle-component to tell
>> the framework that the device is no longer in use.
>>     
>
> Shouldn't higher layer really do this, instead of device itself?
>   

Yes and no.  The higher layer decides what to do with the idleness, but
the device itself reports the idle state.  The Solaris framework tracks
this in the power daemon, and depending on configuration, will power
down components that are idle.

By the way, this is also used to manage system-wide power
(suspend/resume), because the idleness can determine whether or not the
system has been idle for "n" seconds.
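
A minimal sketch of that split, with the driver reporting busy/idle and
a separate policy pass deciding what to do about it.  The names
(struct idle_comp, comp_busy, comp_idle, pm_scan) and the use of plain
seconds for time are assumptions made for this example; the policy pass
only stands in for whatever the daemon would do and does not reflect
how Solaris actually implements it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one power-managed component. */
struct idle_comp {
    int busy;           /* outstanding busy marks from the driver */
    long idle_since;    /* time (seconds) of the last busy -> idle change */
    bool powered;       /* current power state as seen by the policy */
};

/* Driver side: report that the component is in use. */
static void comp_busy(struct idle_comp *c)
{
    c->busy++;
}

/* Driver side: report that one use has finished at time `now`. */
static void comp_idle(struct idle_comp *c, long now)
{
    if (c->busy > 0 && --c->busy == 0)
        c->idle_since = now;
}

/* True if the component has had no users for at least `secs` seconds. */
static bool comp_idle_for(const struct idle_comp *c, long now, long secs)
{
    return c->busy == 0 && now - c->idle_since >= secs;
}

/*
 * Policy side: power down every component that has been idle for
 * `threshold` seconds.  Returns true if all components qualified, which
 * a caller could treat as "the system has been idle for n seconds".
 */
static bool pm_scan(struct idle_comp *comps, size_t n, long now,
    long threshold)
{
    bool all_idle = true;

    for (size_t i = 0; i < n; i++) {
        if (comp_idle_for(&comps[i], now, threshold))
            comps[i].powered = false;
        else
            all_idle = false;
    }
    return all_idle;
}
```

Powering a component back up when it becomes busy again is left out
here; that is the raise-power path discussed earlier.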

>   
>> As far as "power states", I'd look closely at the PCI and USB power
>> management specs to see what they offer.  It would be nice to have
>> support for fully using the power features supplied by the most common
>> busses.
>>     
>
> True. The PCI power management specification seems to be getting many
> concepts right, hope I can digest it over the weekend. I particularly
> like the idea of having pm-state/request semantics made precise by the
> kind of device (network/disk/...) and having some kind of type-specific
> capabilities.
>
> Anyway, this & the above already cover point (3). For now, it's fine
> with me to keep powerhook request semantics, so that powerhooks can
> go away early.
>   

Okay.

    -- Garrett
> 	-- Jachym
>   


-- 
Garrett D'Amore, Principal Software Engineer
Tadpole Computer / Computing Technologies Division,
General Dynamics C4 Systems
http://www.tadpolecomputer.com/
Phone: 951 325-2134  Fax: 951 325-2191