tech-userlevel archive


Re: dbcool, envsys, powerd shutting down my machine



Hi,

> Depends on whether you're on AMD's virtual degC scale they use for their CPU 
> temps or it's real degC's.

I didn't realise that some CPUs don't report real values.  So, it could be
the CPU temperature after all.

> Also, if we assume that the dbcool chip controls the CPU fan(s), it makes more 
> sense for it to measure the CPU temp than the VRM temp, right?

Looking at one of the datasheets at random (ADT7476), the fan speeds can be
controlled either by a single sensor, or a mix of all the sensors, so one
would have to read more chip registers to find out exactly how it is set up.

> sysctl hw.dbcool0 gives
> hw.dbcool0.fan_ctl_0.behavior = manual
> hw.dbcool0.fan_ctl_0.min_duty = 27
> hw.dbcool0.fan_ctl_0.cur_duty = 100

> but that's with BIOS CPU fan regulation disabled.

Looking at the driver, I don't think that we alter the duty cycle, so having
the fans run at 100% is sensible.  However, this shouldn't lead to an
over-temperature problem.

> > Is there a dbcool1 r2_temp with a similar value and limits?
> No dbcool1. But I only have one CPU installed on that board.

I guessed that there would be 2 chips of the same type, but it appears not.

> The full envstat output is
>              Current  CritMax  WarnMax  WarnMin  CritMin  Unit
> [dbcool0]
>      fan1:       N/A
>      fan2:       N/A
>      fan3:      2869                                      RPM
>      fan4:       N/A

> [lm0]
>      Fan0:       N/A
>      Fan1:      3245                                      RPM
>      Fan2:       N/A

I don't think that the chips that lm supports have fan speed adjustment,
so lm0 Fan1 will always run at 100%.  It does make sense for dbcool0 fan3
to be connected to the CPU fan, but it's hard to be sure.

> Yes, of course. I was wondering where that limit came from.

I see.  All the registers that control the sensor chips can be written or
read via i2c/smbus.  The firmware or BIOS should set them up sensibly, and
we don't alter them automatically in the drivers.  For example, I have:

              Current  CritMax  WarnMax  WarnMin  CritMin  Unit
[admtemp0]
  internal:    23.000   70.000                    -65.000  degC
  external:    44.000  110.000                    -65.000  degC

where "internal" is the ambient temperature and "external" is the CPU
temperature, and OFW has set up the maximum limits, but has left the
default minimum limits.  On this machine (Sun Blade 2000), if the limits
are exceeded, it will power down instantly in hardware.
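
To see how close each sensor is to its limit, something like the following
rough Python sketch could be used.  It is only an illustration, not anything
in the tree: the function names parse_envstat and headroom are made up, and
it assumes the column layout shown in the table above, assigning each value
to the header whose right edge is nearest.

```python
import re

# Sample text copied from the admtemp0 table above.
SAMPLE = """\
              Current  CritMax  WarnMax  WarnMin  CritMin  Unit
[admtemp0]
  internal:    23.000   70.000                    -65.000  degC
  external:    44.000  110.000                    -65.000  degC
"""


def parse_envstat(text):
    """Return {"device.sensor": {column: value}} from envstat-style text."""
    columns = []   # (header name, index just past its last character)
    sensors = {}
    device = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if "Current" in line and "Unit" in line:
            # Header line: remember where each column ends.
            columns = [(m.group(), m.end()) for m in re.finditer(r"\S+", line)]
            continue
        if stripped.startswith("["):
            device = stripped.strip("[]")
            continue
        label, sep, rest = line.partition(":")
        if not sep or not columns:
            continue
        offset = len(label) + 1   # index in the line where `rest` starts
        fields = {}
        for m in re.finditer(r"\S+", rest):
            end = offset + m.end()
            # Blank columns are simply absent, so match by position.
            name = min(columns, key=lambda c: abs(c[1] - end))[0]
            fields[name] = m.group()
        sensors[f"{device}.{label.strip()}"] = fields
    return sensors


def headroom(sensors):
    """Return CritMax minus Current for every sensor that reports both."""
    result = {}
    for name, fields in sensors.items():
        try:
            result[name] = float(fields["CritMax"]) - float(fields["Current"])
        except (KeyError, ValueError):
            continue   # no limit set, or a non-numeric reading such as N/A
    return result


for name, margin in headroom(parse_envstat(SAMPLE)).items():
    print(f"{name}: {margin:.1f} below CritMax")
```

Sensors without a CritMax, or with a reading of N/A, are skipped rather than
guessed at.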

> > Also, I see that the latest BIOS version appears to be 3.09 - is that the
> > version that you have?
> I'll have a look (I will need to physically go there or can I read this from some dmesg or sysctl?).

It looks like pkgsrc/sysutils/dmidecode might be able to do this.
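
If it does work, a small wrapper along these lines might save a trip to the
machine.  This is an untested sketch: bios_version is a made-up helper name,
it assumes dmidecode is installed and on the PATH, and dmidecode normally
needs root privileges to read the SMBIOS tables.

```python
import subprocess


def bios_version(run=subprocess.run):
    """Return the BIOS version string from dmidecode, or None on failure."""
    try:
        # "-s bios-version" asks dmidecode to print only that one string.
        proc = run(["dmidecode", "-s", "bios-version"],
                   capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None   # dmidecode missing, not permitted, or exited non-zero
    return proc.stdout.strip() or None
```

The run parameter exists only so that the call can be replaced when trying
the function out on a machine without dmidecode.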

Regards,

J

-- 
   My other computer runs NetBSD too   -         http://www.netbsd.org/

