Current-Users archive


Re: cpu temperature readings



kre%munnari.OZ.AU@localhost (Robert Elz) writes:

>  | You can probably avoid this, if you limit the chip to performance of the
>  | non-selected die (in real applications it will probably lose 1-5%). The
>  | BIOS should have a setting for the cTDP value that you can play with.

>If I am understanding you, which I might not be, you mean slow down the
>fastest cores from 5.5GHz (two cores are currently allowed to run that
>fast, I found the settings for that) to (probably) 5.2GHz - the other
>6 performance cores are currently limited to that (and I think that's
>as fast as they're normally expected to run).

Turbo speed is controlled by the power (dissipation) budget. On
some CPUs you don't control the clock itself but the available
power, and that should also be possible with the i9-12900.

In the end that means the chip either won't reach its maximum turbo
speed, or will hold it only for a shorter time, or only when cooled
better. The value that corresponds to this is called cTDP (it is
usually used to raise the budget for extreme overclocking, but it can
also be reduced).

I haven't seen such a setting in the Asrock Z690 BIOS though.


>[Aside: I also noticed that the BIOS claims that the min available
>frequency is 400MHz ... NetBSD thinks 800MHz is as slow as it should go,
>that's the min value in machdep.cpu.frequency.available].

The values probably come from ACPI. I first thought there was a limit
of 16 states, but we (arbitrarily) have a limit of 256. So either
ACPI doesn't show all states that you can see in the BIOS interface
or we have a bug.
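
To tell which of the two it is, one could compare what the kernel
exports with what the BIOS shows. A minimal sketch, assuming the
sysctl value is a whitespace-separated list of MHz values; the sample
string and the helper names are made up for illustration and are not
output from the machine in question:

```python
import subprocess

def parse_frequencies(raw):
    """Parse a whitespace-separated list of MHz values into sorted ints."""
    return sorted(int(tok) for tok in raw.split())

def read_available(node="machdep.cpu.frequency.available"):
    """Read the list via sysctl(8); only works where this node exists."""
    out = subprocess.run(["sysctl", "-n", node],
                         capture_output=True, text=True, check=True)
    return parse_frequencies(out.stdout)

def summarize(freqs, bios_min_mhz):
    """Report state count, range, and whether the BIOS minimum is missing."""
    return {
        "count": len(freqs),
        "min": freqs[0],
        "max": freqs[-1],
        "bios_min_missing": bios_min_mhz not in freqs,
    }

if __name__ == "__main__":
    sample = "5200 4800 4000 3200 2400 1600 800"  # hypothetical values
    print(summarize(parse_frequencies(sample), bios_min_mhz=400))
```

If the count is well below 256 and the list simply stops at 800, that
would point at ACPI not exporting the lower states rather than at our
limit.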


>I got to look at all that as the system shut itself down again in the early
>hours of this morning (here) - A/C was on, so room was cool, I had turbo
>mode enabled, just to see if it would still cause a problem, and it seems
>that it did (at the minute, as long as I leave that off, the system is
>stable).

coretemp doesn't have thresholds, so it cannot trigger powerd to shut
the system down.
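
Since nothing will fire on a threshold, one way to see what the
sensors reported just before a power-off is to log them yourself and
force every sample to disk. A rough sketch only: the device name
coretemp0, the log path and the function names are assumptions, and
the raw envstat output is stored unparsed:

```python
import os
import subprocess
import time

def log_sample(cmd, path):
    """Run cmd once, append its timestamped output to path, and fsync."""
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as fh:
        fh.write(f"--- {stamp}\n{out}")
        fh.flush()
        os.fsync(fh.fileno())  # make the sample survive a sudden power-off
    return out

def run(cmd, path, interval=1.0, samples=None):
    """Log repeatedly; samples=None means run until interrupted."""
    n = 0
    while samples is None or n < samples:
        log_sample(cmd, path)
        n += 1
        time.sleep(interval)

# Example (assumed device name and path):
# run(["envstat", "-d", "coretemp0"], "/var/tmp/coretemp.log", interval=1.0)
```

With a one second interval the last entry in the log should be close
to what the sensors read when the machine went off, although the
temperature can still move a lot within a second.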


>  Note that I am still just guessing that thermal issues are what
>is causing this, almost always the system is just running fine, with
>envstat reporting elevated temperatures, but nothing close to 100 - the
>highest I saw before the shutdown were in the low 60's - but I wasn't
>actually watching those numbers at the time), and then it is off.
>No warning (that I saw anyway) - just off.   This time I restarted
>immediately, and as soon as I could, looked at the BIOS's cpu temp
>value - that was about 36C.   But the BIOS doesn't use turbo mode I
>don't believe, so it would have been running slower, and the BIOS spends
>quite a bit of time doing whatever it does, before it allows any kind of
>interaction.

An immediate power-off also doesn't suggest that this is a shutdown. I
would guess it's either the CPU reaching its limit (unlikely given your
description, but the temperature can change very, very quickly) or
something completely different (motherboard power regulators or even
the PSU?).

On server motherboards you would often have a BMC logging the issue.
The Z690 Taichi BIOS seems to have an event log, but I'm not sure what
it actually logs.


