Current-Users archive

Re: cpu temperature readings



    Date:        Sat, 1 Jul 2023 13:18:50 -0000 (UTC)
    From:        mlelstv%serpens.de@localhost (Michael van Elst)
    Message-ID:  <u7p93i$bpp$1%serpens.de@localhost>

  | To support the "turbo" speeds, you need higher voltages and it is plausible
  | that the voltages need to be set for the worst case because switching the
  | clock to "turbo" doesn't control the voltages (or not fast/precise enough).

That makes sense, thanks for the explanation.

  | You can probably avoid this, if you limit the chip to performance of the
  | non-selected die (in real applications it will probably lose 1-5%). The
  | BIOS should have a setting for the cTDP value that you can play with.

If I am understanding you, which I might not be, you mean slow down the
fastest cores from 5.5GHz (two cores are currently allowed to run that
fast, I found the settings for that) to (probably) 5.2GHz - the other
6 performance cores are currently limited to that (and I think that's
as fast as they're normally expected to run).   That I can do, I could
even make all of them 5GHz (the max freq, in units of 100MHz, can be set
for each core, separately).   That minor reduction isn't likely to matter.
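
To put a rough number on "minor", here is a small sketch of the arithmetic.
It assumes the BIOS values are plain multipliers of 100MHz (55 for 5.5GHz,
and so on); the function names are made up for illustration, and the
percentages are clock reductions only, not measured performance.

```python
def ratio_to_mhz(ratio: int) -> int:
    """Convert a per-core max-frequency setting (units of 100 MHz) to MHz."""
    return ratio * 100


def reduction_percent(old_mhz: int, new_mhz: int) -> float:
    """Relative clock reduction, in percent, when going from old to new."""
    return 100.0 * (old_mhz - new_mhz) / old_mhz


if __name__ == "__main__":
    fast = ratio_to_mhz(55)    # the two fastest cores, 5.5 GHz
    normal = ratio_to_mhz(52)  # the other six performance cores, 5.2 GHz
    capped = ratio_to_mhz(50)  # the "make them all 5 GHz" option

    print(f"{fast} -> {normal} MHz: {reduction_percent(fast, normal):.1f}% lower")
    print(f"{fast} -> {capped} MHz: {reduction_percent(fast, capped):.1f}% lower")
    print(f"{normal} -> {capped} MHz: {reduction_percent(normal, capped):.1f}% lower")
```

Only the peak clock of the affected cores drops by these amounts; the effect
on real workloads would presumably be smaller.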

[Aside: I also noticed that the BIOS claims that the min available
frequency is 400MHz ... NetBSD thinks 800MHz is as slow as it should go,
that's the min value in machdep.cpu.frequency.available].
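
A minimal sketch of that comparison, assuming the value of
machdep.cpu.frequency.available is a whitespace-separated list of
frequencies in MHz. The sample string is invented for illustration and is
not output from this machine; the helper name is hypothetical too.

```python
def parse_available_mhz(sysctl_value: str) -> list[int]:
    """Parse a whitespace-separated list of MHz values, sorted ascending."""
    return sorted(int(tok) for tok in sysctl_value.split())


if __name__ == "__main__":
    # Hypothetical sample value, NOT taken from the system discussed here.
    sample = "5200 4000 3000 2000 800"
    bios_min_mhz = 400  # the minimum the BIOS claims, per the message

    freqs = parse_available_mhz(sample)
    os_min_mhz = freqs[0]
    print(f"lowest frequency offered by the OS: {os_min_mhz} MHz")
    print(f"BIOS claims {bios_min_mhz} MHz; gap: {os_min_mhz - bios_min_mhz} MHz")
```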


I got to look at all that as the system shut itself down again in the early
hours of this morning (here) - A/C was on, so room was cool, I had turbo
mode enabled, just to see if it would still cause a problem, and it seems
that it did (at the minute, as long as I leave that off, the system is
stable).   Note that I am still just guessing that thermal issues are what
is causing this.   Almost always the system is just running fine, with
envstat reporting elevated temperatures, but nothing close to 100 (the
highest I saw before the shutdown was in the low 60s, but I wasn't
actually watching those numbers at the time), and then it is off.
No warning (that I saw anyway) - just off.   This time I restarted
immediately, and as soon as I could, looked at the BIOS's cpu temp
value - that was about 36C.   But I don't believe the BIOS uses turbo
mode, so it would have been running slower, and the BIOS spends
quite a bit of time doing whatever it does, before it allows any kind of
interaction.

Note "early hours" here means very early: at 14:17 now, the system has
been up 13:44, so the shutdown must have been between 00:00 and 00:30.
That's well before cron starts running any of the daily/weekly stuff, so
the system should still have been mostly idle (no builds happening, not
even cvs update, or anything like that - just a couple of unrelated net
downloads happening in the background, and not all that quickly at that,
maybe 3Mbps total, probably slightly less).
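
The boot-time estimate above is just the current time minus the uptime; a
small sketch of that subtraction follows. The calendar date in it is an
arbitrary placeholder (only the clock time and uptime come from the
message), and the function name is made up for illustration.

```python
from datetime import datetime, timedelta


def boot_time(now: datetime, uptime: timedelta) -> datetime:
    """Return the moment the system came up, given the time now and its uptime."""
    return now - uptime


if __name__ == "__main__":
    # Placeholder date; only 14:17 and the 13:44 uptime are from the message.
    now = datetime(2023, 7, 2, 14, 17)
    uptime = timedelta(hours=13, minutes=44)

    booted = boot_time(now, uptime)
    print("system came up at", booted.strftime("%H:%M"))
```

The shutdown itself has to be somewhat earlier than the computed boot time,
since the restart and the look through the BIOS took a while.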

When that happened, I had seen your message, but hadn't formed any real
comprehension as to what it might have meant - but that's what inspired
me to go looking at BIOS settings I would never normally go near, and where
I found (but did not alter) the "max turbo rate" (per core) settings.

kre



