Subject: Re: Different speed CPUs show up as same speed
To: None <tech-smp@netbsd.org>
From: John Klos <john@sixgirls.org>
List: tech-smp
Date: 06/17/2002 19:45:37
> I only see problems where there are problems; we're not talking about
> hypothetical computers, we're talking about actual computers where we can
> actually run code. I know that neither my kernels nor any of my binaries
> are compiled with SSE. I also know that most distributed binaries
> certainly never use such code.

Well, but what about others? Having the possibility to use Intel's C
Compiler enables one to use SSE without changing a bit of code, and the
speed improvement is very high (on my box it was a factor of 2 compared to
gcc 2.95.3 with povray). So there is a point - just more logic has to be
implemented in the kernel, for just a bunch of people - do we want to be
an operating system designed only for a fraction of computer users?

Are you running Intel's C compiler on NetBSD? If so, don't you think you'd
know if you were compiling binaries with SSE support?

Your concerns are unfounded - no extra support is needed. We're not
talking about "more logic" which "has to be implemented". It works and it
works now. I'd like to see that the code is correct in that it either
reports the proper speed for each CPU or only reports the speed of one.


> An example where assumptions would be bad: what if a CPU in a dual
> processor system started overheating, and was automatically throttled
> down? Would we want our kernel to panic because one CPU is now a different
> speed than the other? I prefer correct code.

I don't think this is a good example, because the CPUs are assumed to run
at full speed under normal circumstances. If one CPU slows down, the best
solution would only be to shut down this box to exchange broken hardware
or to set up better cooling. If the box is built correctly, this should
only happen with broken fans, which most MP boards signal with loud
beeping, so why not change the fan before the CPU overheats (most Intel
CPUs in my experience run for some time before overheating if a heatsink
is attached)? And I don't think the kernel will panic. As far as I
understand these calibration loops, they are for scheduling and such,
when an exact number of seconds or so has to be hit, so some process just
doesn't get its time or is woken up too late. The system will run, but
slowly, won't it?

Saying that "it shouldn't happen, so why support it?" is not a part of the
NetBSD philosophy. Of course CPUs overheat from time to time. CPUs even
overheat when the fans are working and heat sink compound has been
properly applied. Imagine if an air conditioner failed in the summer or
something.

My point is still the same: a calibrated loop in a sensitive part of some
kernel code should know what processor it is running on; this would be
correct. (That is not the same thing that you're talking about; we're
talking microseconds, not seconds.) Assuming that an overheating computer
is broken and therefore deciding that we don't care about what happens
then would be irresponsible.
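
To make the idea concrete, here is a minimal sketch in C of a delay loop
that keeps a separate calibration for each CPU. Everything in it is
hypothetical and for illustration only: struct cpu_calib, calibrate(),
delay_loops(), MAXCPUS, and the "two cycles per loop iteration" figure are
assumptions of mine, not NetBSD's actual code or data structures.

```c
#include <assert.h>

#define MAXCPUS 2   /* hypothetical: a dual-processor box */

/* Hypothetical per-CPU calibration record (not NetBSD's struct cpu_info). */
struct cpu_calib {
    unsigned long mhz;          /* measured clock of this CPU, in MHz */
    unsigned long loops_per_us; /* delay-loop iterations per microsecond */
};

static struct cpu_calib cpus[MAXCPUS];

/*
 * Record the calibration for one CPU.
 * Assumption: one iteration of the delay loop costs about 2 cycles,
 * so a CPU running at N MHz completes N/2 iterations per microsecond.
 */
static void calibrate(int cpu, unsigned long mhz)
{
    assert(cpu >= 0 && cpu < MAXCPUS);
    cpus[cpu].mhz = mhz;
    cpus[cpu].loops_per_us = mhz / 2;
}

/*
 * Iterations needed to spin for `us` microseconds on `cpu`.
 * The count comes from the calling CPU's own record rather than
 * from a single global value shared by all CPUs.
 */
static unsigned long delay_loops(int cpu, unsigned long us)
{
    assert(cpu >= 0 && cpu < MAXCPUS);
    return cpus[cpu].loops_per_us * us;
}
```

With calibrate(0, 500) and calibrate(1, 400), a 10 microsecond delay needs
a different loop count on each CPU. If CPU 1 is later throttled, calling
calibrate(1, 200) updates only that CPU's record and CPU 0 is unaffected.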

Again, I am not trying to start some sort of "support campaign". Rather,
I am slightly bothered by some people's desire to see this declared "bad"
instead of out of the ordinary but cool, and possibly worthy of a little
speculation about how considering such possibilities might make for more
correct code in the long run.

That's all!

(BTW - what's up with all of those =20s?)

Thanks,
John Klos
Sixgirls Computing Labs