Subject: Re: kern/32973
To: None <port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,>
From: Perry E. Metzger <perry@piermont.com>
List: netbsd-bugs
Date: 06/12/2006 20:30:02
The following reply was made to PR port-i386/32973; it has been noted by GNATS.
From: "Perry E. Metzger" <perry@piermont.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/32973
Date: Mon, 12 Jun 2006 15:24:30 -0400
Forwarded message:
From: "Perry E. Metzger" <perry@piermont.com>
Subject: Re: no APM idle bug...
Date: Mon, 12 Jun 2006 14:30:59 -0400
This post explains what I've learned in re-investigating PR 32973, the
APM idle issue.
The situation is this. A long time ago, the code in our x86 kernel that
calls apm_cpu_idle in the cpu idle loop was ripped out. We were
investigating putting it back in, but there turns out to be a problem
in doing that, and it turns out that there is no point in putting it
back in anyway.
In our current idle loop, most execution happens with interrupts
disabled. If nothing is going on, we execute STI then HLT. STI enables
interrupts only after the instruction that follows it, so no interrupt
can be taken between the STI and the HLT. HLT pauses the cpu (saving
power on a modern machine) until an interrupt comes in. Because
interrupts are enabled by the time the cpu is halted, by virtue of the
STI just before the HLT, an incoming interrupt wakes the cpu and is
processed immediately.
Just afterwards, the code checks if anything is now runnable (which it
very well might be because of the interrupt we just processed).
Now, we had a bug in place for a while where we had instructions
between the STI and the HLT. That meant that an interrupt could happen
between the two, say a disk interrupt, and even though something might
have been schedulable after that interrupt, we'd call HLT, and do
nothing until the next clock interrupt passed, thus wasting up to 10ms
of CPU time. This race condition turned out to be hit quite often,
because the scheduler runs entirely with interrupts off, so any pending
interrupts are processed just as soon as the STI takes effect. Not
surprisingly, performance went up after this was diagnosed and fixed.
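To make the difference between the two orderings concrete, here is a
toy model in C. It is only a sketch under stated assumptions: it does
not execute STI or HLT (both are privileged), and struct cpu,
deliver_irq, hlt, idle_sti_hlt, idle_with_gap and the 10ms tick
constant are names invented for illustration, not anything taken from
the kernel source. It assumes at most one device interrupt is pending
when the idle loop reaches STI.

```c
#include <stdbool.h>

/* Toy model of the idle-loop race. Hypothetical names, not kernel code. */
enum { CLOCK_TICK_MS = 10 };    /* assumed clock interrupt period */

struct cpu {
    bool irq_pending;   /* a device interrupt is waiting to be delivered */
    bool runnable;      /* set by the interrupt handler when work is ready */
    int  wasted_ms;     /* time spent halted while something was runnable */
};

/* Deliver a pending interrupt; the handler makes a process runnable. */
static void deliver_irq(struct cpu *c)
{
    if (c->irq_pending) {
        c->irq_pending = false;
        c->runnable = true;
    }
}

/*
 * Model of HLT with interrupts enabled: a pending interrupt wakes the
 * cpu at once. Otherwise the cpu sleeps until the next clock tick, and
 * that time counts as wasted only if something was already runnable.
 */
static void hlt(struct cpu *c)
{
    if (c->irq_pending)
        deliver_irq(c);
    else if (c->runnable)
        c->wasted_ms += CLOCK_TICK_MS;
}

/* STI immediately followed by HLT: nothing can be delivered in between. */
static int idle_sti_hlt(struct cpu *c)
{
    hlt(c);
    return c->wasted_ms;
}

/* STI, some other instruction, then HLT: the interrupt lands in the gap. */
static int idle_with_gap(struct cpu *c)
{
    deliver_irq(c);     /* interrupts are live before we reach HLT */
    hlt(c);
    return c->wasted_ms;
}
```

With one interrupt pending, idle_sti_hlt() reports 0ms wasted, while
idle_with_gap() reports 10ms: the interrupt was consumed in the gap, so
nothing was left to wake the HLT before the next clock tick.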
Now, what about the APM stuff?
Well, the APM call has to be made with interrupts ENABLED. (The spec
doesn't actually say that, but if you try calling the APM entry point
with interrupts disabled, your machine hangs.)
That means that an interrupt could come in and be processed BEFORE the
APM code itself calls HLT, thus re-introducing the race condition and
the performance killing bug! It is hard to fix this race condition
because we don't control the APM implementation.
So, the question then becomes, what do we gain by implementing the APM
cpu idle call?
The answer is, not much. The call would either slow the clock (not
done on modern machines) or stop it -- i.e. it would do a HLT on our
behalf. The documentation (see page 21 of the APM 1.2 spec if you're
interested) claims it does *not* power down peripheral devices or any
such thing -- just slows or halts the clock. So, we don't actually gain
anything by calling the APM cpu idle entry point that we weren't
getting on our own already, and we introduce a race condition.
As another data point, FreeBSD doesn't bother with the APM call, either.
I'd therefore suggest that we close PR 32973. Valeriy, who submitted
the PR, seems to agree with me that it should be closed.
Any objections to our closing the PR and moving on?
Perry