netbsd-bugs: Re: kern/32973

Subject: Re: kern/32973
To: None <port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,>
From: Perry E. Metzger <perry@piermont.com>
List: netbsd-bugs
Date: 06/12/2006 20:30:02
The following reply was made to PR port-i386/32973; it has been noted by GNATS.

From: "Perry E. Metzger" <perry@piermont.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/32973
Date: Mon, 12 Jun 2006 15:24:30 -0400

 Forwarded message:
 
 From: "Perry E. Metzger" <perry@piermont.com>
 Subject: Re: no APM idle bug...
 Date: Mon, 12 Jun 2006 14:30:59 -0400
 
 
 This post explains what I've learned in re-investigating PR 32973, the
 APM idle issue.
 
 The situation is this. A long time ago, the code in our x86 kernel that
 calls apm_cpu_idle in the cpu idle loop was ripped out. We were
 investigating putting it back in, but there turns out to be a problem
 in doing that, and it turns out that there is no point in putting it
 back in anyway.
 
 In our current idle loop, most execution happens with interrupts
 disabled. If nothing is going on, we call STI then HLT. STI enables
 interrupts, but the enable takes effect only after the instruction
 that follows it, so no interrupt can be delivered between the STI and
 the HLT. HLT pauses the cpu (saving power on a modern machine) until
 an interrupt comes in; with interrupts disabled it would not wake for
 an ordinary maskable interrupt at all. Because of the STI just before
 it, the cpu halts with interrupts enabled, so an incoming interrupt
 wakes it and is processed immediately, and execution then continues
 with the instruction after the HLT. Just afterwards, the code checks
 whether anything is now runnable (which it very well might be because
 of the interrupt we just processed).
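 
 The STI/HLT sequence can be illustrated with a toy model. This is a
 sketch only, not kernel code: run(), the mnemonic strings, and the
 trace messages are all made up for illustration. It models the
 interrupt flag, the one-instruction delay after STI, and a HLT that
 is woken by a pending interrupt.

```python
# Toy model of the x86 IF flag, the one-instruction "shadow" after STI,
# and HLT. Illustration only: not kernel code, and every name is made up.

def run(program, irq_at=0):
    """Run a list of mnemonics; a maskable interrupt becomes pending just
    before step `irq_at`. Returns a trace of what happened."""
    iflag = False    # IF: are maskable interrupts enabled?
    shadow = False   # True only for the instruction right after STI
    pending = False  # an interrupt is waiting to be delivered
    trace = []
    for step, insn in enumerate(program):
        if step == irq_at:
            pending = True
        # Deliver at an instruction boundary, unless masked or in the shadow.
        if pending and iflag and not shadow:
            trace.append("irq handled before " + insn)
            pending = False
        shadow = False
        if insn == "sti":
            shadow = not iflag   # shadow applies when IF goes from 0 to 1
            iflag = True
        elif insn == "hlt":
            if pending and iflag:
                trace.append("hlt woken by irq")
                pending = False
            else:
                trace.append("hlt sleeps until a later interrupt")
        else:
            trace.append(insn)
    return trace


if __name__ == "__main__":
    # The interrupt arrives while interrupts are still off, before the STI.
    print(run(["sti", "hlt", "check_runnable"]))
```

 With the interrupt pending before the STI, the model never handles
 it ahead of the HLT: the HLT is entered, woken at once, and the
 runnable check follows.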
 
 Now, we had a bug in place for a while where we had instructions
 between the STI and the HLT. That meant that an interrupt could happen
 between the two, say a disk interrupt, and even though something might
 have been schedulable after that interrupt, we'd call HLT, and do
 nothing until the next clock interrupt arrived, thus wasting up to
 10ms of CPU time. This race was hit quite often, because the
 scheduler runs entirely with interrupts off, so any interrupts that
 arrived in the meantime are processed as soon as the STI takes
 effect. Not surprisingly, performance went up after this was
 diagnosed and fixed.
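 
 As a rough worked example of the "up to 10ms" figure: the sketch
 below assumes a 100Hz clock interrupt (HZ = 100, inferred from the
 10ms figure rather than stated here) and a hypothetical helper,
 wasted_us(), that measures the time from an interrupt handled just
 before the HLT to the next clock tick.

```python
# Back-of-the-envelope cost of the race: an interrupt is handled just
# before HLT, and nothing wakes the CPU until the next clock tick.
# HZ = 100 is an assumption inferred from the "up to 10ms" figure.

HZ = 100
TICK_US = 1_000_000 // HZ   # 10000 microseconds between clock interrupts


def wasted_us(irq_time_us):
    """Microseconds spent halted after an interrupt at `irq_time_us`,
    measured from the interrupt to the next clock tick."""
    next_tick = (irq_time_us // TICK_US + 1) * TICK_US
    return next_tick - irq_time_us


if __name__ == "__main__":
    for t in (0, 2_500, 9_999):
        print(t, wasted_us(t))
```

 An interrupt landing right after a tick costs nearly the full 10ms,
 while one landing just before the next tick costs almost nothing.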
 
 Now, what about the APM stuff?
 
 Well, the APM call has to be made with interrupts ENABLED. (The spec
 doesn't actually say that, but if you try calling the APM entry point
 with interrupts disabled, your machine hangs.)
 
 That means that an interrupt could come in and be processed BEFORE the
 APM code itself calls HLT, thus re-introducing the race condition and
 the performance killing bug! It is hard to fix this race condition
 because we don't control the APM implementation.
 
 So, the question then becomes, what do we gain by implementing the APM
 cpu idle call?
 
 The answer is, not much. The call would either slow the clock (not
 done on modern machines) or stop it -- i.e. it would do a HLT on our
 behalf. The documentation (see page 21 of the APM 1.2 spec if you're
 interested) claims it does *not* power down peripheral devices or any
 such thing -- it just slows or halts the clock. So, we don't actually
 gain anything by calling the APM cpu idle entry point that we weren't
 already doing on our own, and we would introduce a race condition.
 
 As another data point, FreeBSD doesn't bother with the APM call, either.
 
 I'd therefore suggest that we close PR 32973. Valeriy, who submitted
 the PR, seems to agree with me that it should be closed.
 
 Any objections to our closing the PR and moving on?
 
 Perry