tech-kern archive
Re: A hint for my Core 2 Duo MP bug
Mindaugas,
this is kern/38798, but I am not sure I filed it in the right category.
> Is there a filed PR about this? If no, can you please file it. Can you
> provide more details about the problem (eg. dmesg, backtrace if crashes)?
> Have you tried only amd64, or also i386?
Both. i386 and amd64 behave the same way, which makes me think the bug
lies in the common x86 code, or in a file that is identical in both. But
I am totally unable to work out what is going on there.
The problem itself is very simple. At boot, cpu0 starts correctly but
cpu1 does not: I get the message "cpu1 failed to become ready". The
kernel carries on for a while, then stops before it forks init (usually,
it hangs while scanning the ATAPI bus).
Digging deeper, Andrew and I set up a basic trace in mptramp.S, more
precisely in the cpu_spinup_trampoline function. None of the HALT macros
is reached. The trace stays locked at 40 FF FF throughout the delay
loop, as if cpu_spinup_trampoline were never called or executed.
However, I have tried deliberately introducing mistakes into the code to
see whether they had any consequences, or whether the second core was
held in a permanent halt state. If I remove the switch to protected
mode, nothing happens. But if I remove the .code16 preamble ahead of the
first part of the function, the computer enters a boot/reboot cycle. It
seems the code does get executed by the second core, but well after the
delay loop has expired.
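
To make the suspected sequence concrete, here is a small user-space
sketch of the handshake as I picture it: one side kicks the other, then
polls a "ready" flag for a bounded time. This is not the NetBSD code.
struct fake_cpu, secondary_main and boot_secondary are names made up for
the illustration, and a pthread stands in for the second core.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the per-CPU state the boot CPU polls. */
struct fake_cpu {
    atomic_bool ready;     /* set by the secondary once it is running */
    int startup_delay_ms;  /* how long the secondary takes to get going */
};

static void sleep_ms(int ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Plays the secondary core: starts late, then announces itself. */
static void *secondary_main(void *arg)
{
    struct fake_cpu *ci = arg;

    sleep_ms(ci->startup_delay_ms);
    atomic_store(&ci->ready, true);
    return NULL;
}

/*
 * Plays the boot CPU: start the secondary, then poll the flag in roughly
 * 1 ms steps for at most timeout_ms iterations. Returns true if the
 * secondary became ready in time. *late_ready reports whether the flag
 * was set by the time the secondary finished, even after a timeout.
 */
static bool boot_secondary(int startup_delay_ms, int timeout_ms,
                           bool *late_ready)
{
    struct fake_cpu ci;
    pthread_t t;
    bool ok = false;

    assert(startup_delay_ms >= 0 && timeout_ms >= 0);
    if (late_ready != NULL)
        *late_ready = false;
    atomic_init(&ci.ready, false);
    ci.startup_delay_ms = startup_delay_ms;
    if (pthread_create(&t, NULL, secondary_main, &ci) != 0)
        return false;

    for (int i = 0; i < timeout_ms; i++) {
        if (atomic_load(&ci.ready)) {
            ok = true;
            break;
        }
        sleep_ms(1);
    }
    /* A kernel would just carry on here; the sketch joins so that ci,
     * which lives on this stack, stays valid for the thread. */
    pthread_join(t, NULL);
    if (late_ready != NULL)
        *late_ready = atomic_load(&ci.ready);
    return ok;
}
```

Calling boot_secondary(500, 5, &late) returns false, the equivalent of
"failed to become ready", yet late ends up true: the secondary did run,
only after the waiter had given up. That is the pattern the reboot cycle
above hints at.
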
This is not simply a matter of the delay being too short, though: I've
tried increasing the delay loop by one or two orders of magnitude, to no
avail. Something else strange: for a while, I could get both cores
started as follows. With a kernel compiled with the MPDEBUG option, it
would drop into ddb after the "cpu1 failed to become ready" message;
then simply typing "cont" would start cpu1 and resume normal kernel boot
with both cores enabled. This "workaround" ceased to work at some point.
On the whole, it seems the second core is waiting for the first to do
something, something that does not happen, or happens too late or out
of sync. NetBSD 4 boots both cores correctly, as do FreeBSD and Linux.
Andrew suspected a BIOS problem, but since all the other OSes work
correctly, I suspect something has changed in MP boot that affects
this machine in particular (but why?).
There it is, thanks for your interest.
Vincent