Current-Users archive


Re: -current amd64 does not boot on huge machine (80 cores, RAM 1TB)



On 03.10.11 14:15, Nicolas Joly wrote:
> 
> Hi,
> 
> We just got, at work, a new toy ... This is a Supermicro SuperServer
> 5086B-TRF[1] machine, with 80 cores and RAM 1TB. Unfortunately, i
> cannot boot -current amd64 on it.
> 
> Using a non DIAGNOSTIC kernel does not help, except that
> i82489_icr_wait does not fire anymore as expected.
> 
> Normal boot hang when probing cpu0, SMP disabled boot fails with
> KASSERT and ACPI disabled boot hang when probing cpu1.
> 
> Attached corresponding dmesg buffers.
> 
> Any idea where to look for ?
> Thanks.
> 
> [1] http://www.supermicro.com/products/system/5U/5086/SYS-5086B-TRF.cfm
> 

From the dmesg it looks like this machine has two PCI host controllers.

If this is the case, then the problem is in parsing the interrupt routing
from ACPI.
The parser does not deal with ACPI PCI segments. So when the PCI bus, device
and function numbers are the same on both host controllers, the interrupt
routing from the first host controller is overridden by the information
from the second PCI host controller.

This makes the interrupt handler wait for interrupts from the second PCI
host controller while they actually come from the first one
=> hang at boot.
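
To illustrate the collision, here is a toy model in Python. It is only a sketch, not the actual NetBSD parser: the function name build_routes, the tuple layout and the IRQ numbers 16 and 40 are all made up for the example. It shows how a table keyed on bus/device/function alone loses the first controller's entry, while adding the segment to the key keeps both.

```python
# Toy model of an interrupt routing table (hypothetical, not NetBSD code).
# Each entry is (segment, bus, device, function, irq); IRQ values are invented.

def build_routes(entries, use_segment):
    """Return a dict mapping a PCI address key to its IRQ."""
    table = {}
    for seg, bus, dev, func, irq in entries:
        if use_segment:
            key = (seg, bus, dev, func)
        else:
            key = (bus, dev, func)
        # A later entry with the same key silently replaces the earlier one.
        table[key] = irq
    return table

entries = [
    (0, 0, 1, 0, 16),  # first host controller: segment 0, bus 0, dev 1, func 0
    (1, 0, 1, 0, 40),  # second host controller: same bus/dev/func, segment 1
]

# Without the segment, both entries share one key and the first IRQ is lost.
flat = build_routes(entries, use_segment=False)

# With the segment in the key, both routes survive.
full = build_routes(entries, use_segment=True)

print(flat)
print(full)
```

With the segment ignored, a lookup for the device behind the first host controller returns the second controller's IRQ, which matches the hang described above.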

Christoph

