Current-Users archive


Re: -current amd64 does not boot on huge machine (80 cores, RAM 1TB)



On 06.10.11 04:38, Christoph Egger wrote:
> On 03.10.11 14:15, Nicolas Joly wrote:
>>
>> Hi,
>>
>> We just got, at work, a new toy ... This is a Supermicro SuperServer
>> 5086B-TRF[1] machine, with 80 cores and 1TB of RAM. Unfortunately, I
>> cannot boot -current amd64 on it.
>>
>> Using a non DIAGNOSTIC kernel does not help, except that
>> i82489_icr_wait does not fire anymore as expected.
>>
>> Normal boot hangs when probing cpu0, SMP disabled boot fails with a
>> KASSERT, and ACPI disabled boot hangs when probing cpu1.
>>
>> Attached corresponding dmesg buffers.
>>
>> Any idea where to look?
>> Thanks.
>>
>> [1] http://www.supermicro.com/products/system/5U/5086/SYS-5086B-TRF.cfm
>>
> 
> From the dmesg it looks like this machine has two PCI host controllers.
> 
> If this is the case then the problem is in parsing the interrupt routing
> from ACPI.
> The parser does not deal with ACPI PCI segments. So when PCI bus, device
> and function numbers are the same then the interrupt routing from the
> first host controller is overridden with the information from the second
> PCI host controller.

The code I am talking about is in sys/arch/x86/x86/mpacpi.c

> This makes the interrupt handler wait for interrupts coming from the
> second PCI host controller while they actually come from the first one
> => hang at boot.
> 
> Christoph
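
To illustrate the collision described above, here is a minimal C sketch.
It is not the code in mpacpi.c: struct route, route_add, route_lookup and
the use_segment flag are names made up for this example, and the
interrupt numbers are arbitrary. It only shows how a routing table keyed
on bus/device/function alone loses an entry when two host controllers
reuse the same numbers, and how adding the segment to the key avoids it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing entry, for illustration only (not from mpacpi.c). */
struct route {
	uint16_t segment;	/* PCI segment, i.e. which host controller */
	uint8_t bus, dev, func;
	int gsi;		/* interrupt the device is routed to */
};

#define MAX_ROUTES 16

struct route_table {
	struct route entries[MAX_ROUTES];
	size_t count;
};

/* Two entries name the same device if their keys match. */
static int
same_key(const struct route *a, const struct route *b, int use_segment)
{
	if (use_segment && a->segment != b->segment)
		return 0;
	return a->bus == b->bus && a->dev == b->dev && a->func == b->func;
}

/* Add an entry; an existing entry with the same key is overwritten. */
static void
route_add(struct route_table *t, struct route r, int use_segment)
{
	for (size_t i = 0; i < t->count; i++) {
		if (same_key(&t->entries[i], &r, use_segment)) {
			t->entries[i] = r;
			return;
		}
	}
	assert(t->count < MAX_ROUTES);
	t->entries[t->count++] = r;
}

/* Return the routed interrupt for key, or -1 if there is no entry. */
static int
route_lookup(const struct route_table *t, struct route key, int use_segment)
{
	for (size_t i = 0; i < t->count; i++) {
		if (same_key(&t->entries[i], &key, use_segment))
			return t->entries[i].gsi;
	}
	return -1;
}
```

With use_segment set to 0, adding a device at segment 0, bus 0, device 1,
function 0 and then one at segment 1 with the same bus, device and
function leaves a single entry, and a lookup for the first device
returns the second device's interrupt. With use_segment set to 1 both
entries are kept and each lookup returns its own interrupt.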


