Current-Users archive


Re: XEN3_DOM0 5.99.55 on grinds to a halt shortly after boot (repeatedly)



Hi Jean-Yves,

On Jul 30, 2011, at 00:07 , Jean-Yves Migeon wrote:

> On 29.07.2011 23:28, Johan Ihrén wrote:
>> This is 5.99.55 from about July 20th/21st.
>> 
>> 1. If I boot GENERIC it doesn't hang, only XEN3_DOM0 hangs.
>> Furthermore, when GENERIC is idling (no services, nothing), it is still
>> seeing ~10% interrupts on CPU0. That seems to be an awful lot...
>> 
>> 2. XEN3_DOM0 grinds to a halt also in single user, although it may
>> take slightly longer to get there.
>> 
>> 3. "Grinds to a halt" is not the same as "hangs". There is something
>> going on, it is just that it has to be measured in geological time
>> units.
> 
> It's very likely to be interrupt related, yes.
> 
>> * Once it happened during fsck and I decided to leave it to it. fsck
>> of a 30GB filesystem took several hours (but did complete).
>> 
>> * I typed "root<RETURN>" at the login prompt when the machine had
>> gone catatonic and nothing happened. An hour later there was still no
>> change. But the next morning there was the expected "Password:"
>> prompt on the console ;-)
>> 
>> Suggestions for what to try next would be much appreciated.
> 
> At the ddb(4) prompt, try "show event". You may also try "show event" first,
> then "continue", then break into ddb again and run "show event" again.

All counters are in the tens or hundreds except for two:

event type 1: vcpu0 ioapic0 pin 20 = 708325251
event type 1: vcpu0 clock = 814442

I suspect that at least the first one qualifies as "ridiculously high" ;-)
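To put the two counters in proportion, here is a minimal sketch that parses the two "show event" lines quoted above and expresses the pin 20 count relative to the clock counter. The regular expression and the helper names (`parse_counters`, `per_clock_tick`) are illustrative assumptions, not part of ddb or any NetBSD tool, and the line format is assumed to match the two lines shown.

```python
import re

# The two counter lines quoted in this message.
SHOW_EVENT = """\
event type 1: vcpu0 ioapic0 pin 20 = 708325251
event type 1: vcpu0 clock = 814442
"""

# Assumed line shape: "event type N: <name> = <count>"
LINE_RE = re.compile(r"^event type \d+: (?P<name>.+?) = (?P<count>\d+)$")


def parse_counters(text):
    """Return {counter name: count} for every line matching LINE_RE."""
    counters = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            counters[m.group("name")] = int(m.group("count"))
    return counters


def per_clock_tick(counters, clock_name="vcpu0 clock"):
    """Express every other counter as events per clock event."""
    ticks = counters[clock_name]
    return {
        name: count / ticks
        for name, count in counters.items()
        if name != clock_name
    }


counters = parse_counters(SHOW_EVENT)
ratios = per_clock_tick(counters)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} events per clock event")
```

With the numbers above, pin 20 has fired several hundred times for every clock event, which is consistent with an interrupt storm rather than normal disk activity.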

> If an event counter is ridiculously high (rate or total count), try
> looking up which interrupt line it corresponds to via "dmesg" (also
> available from the ddb prompt).

I'm not entirely sure what to look for here, but searching dmesg for ioapic
"pins" I notice that a few pins (16-19) are used by various PCI devices.
"Pin 20" is not one of them.

However, pin 20 is mentioned later on:

pciide0: using ioapic0 pin 20, event channel 7 for native-PCI interrupt
...
pciide1: using ioapic0 pin 20, event channel 7 for native-PCI interrupt
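The same lookup can be sketched as a small script that groups devices by the ioapic pin they report. This is only an illustration under assumptions: the pattern is derived from the two dmesg lines quoted above (the "event channel" part is treated as optional so GENERIC-style lines match too), and `devices_by_pin` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# The two dmesg lines quoted in this message (the elided lines are omitted).
DMESG = """\
pciide0: using ioapic0 pin 20, event channel 7 for native-PCI interrupt
pciide1: using ioapic0 pin 20, event channel 7 for native-PCI interrupt
"""

# Assumed line shape: "<dev>: using ioapicN pin M[, event channel K] ..."
PIN_RE = re.compile(
    r"^(?P<dev>\w+): using (?P<apic>ioapic\d+) pin (?P<pin>\d+)"
    r"(?:, event channel (?P<evtch>\d+))?"
)


def devices_by_pin(text):
    """Return {(ioapic, pin): [device, ...]} for all matching lines."""
    by_pin = defaultdict(list)
    for line in text.splitlines():
        m = PIN_RE.match(line)
        if m:
            key = (m.group("apic"), int(m.group("pin")))
            by_pin[key].append(m.group("dev"))
    return dict(by_pin)


shared = devices_by_pin(DMESG)
for (apic, pin), devs in shared.items():
    print(f"{apic} pin {pin}: {', '.join(devs)}")
```

Any pin listed with more than one device is a shared interrupt line; here both pciide controllers end up on ioapic0 pin 20.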

Regards,

Johan

PS. When booting GENERIC for comparison, I find this in dmesg:

pciide0: using ioapic0 pin 20 for native-PCI interrupt
...
pciide1: using ioapic0 pin 20 for native-PCI interrupt

I.e. the "event channel 7" part is unique to the XEN3_DOM0 kernel.


