Current-Users archive


Re: XEN3_DOM0 5.99.55 on grinds to a halt shortly after boot (repeatedly)

On 29.07.2011 23:28, Johan Ihrén wrote:
> This is 5.99.55 from about July 20th/21st.
> Let me start with the observation that this is hardware related, as
> the same exact 5.99.55 disk works like a charm in an older Core 2 Quad
> machine. But when I boot from this disk in a machine with a brand new
> Core i7 + H67 chipset it works "less well".
> If I boot the standard XEN3_DOM0 kernel I typically get to the login
> prompt and usually manage to login and give one or two commands before
> the machine basically stops. When I break into DDB it looks like this
> (typed by hand, no serial console):
> Stopped in pid 0.2 (system)
> breakpoint()
> wskbd_translate()
> wskbd_input()
> pckbd_input()
> pckbcintr()
> evtchn_do_event()
> call_evtchn_do_event()
> hypervisor_callback()
> idle_loop()
> The interesting thing is that this doesn't look bad to me. This seems
> to be mostly identical to what I'd see if I break into DDB on a
> machine that's working just fine. So what is it doing? I don't know.

You are correct.

> 1. If I boot GENERIC it doesn't hang, only XEN3_DOM0 hangs.
> Furthermore, when GENERIC is idling (no services, nothing), it is still
> seeing ~10% interrupts on CPU0. That seems to be an awful lot...
> 2. XEN3_DOM0 grinds to a halt also in single user, although it may
> take slightly longer to get there.
> 3. "Grinds to a halt" is not the same as "hangs". There is something
> going on, it is just that it has to be measured in geological time
> units.

It's very likely to be interrupt related, yes.

> * Once it happened during fsck and I decided to leave it to it. fsck
> of a 30GB filesystem took several hours (but did complete).
> * I typed "root<RETURN>" at the login prompt when the machine had
> gone catatonic and nothing happened. An hour later there was still no
> change. But the next morning there was the expected "Password:"
> prompt on the console ;-)
> Suggestions for what to try next would be much appreciated.

At the ddb(4) prompt, try "show event". You can also run "show event" once,
then "continue", then break into ddb again and run "show event" a second
time to see which counters moved in between.

If an event counter is ridiculously high (rate or total count), look up
which interrupt line it corresponds to in the "dmesg" output (also
available from the ddb prompt).
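
If the two "show event" dumps can be captured (for example over a serial
console), the comparison can be scripted. The following is only a minimal
sketch in Python: it assumes each line looks like
"evcnt type 1: <group> <name> = <count>", which may not match your kernel's
exact output, and the helper names (parse_events, busiest) and the sample
counter values are invented for illustration.

```python
import re

# Assumed line shape: "evcnt type <n>: <group> <name> = <count>".
# Adjust the pattern if the ddb output on your system differs.
LINE = re.compile(r"^evcnt type (\d+): (.+) = (\d+)$")


def parse_events(text):
    """Return {"<group> <name>": count} for every line that matches."""
    counts = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            counts[m.group(2)] = int(m.group(3))
    return counts


def busiest(before, after, seconds):
    """List (name, delta, rate per second), largest delta first."""
    rates = []
    for name, count in after.items():
        delta = count - before.get(name, 0)
        if delta > 0:
            rates.append((name, delta, delta / seconds))
    rates.sort(key=lambda r: r[1], reverse=True)
    return rates


if __name__ == "__main__":
    # Invented sample values, not taken from the machine in this thread.
    first = "evcnt type 1: ioapic0 pin 16 = 1000\n"
    second = "evcnt type 1: ioapic0 pin 16 = 601000\n"
    for name, delta, rate in busiest(parse_events(first),
                                     parse_events(second), 10.0):
        print(name, delta, rate)
```

A counter whose delta dwarfs all the others between the two dumps is the
one to match against the interrupt assignments in dmesg.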

Jean-Yves Migeon
