Port-xen archive


Re: Panic on dom0 shutdown

Hi Manuel,

On Dec 29, 2012, at 20:41 , Manuel Bouyer wrote:

> On Fri, Dec 28, 2012 at 05:36:03PM +0100, Johan Ihrén wrote:
>> Hi guys,
>> Holidays intervened for a few days...
>> On Dec 20, 2012, at 20:41 , David Laight wrote:
>>> On Thu, Dec 20, 2012 at 07:39:18PM +0100, Manuel Bouyer wrote:
>>>> On Thu, Dec 20, 2012 at 06:42:14PM +0100, Johan Ihrén wrote:
>>>>> I have a (reproducible) panic on shutdown, also recentish NetBSD 6 
>>>>> (kernel from December 16), but no raidframe. In my case it only panics on 
>>>>> "halt -p", not on "halt" (100% reproducible). The trace is similar but 
>>>>> not identical (this is by hand, no serial console available):
>>>>> kernel: page fault trap, code=0
>>>>> Stopped in pid 5800.1 (halt) at netbsd:bus_space_read_4*0x8:
>>>>> bus_space_read_4()
>>>>> Xresume_xenev6() at netbsd:Xresume_xenev6+0x47
>>>> there's something missing here, bus_space_read_4() is not called directly
>>>> by Xresume_xenev6().
>> Not questioning that, but the trace as I typed it looks like above.
>>>> Is it i386 or amd64 ? If amd64, please make sure
>>>> you still have -fno-omit-frame-pointer in kernel build options 
>>>> (makeoptions)
>> It's amd64, a standard XEN3_DOM0-kernel grabbed as a binary from a daily 
>> build from www.fr.netbsd.org. I've built a local kernel also, ensuring 
>> that -fno-omit-frame-pointer is in the makeoptions. I've made one change and 
>> that is adding
>> usb* at ehci? flags 1
>> to get the USB keyboard working during boot.
>>> You might want to disable tail-calls if you want a full trace back.
>> I assume that may be the cause of the trace not being correct. Ok, so adding 
>> "-fno-optimize-sibling-calls" to makeoptions (is that the correct way of 
>> disabling tail-calls?) causes the trace to change like this:
>> kernel: page fault trap, code=0
>> Stopped in pid 634.1 (halt) at netbsd:bus_space_read_4*0x8:
>> bus_space_read_4()
>> pirq_interrupt()
>> Xresume_xenev6() at netbsd:Xresume_xenev6+0x47
> There is probably still a tail-call happening here; pirq_interrupt() just
> calls a function pointer. There aren't many ways to know what this function
> pointer points to, and this is where the faulty bus_space_read_4() is
> happening ...

Hmm. So using "-fno-optimize-sibling-calls" is not sufficient to avoid all 
tail-calls? Is there any other way of forcing no tail-calls? Otherwise it would 
seem that while tail-calls obviously improve performance they also make 
debugging... close to impossible.
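
To make the question concrete, here is a minimal userland sketch of the pattern being discussed: a dispatcher whose last action is a call through a function pointer. This is not the kernel code; handler_fn, count_handler(), dispatch_tail(), dispatch_framed() and run_demo() are all hypothetical names made up for illustration. The second dispatcher shows one source-level workaround I would expect to keep the frame, under the assumption that the compiler only turns a call into a jump when nothing remains to be done after it.

```c
#include <stdio.h>

/* Hypothetical handler type; stands in for whatever the dispatcher calls. */
typedef int (*handler_fn)(void *);

/* Hypothetical handler: bumps a counter and returns its new value. */
int count_handler(void *arg)
{
	int *counter = arg;

	*counter += 1;
	return *counter;
}

/*
 * The indirect call is the last thing this function does, so an
 * optimizing compiler is allowed to emit a jump instead of a call.
 * When it does, dispatch_tail() leaves no frame behind for a
 * debugger to walk through.
 */
int dispatch_tail(handler_fn fn, void *arg)
{
	return fn(arg);
}

/*
 * Storing the result in a volatile local leaves work to do after
 * the call returns, so the call is no longer in tail position and
 * this function's frame has to stay live across it.
 */
int dispatch_framed(handler_fn fn, void *arg)
{
	volatile int rv = fn(arg);

	return rv;
}

/* Runs both dispatchers once and returns the final counter value. */
int run_demo(void)
{
	int counter = 0;

	dispatch_tail(count_handler, &counter);
	dispatch_framed(count_handler, &counter);
	printf("handler ran %d times\n", counter);
	return counter;
}
```

Compiling this to assembly with and without "-fno-optimize-sibling-calls" and comparing how dispatch_tail() ends (a jump versus a call followed by a return) should show whether the flag takes effect for indirect calls with a given compiler; I have not checked this against the kernel build itself.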


