NetBSD-Bugs archive


port-xen/47057: Xen NetBSD DomU file system trash under Linux Dom0

>Number:         47057
>Category:       port-xen
>Synopsis:       Xen NetBSD DomU file system trash under Linux Dom0
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-xen-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Oct 11 17:20:00 +0000 2012
>Originator:     Roger Pau Monné
>Release:        6.0RC2
NetBSD  6.0_RC2 NetBSD 6.0_RC2 (XEN3_DOMU) #6: Wed Sep 26 18:06:29 BST 2012  
root@roger-xen:/root/obj/sys/arch/amd64/compile/XEN3_DOMU amd64
This problem might be related to 'port-xen/47056', and the root cause might
actually be the same, but I'm filing them as separate PRs until we can figure
out whether they are related.

When doing heavy IO inside a NetBSD DomU backed by a Linux Dom0 I get random
file system crashes. I've seen this with FFSv1 and FFSv2, both with WAPBL
enabled and disabled. The panics were usually about freeing an already freed
block, but I've also seen that a fresh install can sometimes end up with
corrupted files (when performing the install from the
netbsd-INSTALL_XEN3_DOMU kernel).
As with 'port-xen/47056', the easiest way to reproduce this is to build NetBSD
from sources inside a DomU backed by a multiprocessor (MP) Linux Dom0.
I'm not sure about this, but I think we have a problem with reentrancy of the
Xen event channel callback (do_hypervisor_callback in hypervisor_machdep.c).
I haven't been able to find a fix for it yet.

The right solution might be to bind all events to CPU#0 and use a
producer/consumer approach to dispatch them to different threads. This way it
will be easier to block all events while we are in the callback itself, and
then it's just a matter of calling the appropriate handler from the "consumer"
thread. We would also be sure that callbacks are never nested (i.e. we would
not have reentrant callbacks).
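
To make the idea more concrete, here is a minimal user-space sketch of the
producer/consumer dispatch, using pthreads in place of kernel primitives. It is
not NetBSD kernel code and not a proposed patch: every name in it (evt_post,
evt_consumer, run_demo, RING_SIZE, NPORTS) is hypothetical, and it does not
model binding events to CPU#0 or masking them in the hypervisor. It only shows
that when the "callback" does nothing but queue the port number, and a single
consumer thread runs the handlers, the handlers cannot nest.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 64    /* hypothetical queue depth (power of two) */
#define NPORTS    8     /* hypothetical number of event ports */

typedef void (*evt_handler_t)(unsigned port);

static struct {
    unsigned slots[RING_SIZE];
    unsigned head, tail;            /* head: next read, tail: next write */
    bool done;                      /* set when no more events will come */
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
} ring = { .lock = PTHREAD_MUTEX_INITIALIZER,
           .nonempty = PTHREAD_COND_INITIALIZER };

static evt_handler_t handlers[NPORTS];
unsigned handled[NPORTS];           /* per-port count of handled events */
static int depth;                   /* handler nesting depth, consumer only */

/* Producer: stands in for the event upcall. It only records the port. */
static bool evt_post(unsigned port)
{
    bool ok = false;

    pthread_mutex_lock(&ring.lock);
    if (ring.tail - ring.head < RING_SIZE) {
        ring.slots[ring.tail++ % RING_SIZE] = port;
        ok = true;
        pthread_cond_signal(&ring.nonempty);
    }
    pthread_mutex_unlock(&ring.lock);
    return ok;                      /* false means the queue was full */
}

/* Consumer: the only place where handlers run, one at a time. */
static void *evt_consumer(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&ring.lock);
        while (ring.head == ring.tail && !ring.done)
            pthread_cond_wait(&ring.nonempty, &ring.lock);
        if (ring.head == ring.tail) {   /* drained and shut down */
            pthread_mutex_unlock(&ring.lock);
            return NULL;
        }
        unsigned port = ring.slots[ring.head++ % RING_SIZE];
        pthread_mutex_unlock(&ring.lock);

        assert(depth == 0);             /* a handler is never re-entered */
        depth++;
        if (port < NPORTS && handlers[port] != NULL)
            handlers[port](port);
        depth--;
    }
}

static void count_handler(unsigned port) { handled[port]++; }

/* Post n events round-robin over the ports; return how many were handled. */
unsigned run_demo(unsigned n)
{
    pthread_t tid;
    unsigned total = 0;

    for (unsigned p = 0; p < NPORTS; p++) {
        handlers[p] = count_handler;
        handled[p] = 0;
    }
    ring.head = ring.tail = 0;
    ring.done = false;
    pthread_create(&tid, NULL, evt_consumer, NULL);

    for (unsigned i = 0; i < n; i++)
        while (!evt_post(i % NPORTS))
            sched_yield();              /* queue full: retry (demo only) */

    pthread_mutex_lock(&ring.lock);
    ring.done = true;
    pthread_cond_broadcast(&ring.nonempty);
    pthread_mutex_unlock(&ring.lock);
    pthread_join(tid, NULL);

    for (unsigned p = 0; p < NPORTS; p++)
        total += handled[p];
    return total;
}
```

Calling run_demo(1000) from a small main() posts 1000 events and waits for the
consumer to drain them. In a real kernel the producer side could not take a
sleeping lock or spin on a full queue like this, so that part in particular
would need a different design.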
