Port-xen archive


Re: NetBSD DomU MP freeze under Linux Dom0



On Thu, Sep 06, 2012 at 12:57:19PM +0200, Roger Pau Monne wrote:
> Hello,
> 
> Recently I've been running some benchmarks on NetBSD to compare the
> performance of NetBSD and Linux as Dom0/DomUs (this was presented
> at XenSummit last week with Cherry G. Mathew; slides will probably be
> uploaded soon).
> 
> One of the benchmarks consisted of running build.sh inside a DomU, and
> during this test I realised that it leads to a freeze when running a
> Linux Dom0 and a NetBSD DomU with 4 vCPUs. So far I haven't been able to
> reproduce the problem without MP or with a NetBSD Dom0, which is kind of
> strange, because I would say it is not related to blkfront: I've added
> some debugging prints there, and blkfront does not seem to be the owner
> of the lock when the freeze happens. The build of NetBSD inside the DomU
> was using 8 simultaneous jobs, and it freezes to the point where I cannot
> even access ddb. I've been able to get a trace using gdbsx:
> 
> Thread 4:
> 
> #0  0xffffffff80101248 in hypercall_page ()
> #1  0x000000000000e033 in ?? ()
> #2  0x0000000000000000 in ?? ()
> 
> Thread 3:
> 
> #0  0xffffffff80130f32 in x86_pause ()
> #1  0xffffffff801f67b1 in _kernel_lock ()
> #2  0xffffffff8030b054 in bdev_strategy ()
> #3  0xffffffff803037d8 in spec_strategy ()
> #4  0xffffffff803a719a in VOP_STRATEGY ()
> #5  0xffffffff8035ff7a in ufs_strategy ()
> #6  0xffffffff803a719a in VOP_STRATEGY ()
> #7  0xffffffff8038d3fa in bwrite ()
> #8  0xffffffff803a6320 in VOP_BWRITE ()
> #9  0xffffffff80357125 in ufs_dirremove ()
> #10 0xffffffff8035dc47 in ufs_remove ()
> #11 0xffffffff803a6b53 in VOP_REMOVE ()
> #12 0xffffffff8039ac4f in do_sys_unlink ()
> #13 0xffffffff8032b044 in syscall ()
> #14 0xffffffff8010221d in Xsyscall ()
> 
> Thread 2:
> 
> #0  0xffffffff801f67b1 in _kernel_lock ()
> #1  0xffffffff8030b054 in bdev_strategy ()
> #2  0xffffffff803037d8 in spec_strategy ()
> #3  0xffffffff803a719a in VOP_STRATEGY ()
> #4  0xffffffff8035ff7a in ufs_strategy ()
> #5  0xffffffff803a719a in VOP_STRATEGY ()
> #6  0xffffffff8038d3fa in bwrite ()
> #7  0xffffffff803a6320 in VOP_BWRITE ()
> #8  0xffffffff80357125 in ufs_dirremove ()
> #9  0xffffffff8035dc47 in ufs_remove ()
> #10 0xffffffff803a6b53 in VOP_REMOVE ()
> #11 0xffffffff8039ac4f in do_sys_unlink ()
> #12 0xffffffff8032b044 in syscall ()
> #13 0xffffffff8010221d in Xsyscall ()
> 
> Thread 1:
> 
> #0  0xffffffff801f67b1 in _kernel_lock ()
> #1  0xffffffff8030b054 in bdev_strategy ()
> #2  0xffffffff803037d8 in spec_strategy ()
> #3  0xffffffff803a719a in VOP_STRATEGY ()
> #4  0xffffffff8035ff7a in ufs_strategy ()
> #5  0xffffffff803a719a in VOP_STRATEGY ()
> #6  0xffffffff8038d3fa in bwrite ()
> #7  0xffffffff803a6320 in VOP_BWRITE ()
> #8  0xffffffff80357125 in ufs_dirremove ()
> #9  0xffffffff8035dc47 in ufs_remove ()
> #10 0xffffffff803a6b53 in VOP_REMOVE ()
> #11 0xffffffff8039ac4f in do_sys_unlink ()
> #12 0xffffffff8032b044 in syscall ()
> #13 0xffffffff8010221d in Xsyscall ()
> 
> My guess is that Thread 4 is holding the lock, and that it's blocked for
> some reason that's beyond my current knowledge of NetBSD internals; the
> stack trace does not help with that.

Do you have a way to know which hypercall thread 4 is doing?
It looks like it's doing a hypercall with the kernel_lock held,
and this hypercall blocks.
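
One possible way to answer that from the trace alone: the Xen hypercall page is laid out as one 32-byte stub per hypercall number, so the offset of the program counter inside the page gives the number. The sketch below is only an illustration under stated assumptions. It assumes the hypercall_page symbol starts at the page boundary just below the address in frame #0 (this should be checked against the kernel's symbol table), the helper names are hypothetical, and the name table is a partial copy of the __HYPERVISOR_* constants from Xen's public xen.h.

```python
# Sketch: map an address inside the Xen hypercall page to a hypercall number.
# Helper names are hypothetical; they are not part of any Xen or NetBSD tool.

PAGE_SIZE = 4096
STUB_SIZE = 32  # each hypercall stub occupies 32 bytes of the hypercall page

# Partial table of __HYPERVISOR_* numbers from Xen's public xen.h.
HYPERCALL_NAMES = {
    1: "mmu_update",
    13: "multicall",
    14: "update_va_mapping",
    18: "console_io",
    20: "grant_table_op",
    24: "vcpu_op",
    26: "mmuext_op",
    29: "sched_op",
    32: "event_channel_op",
}


def hypercall_number(pc, page_base=None):
    """Return the hypercall number whose stub contains address pc."""
    if page_base is None:
        # ASSUMPTION: hypercall_page starts at the enclosing page boundary.
        page_base = pc & ~(PAGE_SIZE - 1)
    offset = pc - page_base
    if not 0 <= offset < PAGE_SIZE:
        raise ValueError("address is outside the hypercall page")
    return offset // STUB_SIZE


def describe(pc, page_base=None):
    """Return (number, name) for pc; name falls back to a reminder string."""
    nr = hypercall_number(pc, page_base)
    return nr, HYPERCALL_NAMES.get(nr, "unknown (check xen.h)")


if __name__ == "__main__":
    # Address taken from frame #0 of Thread 4 in the quoted trace.
    print(describe(0xFFFFFFFF80101248))
```

If the page-alignment assumption holds, the address in frame #0 of Thread 4 would fall in stub 18, which xen.h names console_io. Passing the real address of hypercall_page as page_base removes that assumption.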

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference