NetBSD-Bugs archive


Re: port-xen/57199: Pure PVH i386 guests hang on disk activity



The following reply was made to PR kern/57199; it has been noted by GNATS.

From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
To: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Cc: Brad Spencer <brad%anduin.eldar.org@localhost>, gnats-bugs%NetBSD.org@localhost,
        netbsd-bugs%NetBSD.org@localhost, gdt%lexort.com@localhost
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 21:56:31 +0200

 On Mon, Jul 15, 2024 at 05:33:17PM +0000, Taylor R Campbell wrote:
 > I can think of two ways this patch could have an impact:
 > 
 > 1. Some Xen driver relies on write-combining memory (i.e.,
 >    `prefetchable' in PCIese and bus_dmaese), or on non-temporal
 >    stores.  This seems unlikely.
 > 
 > 2. This is a single-(v)CPU system which has patched out the lock
 >    prefix in membar_sync.
 > 
 > Unless (1) is happening, I doubt there's any reason to need mfence,
 > lfence, or sfence -- except in the circumstances of (1), mfence is
 > just a more expensive version of a locked-add for store-before-load
 > ordering, and lfence and sfence are never necessary.  See, e.g., the
 > AMD memory access ordering rules table:
 > 
 > AMD64 Architecture Programmer's Manual, Volume 2: System Programming,
 > 24593--Rev. 3.38--November 2021, Sec. 7.4.2 Memory Barrier Interaction
 > with Memory Types, Table 7-3, p. 196.
 > https://web.archive.org/web/20220625040004/https://www.amd.com/system/files/TechDocs/24593.pdf#page=256
 > 
 > 
 > Is this a single-(v)CPU system?  Can you enter crash(8) or drop into
 > ddb and disassemble the membar_sync function?  I bet you'll find no
 > lock prefix there, which would explain the hangs.
 > 
 > If my hypothesis about (2) is correct, the right thing is probably
 > either to make xen_mb be an assembly stub that does
 
 It is indeed a single-vCPU system, and in PV kernels we're probably not
 running hotpatch.
 
 > 
 > 	lock
 > 	addq $0,-8(%rsp)
 > 
 > (without the membar_sync hotpatching), or to make xen_mb be inline asm
 > to do the same.
 
 I misread the Linux code in this area; mb() is not the same as smp_mb().
 Linux in fact does not use *fence instructions for virt_*mb(); it uses
 a lock addl for virt_mb() and just barrier() for virt_[rw]mb().
 
 So just adding a lock addl to xen_mb should be enough.
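 
 A rough sketch of what that could look like as inline asm, assuming
 GCC/Clang extended asm. The sketch_* names are hypothetical and are not
 the actual NetBSD definitions, and the stack offset simply follows the
 pattern quoted above. The point is that the lock prefix is emitted
 inline rather than going through membar_sync.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not the NetBSD source: a full barrier whose lock
 * prefix is emitted inline instead of going through membar_sync.
 */
static inline void
sketch_xen_mb(void)
{
#if defined(__x86_64__)
	/* Locked add of 0 to a stack slot: orders earlier stores before later loads. */
	__asm volatile("lock; addl $0,-4(%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
	__asm volatile("lock; addl $0,-4(%%esp)" ::: "memory", "cc");
#else
	__sync_synchronize();	/* non-x86 fallback so the sketch still builds */
#endif
}

/* Read/write barriers reduced to a compiler barrier, like barrier() in Linux. */
static inline void
sketch_xen_rmb(void)
{
	__asm volatile("" ::: "memory");
}

static inline void
sketch_xen_wmb(void)
{
	__asm volatile("" ::: "memory");
}

/* Hypothetical shared words, standing in for fields shared with a backend. */
static volatile uint32_t sketch_mine, sketch_theirs;

/* Store-then-load pattern: the case that needs the full barrier. */
static inline uint32_t
sketch_store_then_load(void)
{
	sketch_mine = 1;	/* publish our side */
	sketch_xen_mb();	/* store must be visible before the load below */
	return sketch_theirs;	/* then look at the other side */
}
```

 Adding 0 leaves the stack slot unchanged, so the only effect is the
 ordering implied by the locked instruction.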
 
 -- 
 Manuel Bouyer <bouyer%antioche.eu.org@localhost>
      NetBSD: 26 ans d'experience feront toujours la difference
 --
 

