pkgsrc-Changes archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
CVS commit: pkgsrc/sysutils/xenkernel411
Module Name: pkgsrc
Committed By: bouyer
Date: Thu Jul 16 09:57:17 UTC 2020
Modified Files:
pkgsrc/sysutils/xenkernel411: Makefile distinfo
Added Files:
pkgsrc/sysutils/xenkernel411/patches: patch-XSA317 patch-XSA319
patch-XSA320 patch-XSA321 patch-XSA328
Log Message:
Add patches for Xen Security Advisories XSA317, XSA319, XSA320, XSA321
and XSA328
Bump PKGREVISION
To generate a diff of this commit:
cvs rdiff -u -r1.13 -r1.14 pkgsrc/sysutils/xenkernel411/Makefile
cvs rdiff -u -r1.11 -r1.12 pkgsrc/sysutils/xenkernel411/distinfo
cvs rdiff -u -r0 -r1.1 pkgsrc/sysutils/xenkernel411/patches/patch-XSA317 \
pkgsrc/sysutils/xenkernel411/patches/patch-XSA319 \
pkgsrc/sysutils/xenkernel411/patches/patch-XSA320 \
pkgsrc/sysutils/xenkernel411/patches/patch-XSA321 \
pkgsrc/sysutils/xenkernel411/patches/patch-XSA328
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Modified files:
Index: pkgsrc/sysutils/xenkernel411/Makefile
diff -u pkgsrc/sysutils/xenkernel411/Makefile:1.13 pkgsrc/sysutils/xenkernel411/Makefile:1.14
--- pkgsrc/sysutils/xenkernel411/Makefile:1.13 Wed Apr 15 15:37:19 2020
+++ pkgsrc/sysutils/xenkernel411/Makefile Thu Jul 16 09:57:17 2020
@@ -1,7 +1,7 @@
-# $NetBSD: Makefile,v 1.13 2020/04/15 15:37:19 bouyer Exp $
+# $NetBSD: Makefile,v 1.14 2020/07/16 09:57:17 bouyer Exp $
VERSION= 4.11.3
-PKGREVISION= 2
+PKGREVISION= 3
DISTNAME= xen-${VERSION}
PKGNAME= xenkernel411-${VERSION}
CATEGORIES= sysutils
Index: pkgsrc/sysutils/xenkernel411/distinfo
diff -u pkgsrc/sysutils/xenkernel411/distinfo:1.11 pkgsrc/sysutils/xenkernel411/distinfo:1.12
--- pkgsrc/sysutils/xenkernel411/distinfo:1.11 Wed Apr 15 15:45:04 2020
+++ pkgsrc/sysutils/xenkernel411/distinfo Thu Jul 16 09:57:17 2020
@@ -1,4 +1,4 @@
-$NetBSD: distinfo,v 1.11 2020/04/15 15:45:04 bouyer Exp $
+$NetBSD: distinfo,v 1.12 2020/07/16 09:57:17 bouyer Exp $
SHA1 (xen411/xen-4.11.3.tar.gz) = 2d77152168d6f9dcea50db9cb8e3e6a0720a4a1b
RMD160 (xen411/xen-4.11.3.tar.gz) = cfb2e699842867b60d25a01963c564a6c5e580da
@@ -12,7 +12,12 @@ SHA1 (patch-XSA310) = 77b711f4b75de1d473
SHA1 (patch-XSA311) = 4d3e6cc39c2b95cb3339961271df2bc885667927
SHA1 (patch-XSA313) = b2f281d6aed1207727cd454dcb5e914c7f6fb44b
SHA1 (patch-XSA316) = 9cce683315e4c1ca6d53b578e69ae71e1db2b3eb
+SHA1 (patch-XSA317) = 3a3e7bf8f115bebaf56001afcf68c2bd501c00a5
SHA1 (patch-XSA318) = d0dcbb99ab584098aed7995a7a05d5bf4ac28d47
+SHA1 (patch-XSA319) = 4954bdc849666e1c735c3281256e4850c0594ee8
+SHA1 (patch-XSA320) = 38d84a2ded4ccacee455ba64eb3b369e5661fbfd
+SHA1 (patch-XSA321) = 5281304282a26ee252344ec26b07d25ac4ce8b54
+SHA1 (patch-XSA328) = a9b02c183a5dbfb6c0fe50824f18896fcab4a9e9
SHA1 (patch-xen_Makefile) = 465388d80de414ca3bb84faefa0f52d817e423a6
SHA1 (patch-xen_Rules.mk) = c743dc63f51fc280d529a7d9e08650292c171dac
SHA1 (patch-xen_arch_x86_Rules.mk) = 0bedfc53a128a87b6a249ae04fbdf6a053bfb70b
Added files:
Index: pkgsrc/sysutils/xenkernel411/patches/patch-XSA317
diff -u /dev/null pkgsrc/sysutils/xenkernel411/patches/patch-XSA317:1.1
--- /dev/null Thu Jul 16 09:57:17 2020
+++ pkgsrc/sysutils/xenkernel411/patches/patch-XSA317 Thu Jul 16 09:57:17 2020
@@ -0,0 +1,52 @@
+$NetBSD: patch-XSA317,v 1.1 2020/07/16 09:57:17 bouyer Exp $
+
+From aeb46e92f915f19a61d5a8a1f4b696793f64e6fb Mon Sep 17 00:00:00 2001
+From: Julien Grall <jgrall%amazon.com@localhost>
+Date: Thu, 19 Mar 2020 13:17:31 +0000
+Subject: [PATCH] xen/common: event_channel: Don't ignore error in
+ get_free_port()
+
+Currently, get_free_port() is assuming that the port has been allocated
+when evtchn_allocate_port() is not return -EBUSY.
+
+However, the function may return an error when:
+ - We exhausted all the event channels. This can happen if the limit
+ configured by the administrator for the guest ('max_event_channels'
+ in xl cfg) is higher than the ABI used by the guest. For instance,
+ if the guest is using 2L, the limit should not be higher than 4095.
+ - We cannot allocate memory (e.g Xen has not more memory).
+
+Users of get_free_port() (such as EVTCHNOP_alloc_unbound) will validly
+assuming the port was valid and will next call evtchn_from_port(). This
+will result to a crash as the memory backing the event channel structure
+is not present.
+
+Fixes: 368ae9a05fe ("xen/pvshim: forward evtchn ops between L0 Xen and L2 DomU")
+Signed-off-by: Julien Grall <jgrall%amazon.com@localhost>
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+---
+ xen/common/event_channel.c | 8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
+index e86e2bfab0..a8d182b584 100644
+--- xen/common/event_channel.c.orig
++++ xen/common/event_channel.c
+@@ -195,10 +195,10 @@ static int get_free_port(struct domain *d)
+ {
+ int rc = evtchn_allocate_port(d, port);
+
+- if ( rc == -EBUSY )
+- continue;
+-
+- return port;
++ if ( rc == 0 )
++ return port;
++ else if ( rc != -EBUSY )
++ return rc;
+ }
+
+ return -ENOSPC;
+--
+2.17.1
+
Index: pkgsrc/sysutils/xenkernel411/patches/patch-XSA319
diff -u /dev/null pkgsrc/sysutils/xenkernel411/patches/patch-XSA319:1.1
--- /dev/null Thu Jul 16 09:57:17 2020
+++ pkgsrc/sysutils/xenkernel411/patches/patch-XSA319 Thu Jul 16 09:57:17 2020
@@ -0,0 +1,29 @@
+$NetBSD: patch-XSA319,v 1.1 2020/07/16 09:57:17 bouyer Exp $
+
+From: Jan Beulich <jbeulich%suse.com@localhost>
+Subject: x86/shadow: correct an inverted conditional in dirty VRAM tracking
+
+This originally was "mfn_x(mfn) == INVALID_MFN". Make it like this
+again, taking the opportunity to also drop the unnecessary nearby
+braces.
+
+This is XSA-319.
+
+Fixes: 246a5a3377c2 ("xen: Use a typesafe to define INVALID_MFN")
+Signed-off-by: Jan Beulich <jbeulich%suse.com@localhost>
+Reviewed-by: Andrew Cooper <andrew.cooper3%citrix.com@localhost>
+
+--- xen/arch/x86/mm/shadow/common.c.orig
++++ xen/arch/x86/mm/shadow/common.c
+@@ -3252,10 +3252,8 @@ int shadow_track_dirty_vram(struct domai
+ int dirty = 0;
+ paddr_t sl1ma = dirty_vram->sl1ma[i];
+
+- if ( !mfn_eq(mfn, INVALID_MFN) )
+- {
++ if ( mfn_eq(mfn, INVALID_MFN) )
+ dirty = 1;
+- }
+ else
+ {
+ page = mfn_to_page(mfn);
Index: pkgsrc/sysutils/xenkernel411/patches/patch-XSA320
diff -u /dev/null pkgsrc/sysutils/xenkernel411/patches/patch-XSA320:1.1
--- /dev/null Thu Jul 16 09:57:17 2020
+++ pkgsrc/sysutils/xenkernel411/patches/patch-XSA320 Thu Jul 16 09:57:17 2020
@@ -0,0 +1,371 @@
+$NetBSD: patch-XSA320,v 1.1 2020/07/16 09:57:17 bouyer Exp $
+
+From: Andrew Cooper <andrew.cooper3%citrix.com@localhost>
+Subject: x86/spec-ctrl: CPUID/MSR definitions for Special Register Buffer Data Sampling
+
+This is part of XSA-320 / CVE-2020-0543
+
+Signed-off-by: Andrew Cooper <andrew.cooper3%citrix.com@localhost>
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+Acked-by: Wei Liu <wl%xen.org@localhost>
+
+diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
+index 194615bfc5..9be18ac99f 100644
+--- docs/misc/xen-command-line.markdown.orig
++++ docs/misc/xen-command-line.markdown
+@@ -489,10 +489,10 @@ accounting for hardware capabilities as enumerated via CPUID.
+
+ Currently accepted:
+
+-The Speculation Control hardware features `md-clear`, `ibrsb`, `stibp`, `ibpb`,
+-`l1d-flush` and `ssbd` are used by default if available and applicable. They can
+-be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
+-won't offer them to guests.
++The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
++`stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
++applicable. They can be ignored, e.g. `no-ibrsb`, at which point Xen won't
++use them itself, and won't offer them to guests.
+
+ ### cpuid\_mask\_cpu (AMD only)
+ > `= fam_0f_rev_c | fam_0f_rev_d | fam_0f_rev_e | fam_0f_rev_f | fam_0f_rev_g | fam_10_rev_b | fam_10_rev_c | fam_11_rev_b`
+diff --git a/tools/libxl/libxl_cpuid.c b/tools/libxl/libxl_cpuid.c
+index 5a1702d703..1235c8b91e 100644
+--- tools/libxl/libxl_cpuid.c.orig
++++ tools/libxl/libxl_cpuid.c
+@@ -202,6 +202,7 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str)
+
+ {"avx512-4vnniw",0x00000007, 0, CPUID_REG_EDX, 2, 1},
+ {"avx512-4fmaps",0x00000007, 0, CPUID_REG_EDX, 3, 1},
++ {"srbds-ctrl", 0x00000007, 0, CPUID_REG_EDX, 9, 1},
+ {"md-clear", 0x00000007, 0, CPUID_REG_EDX, 10, 1},
+ {"ibrsb", 0x00000007, 0, CPUID_REG_EDX, 26, 1},
+ {"stibp", 0x00000007, 0, CPUID_REG_EDX, 27, 1},
+diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
+index 4c9af6b7f0..8fb54c3001 100644
+--- tools/misc/xen-cpuid.c.orig
++++ tools/misc/xen-cpuid.c
+@@ -142,6 +142,7 @@ static const char *str_7d0[32] =
+ {
+ [ 2] = "avx512_4vnniw", [ 3] = "avx512_4fmaps",
+
++ /* 8 */ [ 9] = "srbds-ctrl",
+ [10] = "md-clear",
+ /* 12 */ [13] = "tsx-force-abort",
+
+diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
+index 04aefa555d..b8e5b6fe67 100644
+--- xen/arch/x86/cpuid.c.orig
++++ xen/arch/x86/cpuid.c
+@@ -58,6 +58,11 @@ static int __init parse_xen_cpuid(const char *s)
+ if ( !val )
+ setup_clear_cpu_cap(X86_FEATURE_SSBD);
+ }
++ else if ( (val = parse_boolean("srbds-ctrl", s, ss)) >= 0 )
++ {
++ if ( !val )
++ setup_clear_cpu_cap(X86_FEATURE_SRBDS_CTRL);
++ }
+ else
+ rc = -EINVAL;
+
+diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
+index ccb316c547..256e58d82b 100644
+--- xen/arch/x86/msr.c.orig
++++ xen/arch/x86/msr.c
+@@ -154,6 +154,7 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
+ /* Write-only */
+ case MSR_TSX_FORCE_ABORT:
+ case MSR_TSX_CTRL:
++ case MSR_MCU_OPT_CTRL:
+ /* Not offered to guests. */
+ goto gp_fault;
+
+@@ -243,6 +244,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
+ /* Read-only */
+ case MSR_TSX_FORCE_ABORT:
+ case MSR_TSX_CTRL:
++ case MSR_MCU_OPT_CTRL:
+ /* Not offered to guests. */
+ goto gp_fault;
+
+diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
+index ab196b156d..94ab8dd786 100644
+--- xen/arch/x86/spec_ctrl.c.orig
++++ xen/arch/x86/spec_ctrl.c
+@@ -365,12 +365,13 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
+ printk("Speculative mitigation facilities:\n");
+
+ /* Hardware features which pertain to speculative mitigations. */
+- printk(" Hardware features:%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
++ printk(" Hardware features:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
+ (_7d0 & cpufeat_mask(X86_FEATURE_IBRSB)) ? " IBRS/IBPB" : "",
+ (_7d0 & cpufeat_mask(X86_FEATURE_STIBP)) ? " STIBP" : "",
+ (_7d0 & cpufeat_mask(X86_FEATURE_L1D_FLUSH)) ? " L1D_FLUSH" : "",
+ (_7d0 & cpufeat_mask(X86_FEATURE_SSBD)) ? " SSBD" : "",
+ (_7d0 & cpufeat_mask(X86_FEATURE_MD_CLEAR)) ? " MD_CLEAR" : "",
++ (_7d0 & cpufeat_mask(X86_FEATURE_SRBDS_CTRL)) ? " SRBDS_CTRL" : "",
+ (e8b & cpufeat_mask(X86_FEATURE_IBPB)) ? " IBPB" : "",
+ (caps & ARCH_CAPS_IBRS_ALL) ? " IBRS_ALL" : "",
+ (caps & ARCH_CAPS_RDCL_NO) ? " RDCL_NO" : "",
+diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
+index 1761a01f1f..480d1d8102 100644
+--- xen/include/asm-x86/msr-index.h.orig
++++ xen/include/asm-x86/msr-index.h
+@@ -177,6 +177,9 @@
+ #define MSR_IA32_VMX_TRUE_ENTRY_CTLS 0x490
+ #define MSR_IA32_VMX_VMFUNC 0x491
+
++#define MSR_MCU_OPT_CTRL 0x00000123
++#define MCU_OPT_CTRL_RNGDS_MITG_DIS (_AC(1, ULL) << 0)
++
+ /* K7/K8 MSRs. Not complete. See the architecture manual for a more
+ complete list. */
+ #define MSR_K7_EVNTSEL0 0xc0010000
+diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
+index a14d8a7013..9d210e74a0 100644
+--- xen/include/public/arch-x86/cpufeatureset.h.orig
++++ xen/include/public/arch-x86/cpufeatureset.h
+@@ -242,6 +242,7 @@ XEN_CPUFEATURE(IBPB, 8*32+12) /*A IBPB support only (no IBRS, used by
+ /* Intel-defined CPU features, CPUID level 0x00000007:0.edx, word 9 */
+ XEN_CPUFEATURE(AVX512_4VNNIW, 9*32+ 2) /*A AVX512 Neural Network Instructions */
+ XEN_CPUFEATURE(AVX512_4FMAPS, 9*32+ 3) /*A AVX512 Multiply Accumulation Single Precision */
++XEN_CPUFEATURE(SRBDS_CTRL, 9*32+ 9) /* MSR_MCU_OPT_CTRL and RNGDS_MITG_DIS. */
+ XEN_CPUFEATURE(MD_CLEAR, 9*32+10) /*A VERW clears microarchitectural buffers */
+ XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */
+ XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */
+From: Andrew Cooper <andrew.cooper3%citrix.com@localhost>
+Subject: x86/spec-ctrl: Mitigate the Special Register Buffer Data Sampling sidechannel
+
+See patch documentation and comments.
+
+This is part of XSA-320 / CVE-2020-0543
+
+Signed-off-by: Andrew Cooper <andrew.cooper3%citrix.com@localhost>
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
+index 9be18ac99f..3356e59fee 100644
+--- docs/misc/xen-command-line.markdown.orig
++++ docs/misc/xen-command-line.markdown
+@@ -1858,7 +1858,7 @@ false disable the quirk workaround, which is also the default.
+ ### spec-ctrl (x86)
+ > `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb,md-clear}=<bool>,
+ > bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
+-> l1d-flush}=<bool> ]`
++> l1d-flush,srb-lock}=<bool> ]`
+
+ Controls for speculative execution sidechannel mitigations. By default, Xen
+ will pick the most appropriate mitigations based on compiled in support,
+@@ -1930,6 +1930,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
+ use. By default, Xen will enable this mitigation on hardware believed to be
+ vulnerable to L1TF.
+
++On hardware supporting SRBDS_CTRL, the `srb-lock=` option can be used to force
++or prevent Xen from protect the Special Register Buffer from leaking stale
++data. By default, Xen will enable this mitigation, except on parts where MDS
++is fixed and TAA is fixed/mitigated (in which case, there is believed to be no
++way for an attacker to obtain the stale data).
++
+ ### sync\_console
+ > `= <boolean>`
+
+diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
+index 4c12794809..30e1bd5cd3 100644
+--- xen/arch/x86/acpi/power.c.orig
++++ xen/arch/x86/acpi/power.c
+@@ -266,6 +266,9 @@ static int enter_state(u32 state)
+ ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);
+ spec_ctrl_exit_idle(ci);
+
++ if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
++ wrmsrl(MSR_MCU_OPT_CTRL, default_xen_mcu_opt_ctrl);
++
+ done:
+ spin_debug_enable();
+ local_irq_restore(flags);
+diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
+index 0887806e85..d24d215946 100644
+--- xen/arch/x86/smpboot.c.orig
++++ xen/arch/x86/smpboot.c
+@@ -369,12 +369,14 @@ void start_secondary(void *unused)
+ microcode_resume_cpu(cpu);
+
+ /*
+- * If MSR_SPEC_CTRL is available, apply Xen's default setting and discard
+- * any firmware settings. Note: MSR_SPEC_CTRL may only become available
+- * after loading microcode.
++ * If any speculative control MSRs are available, apply Xen's default
++ * settings. Note: These MSRs may only become available after loading
++ * microcode.
+ */
+ if ( boot_cpu_has(X86_FEATURE_IBRSB) )
+ wrmsrl(MSR_SPEC_CTRL, default_xen_spec_ctrl);
++ if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
++ wrmsrl(MSR_MCU_OPT_CTRL, default_xen_mcu_opt_ctrl);
+
+ tsx_init(); /* Needs microcode. May change HLE/RTM feature bits. */
+
+diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
+index 94ab8dd786..a306d10c34 100644
+--- xen/arch/x86/spec_ctrl.c.orig
++++ xen/arch/x86/spec_ctrl.c
+@@ -63,6 +63,9 @@ static unsigned int __initdata l1d_maxphysaddr;
+ static bool __initdata cpu_has_bug_msbds_only; /* => minimal HT impact. */
+ static bool __initdata cpu_has_bug_mds; /* Any other M{LP,SB,FB}DS combination. */
+
++static int8_t __initdata opt_srb_lock = -1;
++uint64_t __read_mostly default_xen_mcu_opt_ctrl;
++
+ static int __init parse_bti(const char *s)
+ {
+ const char *ss;
+@@ -166,6 +169,7 @@ static int __init parse_spec_ctrl(const char *s)
+ opt_ibpb = false;
+ opt_ssbd = false;
+ opt_l1d_flush = 0;
++ opt_srb_lock = 0;
+ }
+ else if ( val > 0 )
+ rc = -EINVAL;
+@@ -231,6 +235,8 @@ static int __init parse_spec_ctrl(const char *s)
+ opt_eager_fpu = val;
+ else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 )
+ opt_l1d_flush = val;
++ else if ( (val = parse_boolean("srb-lock", s, ss)) >= 0 )
++ opt_srb_lock = val;
+ else
+ rc = -EINVAL;
+
+@@ -394,7 +400,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
+ "\n");
+
+ /* Settings for Xen's protection, irrespective of guests. */
+- printk(" Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s%s, Other:%s%s%s\n",
++ printk(" Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s%s, Other:%s%s%s%s\n",
+ thunk == THUNK_NONE ? "N/A" :
+ thunk == THUNK_RETPOLINE ? "RETPOLINE" :
+ thunk == THUNK_LFENCE ? "LFENCE" :
+@@ -405,6 +411,8 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
+ (default_xen_spec_ctrl & SPEC_CTRL_SSBD) ? " SSBD+" : " SSBD-",
+ !(caps & ARCH_CAPS_TSX_CTRL) ? "" :
+ (opt_tsx & 1) ? " TSX+" : " TSX-",
++ !boot_cpu_has(X86_FEATURE_SRBDS_CTRL) ? "" :
++ opt_srb_lock ? " SRB_LOCK+" : " SRB_LOCK-",
+ opt_ibpb ? " IBPB" : "",
+ opt_l1d_flush ? " L1D_FLUSH" : "",
+ opt_md_clear_pv || opt_md_clear_hvm ? " VERW" : "");
+@@ -1196,6 +1204,34 @@ void __init init_speculation_mitigations(void)
+ tsx_init();
+ }
+
++ /* Calculate suitable defaults for MSR_MCU_OPT_CTRL */
++ if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
++ {
++ uint64_t val;
++
++ rdmsrl(MSR_MCU_OPT_CTRL, val);
++
++ /*
++ * On some SRBDS-affected hardware, it may be safe to relax srb-lock
++ * by default.
++ *
++ * On parts which enumerate MDS_NO and not TAA_NO, TSX is the only way
++ * to access the Fill Buffer. If TSX isn't available (inc. SKU
++ * reasons on some models), or TSX is explicitly disabled, then there
++ * is no need for the extra overhead to protect RDRAND/RDSEED.
++ */
++ if ( opt_srb_lock == -1 &&
++ (caps & (ARCH_CAPS_MDS_NO|ARCH_CAPS_TAA_NO)) == ARCH_CAPS_MDS_NO &&
++ (!cpu_has_hle || ((caps & ARCH_CAPS_TSX_CTRL) && opt_tsx == 0)) )
++ opt_srb_lock = 0;
++
++ val &= ~MCU_OPT_CTRL_RNGDS_MITG_DIS;
++ if ( !opt_srb_lock )
++ val |= MCU_OPT_CTRL_RNGDS_MITG_DIS;
++
++ default_xen_mcu_opt_ctrl = val;
++ }
++
+ print_details(thunk, caps);
+
+ /*
+@@ -1227,6 +1263,9 @@ void __init init_speculation_mitigations(void)
+
+ wrmsrl(MSR_SPEC_CTRL, bsp_delay_spec_ctrl ? 0 : default_xen_spec_ctrl);
+ }
++
++ if ( boot_cpu_has(X86_FEATURE_SRBDS_CTRL) )
++ wrmsrl(MSR_MCU_OPT_CTRL, default_xen_mcu_opt_ctrl);
+ }
+
+ static void __init __maybe_unused build_assertions(void)
+diff --git a/xen/include/asm-x86/spec_ctrl.h b/xen/include/asm-x86/spec_ctrl.h
+index 333d180b7e..bf10d2ce5c 100644
+--- xen/include/asm-x86/spec_ctrl.h.orig
++++ xen/include/asm-x86/spec_ctrl.h
+@@ -46,6 +46,8 @@ extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu;
+ */
+ extern paddr_t l1tf_addr_mask, l1tf_safe_maddr;
+
++extern uint64_t default_xen_mcu_opt_ctrl;
++
+ static inline void init_shadow_spec_ctrl_state(void)
+ {
+ struct cpu_info *info = get_cpu_info();
+From: Andrew Cooper <andrew.cooper3%citrix.com@localhost>
+Subject: x86/spec-ctrl: Allow the RDRAND/RDSEED features to be hidden
+
+RDRAND/RDSEED can be hidden using cpuid= to mitigate SRBDS if microcode
+isn't available.
+
+This is part of XSA-320 / CVE-2020-0543.
+
+Signed-off-by: Andrew Cooper <andrew.cooper3%citrix.com@localhost>
+Acked-by: Julien Grall <jgrall%amazon.com@localhost>
+
+diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
+index 3356e59fee..ac397e7de0 100644
+--- docs/misc/xen-command-line.markdown.orig
++++ docs/misc/xen-command-line.markdown
+@@ -487,12 +487,18 @@ choice of `dom0-kernel` is deprecated and not supported by all Dom0 kernels.
+ This option allows for fine tuning of the facilities Xen will use, after
+ accounting for hardware capabilities as enumerated via CPUID.
+
++Unless otherwise noted, options only have any effect in their negative form,
++to hide the named feature(s). Ignoring a feature using this mechanism will
++cause Xen not to use the feature, nor offer them as usable to guests.
++
+ Currently accepted:
+
+ The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
+ `stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
+-applicable. They can be ignored, e.g. `no-ibrsb`, at which point Xen won't
+-use them itself, and won't offer them to guests.
++applicable. They can all be ignored.
++
++`rdrand` and `rdseed` can be ignored, as a mitigation to XSA-320 /
++CVE-2020-0543.
+
+ ### cpuid\_mask\_cpu (AMD only)
+ > `= fam_0f_rev_c | fam_0f_rev_d | fam_0f_rev_e | fam_0f_rev_f | fam_0f_rev_g | fam_10_rev_b | fam_10_rev_c | fam_11_rev_b`
+diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
+index b8e5b6fe67..78d08dbb32 100644
+--- xen/arch/x86/cpuid.c.orig
++++ xen/arch/x86/cpuid.c
+@@ -63,6 +63,16 @@ static int __init parse_xen_cpuid(const char *s)
+ if ( !val )
+ setup_clear_cpu_cap(X86_FEATURE_SRBDS_CTRL);
+ }
++ else if ( (val = parse_boolean("rdrand", s, ss)) >= 0 )
++ {
++ if ( !val )
++ setup_clear_cpu_cap(X86_FEATURE_RDRAND);
++ }
++ else if ( (val = parse_boolean("rdseed", s, ss)) >= 0 )
++ {
++ if ( !val )
++ setup_clear_cpu_cap(X86_FEATURE_RDSEED);
++ }
+ else
+ rc = -EINVAL;
+
Index: pkgsrc/sysutils/xenkernel411/patches/patch-XSA321
diff -u /dev/null pkgsrc/sysutils/xenkernel411/patches/patch-XSA321:1.1
--- /dev/null Thu Jul 16 09:57:17 2020
+++ pkgsrc/sysutils/xenkernel411/patches/patch-XSA321 Thu Jul 16 09:57:17 2020
@@ -0,0 +1,586 @@
+$NetBSD: patch-XSA321,v 1.1 2020/07/16 09:57:17 bouyer Exp $
+
+From: Jan Beulich <jbeulich%suse.com@localhost>
+Subject: vtd: improve IOMMU TLB flush
+
+Do not limit PSI flushes to order 0 pages, in order to avoid doing a
+full TLB flush if the passed in page has an order greater than 0 and
+is aligned. Should increase the performance of IOMMU TLB flushes when
+dealing with page orders greater than 0.
+
+This is part of XSA-321.
+
+Signed-off-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/drivers/passthrough/vtd/iommu.c.orig
++++ xen/drivers/passthrough/vtd/iommu.c
+@@ -612,13 +612,14 @@ static int __must_check iommu_flush_iotl
+ if ( iommu_domid == -1 )
+ continue;
+
+- if ( page_count != 1 || gfn == gfn_x(INVALID_GFN) )
++ if ( !page_count || (page_count & (page_count - 1)) ||
++ gfn == gfn_x(INVALID_GFN) || !IS_ALIGNED(gfn, page_count) )
+ rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
+ 0, flush_dev_iotlb);
+ else
+ rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
+ (paddr_t)gfn << PAGE_SHIFT_4K,
+- PAGE_ORDER_4K,
++ get_order_from_pages(page_count),
+ !dma_old_pte_present,
+ flush_dev_iotlb);
+
+From: <security%xenproject.org@localhost>
+Subject: vtd: prune (and rename) cache flush functions
+
+Rename __iommu_flush_cache to iommu_sync_cache and remove
+iommu_flush_cache_page. Also remove the iommu_flush_cache_entry
+wrapper and just use iommu_sync_cache instead. Note the _entry suffix
+was meaningless as the wrapper was already taking a size parameter in
+bytes. While there also constify the addr parameter.
+
+No functional change intended.
+
+This is part of XSA-321.
+
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/drivers/passthrough/vtd/extern.h.orig
++++ xen/drivers/passthrough/vtd/extern.h
+@@ -37,8 +37,7 @@ void disable_qinval(struct iommu *iommu)
+ int enable_intremap(struct iommu *iommu, int eim);
+ void disable_intremap(struct iommu *iommu);
+
+-void iommu_flush_cache_entry(void *addr, unsigned int size);
+-void iommu_flush_cache_page(void *addr, unsigned long npages);
++void iommu_sync_cache(const void *addr, unsigned int size);
+ int iommu_alloc(struct acpi_drhd_unit *drhd);
+ void iommu_free(struct acpi_drhd_unit *drhd);
+
+--- xen/drivers/passthrough/vtd/intremap.c.orig
++++ xen/drivers/passthrough/vtd/intremap.c
+@@ -231,7 +231,7 @@ static void free_remap_entry(struct iomm
+ iremap_entries, iremap_entry);
+
+ update_irte(iommu, iremap_entry, &new_ire, false);
+- iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
++ iommu_sync_cache(iremap_entry, sizeof(*iremap_entry));
+ iommu_flush_iec_index(iommu, 0, index);
+
+ unmap_vtd_domain_page(iremap_entries);
+@@ -403,7 +403,7 @@ static int ioapic_rte_to_remap_entry(str
+ }
+
+ update_irte(iommu, iremap_entry, &new_ire, !init);
+- iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
++ iommu_sync_cache(iremap_entry, sizeof(*iremap_entry));
+ iommu_flush_iec_index(iommu, 0, index);
+
+ unmap_vtd_domain_page(iremap_entries);
+@@ -694,7 +694,7 @@ static int msi_msg_to_remap_entry(
+ update_irte(iommu, iremap_entry, &new_ire, msi_desc->irte_initialized);
+ msi_desc->irte_initialized = true;
+
+- iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
++ iommu_sync_cache(iremap_entry, sizeof(*iremap_entry));
+ iommu_flush_iec_index(iommu, 0, index);
+
+ unmap_vtd_domain_page(iremap_entries);
+--- xen/drivers/passthrough/vtd/iommu.c.orig
++++ xen/drivers/passthrough/vtd/iommu.c
+@@ -158,7 +158,8 @@ static void __init free_intel_iommu(stru
+ }
+
+ static int iommus_incoherent;
+-static void __iommu_flush_cache(void *addr, unsigned int size)
++
++void iommu_sync_cache(const void *addr, unsigned int size)
+ {
+ int i;
+ static unsigned int clflush_size = 0;
+@@ -173,16 +174,6 @@ static void __iommu_flush_cache(void *ad
+ cacheline_flush((char *)addr + i);
+ }
+
+-void iommu_flush_cache_entry(void *addr, unsigned int size)
+-{
+- __iommu_flush_cache(addr, size);
+-}
+-
+-void iommu_flush_cache_page(void *addr, unsigned long npages)
+-{
+- __iommu_flush_cache(addr, PAGE_SIZE * npages);
+-}
+-
+ /* Allocate page table, return its machine address */
+ u64 alloc_pgtable_maddr(struct acpi_drhd_unit *drhd, unsigned long npages)
+ {
+@@ -207,7 +198,7 @@ u64 alloc_pgtable_maddr(struct acpi_drhd
+ vaddr = __map_domain_page(cur_pg);
+ memset(vaddr, 0, PAGE_SIZE);
+
+- iommu_flush_cache_page(vaddr, 1);
++ iommu_sync_cache(vaddr, PAGE_SIZE);
+ unmap_domain_page(vaddr);
+ cur_pg++;
+ }
+@@ -242,7 +233,7 @@ static u64 bus_to_context_maddr(struct i
+ }
+ set_root_value(*root, maddr);
+ set_root_present(*root);
+- iommu_flush_cache_entry(root, sizeof(struct root_entry));
++ iommu_sync_cache(root, sizeof(struct root_entry));
+ }
+ maddr = (u64) get_context_addr(*root);
+ unmap_vtd_domain_page(root_entries);
+@@ -300,7 +291,7 @@ static u64 addr_to_dma_page_maddr(struct
+ */
+ dma_set_pte_readable(*pte);
+ dma_set_pte_writable(*pte);
+- iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
++ iommu_sync_cache(pte, sizeof(struct dma_pte));
+ }
+
+ if ( level == 2 )
+@@ -674,7 +665,7 @@ static int __must_check dma_pte_clear_on
+
+ dma_clear_pte(*pte);
+ spin_unlock(&hd->arch.mapping_lock);
+- iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
++ iommu_sync_cache(pte, sizeof(struct dma_pte));
+
+ if ( !this_cpu(iommu_dont_flush_iotlb) )
+ rc = iommu_flush_iotlb_pages(domain, addr >> PAGE_SHIFT_4K, 1);
+@@ -716,7 +707,7 @@ static void iommu_free_page_table(struct
+ iommu_free_pagetable(dma_pte_addr(*pte), next_level);
+
+ dma_clear_pte(*pte);
+- iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
++ iommu_sync_cache(pte, sizeof(struct dma_pte));
+ }
+
+ unmap_vtd_domain_page(pt_vaddr);
+@@ -1449,7 +1440,7 @@ int domain_context_mapping_one(
+ context_set_address_width(*context, agaw);
+ context_set_fault_enable(*context);
+ context_set_present(*context);
+- iommu_flush_cache_entry(context, sizeof(struct context_entry));
++ iommu_sync_cache(context, sizeof(struct context_entry));
+ spin_unlock(&iommu->lock);
+
+ /* Context entry was previously non-present (with domid 0). */
+@@ -1602,7 +1593,7 @@ int domain_context_unmap_one(
+
+ context_clear_present(*context);
+ context_clear_entry(*context);
+- iommu_flush_cache_entry(context, sizeof(struct context_entry));
++ iommu_sync_cache(context, sizeof(struct context_entry));
+
+ iommu_domid= domain_iommu_domid(domain, iommu);
+ if ( iommu_domid == -1 )
+@@ -1828,7 +1819,7 @@ static int __must_check intel_iommu_map_
+
+ *pte = new;
+
+- iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
++ iommu_sync_cache(pte, sizeof(struct dma_pte));
+ spin_unlock(&hd->arch.mapping_lock);
+ unmap_vtd_domain_page(page);
+
+@@ -1862,7 +1853,7 @@ int iommu_pte_flush(struct domain *d, u6
+ int iommu_domid;
+ int rc = 0;
+
+- iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
++ iommu_sync_cache(pte, sizeof(struct dma_pte));
+
+ for_each_drhd_unit ( drhd )
+ {
+From: <security%xenproject.org@localhost>
+Subject: x86/iommu: introduce a cache sync hook
+
+The hook is only implemented for VT-d and it uses the already existing
+iommu_sync_cache function present in VT-d code. The new hook is
+added so that the cache can be flushed by code outside of VT-d when
+using shared page tables.
+
+Note that alloc_pgtable_maddr must use the now locally defined
+sync_cache function, because IOMMU ops are not yet setup the first
+time the function gets called during IOMMU initialization.
+
+No functional change intended.
+
+This is part of XSA-321.
+
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/drivers/passthrough/vtd/extern.h.orig
++++ xen/drivers/passthrough/vtd/extern.h
+@@ -37,7 +37,6 @@ void disable_qinval(struct iommu *iommu)
+ int enable_intremap(struct iommu *iommu, int eim);
+ void disable_intremap(struct iommu *iommu);
+
+-void iommu_sync_cache(const void *addr, unsigned int size);
+ int iommu_alloc(struct acpi_drhd_unit *drhd);
+ void iommu_free(struct acpi_drhd_unit *drhd);
+
+--- xen/drivers/passthrough/vtd/iommu.c.orig
++++ xen/drivers/passthrough/vtd/iommu.c
+@@ -159,7 +159,7 @@ static void __init free_intel_iommu(stru
+
+ static int iommus_incoherent;
+
+-void iommu_sync_cache(const void *addr, unsigned int size)
++static void sync_cache(const void *addr, unsigned int size)
+ {
+ int i;
+ static unsigned int clflush_size = 0;
+@@ -198,7 +198,7 @@ u64 alloc_pgtable_maddr(struct acpi_drhd
+ vaddr = __map_domain_page(cur_pg);
+ memset(vaddr, 0, PAGE_SIZE);
+
+- iommu_sync_cache(vaddr, PAGE_SIZE);
++ sync_cache(vaddr, PAGE_SIZE);
+ unmap_domain_page(vaddr);
+ cur_pg++;
+ }
+@@ -2760,6 +2760,7 @@ const struct iommu_ops intel_iommu_ops =
+ .iotlb_flush_all = iommu_flush_iotlb_all,
+ .get_reserved_device_memory = intel_iommu_get_reserved_device_memory,
+ .dump_p2m_table = vtd_dump_p2m_table,
++ .sync_cache = sync_cache,
+ };
+
+ /*
+--- xen/include/asm-x86/iommu.h.orig
++++ xen/include/asm-x86/iommu.h
+@@ -98,6 +98,13 @@ extern bool untrusted_msi;
+ int pi_update_irte(const struct pi_desc *pi_desc, const struct pirq *pirq,
+ const uint8_t gvec);
+
++#define iommu_sync_cache(addr, size) ({ \
++ const struct iommu_ops *ops = iommu_get_ops(); \
++ \
++ if ( ops->sync_cache ) \
++ ops->sync_cache(addr, size); \
++})
++
+ #endif /* !__ARCH_X86_IOMMU_H__ */
+ /*
+ * Local variables:
+--- xen/include/xen/iommu.h.orig
++++ xen/include/xen/iommu.h
+@@ -161,6 +161,7 @@ struct iommu_ops {
+ void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
+ unsigned int (*read_apic_from_ire)(unsigned int apic, unsigned int reg);
+ int (*setup_hpet_msi)(struct msi_desc *);
++ void (*sync_cache)(const void *addr, unsigned int size);
+ #endif /* CONFIG_X86 */
+ int __must_check (*suspend)(void);
+ void (*resume)(void);
+From: <security%xenproject.org@localhost>
+Subject: vtd: don't assume addresses are aligned in sync_cache
+
+Current code in sync_cache assume that the address passed in is
+aligned to a cache line size. Fix the code to support passing in
+arbitrary addresses not necessarily aligned to a cache line size.
+
+This is part of XSA-321.
+
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/drivers/passthrough/vtd/iommu.c.orig
++++ xen/drivers/passthrough/vtd/iommu.c
+@@ -161,8 +161,8 @@ static int iommus_incoherent;
+
+ static void sync_cache(const void *addr, unsigned int size)
+ {
+- int i;
+- static unsigned int clflush_size = 0;
++ static unsigned long clflush_size = 0;
++ const void *end = addr + size;
+
+ if ( !iommus_incoherent )
+ return;
+@@ -170,8 +170,9 @@ static void sync_cache(const void *addr,
+ if ( clflush_size == 0 )
+ clflush_size = get_cache_line_size();
+
+- for ( i = 0; i < size; i += clflush_size )
+- cacheline_flush((char *)addr + i);
++ addr -= (unsigned long)addr & (clflush_size - 1);
++ for ( ; addr < end; addr += clflush_size )
++ cacheline_flush((char *)addr);
+ }
+
+ /* Allocate page table, return its machine address */
+From: <security%xenproject.org@localhost>
+Subject: x86/alternative: introduce alternative_2
+
+It's based on alternative_io_2 without inputs or outputs but with an
+added memory clobber.
+
+This is part of XSA-321.
+
+Acked-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/include/asm-x86/alternative.h.orig
++++ xen/include/asm-x86/alternative.h
+@@ -113,6 +113,11 @@ extern void alternative_instructions(voi
+ #define alternative(oldinstr, newinstr, feature) \
+ asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
+
++#define alternative_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
++ asm volatile (ALTERNATIVE_2(oldinstr, newinstr1, feature1, \
++ newinstr2, feature2) \
++ : : : "memory")
++
+ /*
+ * Alternative inline assembly with input.
+ *
+From: <security%xenproject.org@localhost>
+Subject: vtd: optimize CPU cache sync
+
+Some VT-d IOMMUs are non-coherent, which requires a cache write back
+in order for the changes made by the CPU to be visible to the IOMMU.
+This cache write back was unconditionally done using clflush, but there are
+other more efficient instructions to do so, hence implement support
+for them using the alternative framework.
+
+This is part of XSA-321.
+
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/drivers/passthrough/vtd/extern.h.orig
++++ xen/drivers/passthrough/vtd/extern.h
+@@ -63,7 +63,6 @@ int __must_check qinval_device_iotlb_syn
+ u16 did, u16 size, u64 addr);
+
+ unsigned int get_cache_line_size(void);
+-void cacheline_flush(char *);
+ void flush_all_cache(void);
+
+ u64 alloc_pgtable_maddr(struct acpi_drhd_unit *drhd, unsigned long npages);
+--- xen/drivers/passthrough/vtd/iommu.c.orig
++++ xen/drivers/passthrough/vtd/iommu.c
+@@ -31,6 +31,7 @@
+ #include <xen/pci_regs.h>
+ #include <xen/keyhandler.h>
+ #include <asm/msi.h>
++#include <asm/nops.h>
+ #include <asm/irq.h>
+ #include <asm/hvm/vmx/vmx.h>
+ #include <asm/p2m.h>
+@@ -172,7 +173,42 @@ static void sync_cache(const void *addr,
+
+ addr -= (unsigned long)addr & (clflush_size - 1);
+ for ( ; addr < end; addr += clflush_size )
+- cacheline_flush((char *)addr);
++/*
++ * The arguments to a macro must not include preprocessor directives. Doing so
++ * results in undefined behavior, so we have to create some defines here in
++ * order to avoid it.
++ */
++#if defined(HAVE_AS_CLWB)
++# define CLWB_ENCODING "clwb %[p]"
++#elif defined(HAVE_AS_XSAVEOPT)
++# define CLWB_ENCODING "data16 xsaveopt %[p]" /* clwb */
++#else
++# define CLWB_ENCODING ".byte 0x66, 0x0f, 0xae, 0x30" /* clwb (%%rax) */
++#endif
++
++#define BASE_INPUT(addr) [p] "m" (*(const char *)(addr))
++#if defined(HAVE_AS_CLWB) || defined(HAVE_AS_XSAVEOPT)
++# define INPUT BASE_INPUT
++#else
++# define INPUT(addr) "a" (addr), BASE_INPUT(addr)
++#endif
++ /*
++ * Note regarding the use of NOP_DS_PREFIX: it's faster to do a clflush
++ * + prefix than a clflush + nop, and hence the prefix is added instead
++ * of letting the alternative framework fill the gap by appending nops.
++ */
++ alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %[p]",
++ "data16 clflush %[p]", /* clflushopt */
++ X86_FEATURE_CLFLUSHOPT,
++ CLWB_ENCODING,
++ X86_FEATURE_CLWB, /* no outputs */,
++ INPUT(addr));
++#undef INPUT
++#undef BASE_INPUT
++#undef CLWB_ENCODING
++
++ alternative_2("", "sfence", X86_FEATURE_CLFLUSHOPT,
++ "sfence", X86_FEATURE_CLWB);
+ }
+
+ /* Allocate page table, return its machine address */
+--- xen/drivers/passthrough/vtd/x86/vtd.c.orig
++++ xen/drivers/passthrough/vtd/x86/vtd.c
+@@ -53,11 +53,6 @@ unsigned int get_cache_line_size(void)
+ return ((cpuid_ebx(1) >> 8) & 0xff) * 8;
+ }
+
+-void cacheline_flush(char * addr)
+-{
+- clflush(addr);
+-}
+-
+ void flush_all_cache()
+ {
+ wbinvd();
+From: <security%xenproject.org@localhost>
+Subject: x86/ept: flush cache when modifying PTEs and sharing page tables
+
+Modifications made to the page tables by EPT code need to be written
+to memory when the page tables are shared with the IOMMU, as Intel
+IOMMUs can be non-coherent and thus require changes to be written to
+memory in order to be visible to the IOMMU.
+
+In order to achieve this make sure data is written back to memory
+after writing an EPT entry when the recalc bit is not set in
+atomic_write_ept_entry. If such bit is set, the entry will be
+adjusted and atomic_write_ept_entry will be called a second time
+without the recalc bit set. Note that when splitting a super page the
+new tables resulting of the split should also be written back.
+
+Failure to do so can allow devices behind the IOMMU access to the
+stale super page, or cause coherency issues as changes made by the
+processor to the page tables are not visible to the IOMMU.
+
+This allows to remove the VT-d specific iommu_pte_flush helper, since
+the cache write back is now performed by atomic_write_ept_entry, and
+hence iommu_iotlb_flush can be used to flush the IOMMU TLB. The newly
+used method (iommu_iotlb_flush) can result in less flushes, since it
+might sometimes be called rightly with 0 flags, in which case it
+becomes a no-op.
+
+This is part of XSA-321.
+
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/arch/x86/mm/p2m-ept.c.orig
++++ xen/arch/x86/mm/p2m-ept.c
+@@ -90,6 +90,19 @@ static int atomic_write_ept_entry(ept_en
+
+ write_atomic(&entryptr->epte, new.epte);
+
++ /*
++ * The recalc field on the EPT is used to signal either that a
++ * recalculation of the EMT field is required (which doesn't effect the
++ * IOMMU), or a type change. Type changes can only be between ram_rw,
++ * logdirty and ioreq_server: changes to/from logdirty won't work well with
++ * an IOMMU anyway, as IOMMU #PFs are not synchronous and will lead to
++ * aborts, and changes to/from ioreq_server are already fully flushed
++ * before returning to guest context (see
++ * XEN_DMOP_map_mem_type_to_ioreq_server).
++ */
++ if ( !new.recalc && iommu_hap_pt_share )
++ iommu_sync_cache(entryptr, sizeof(*entryptr));
++
+ if ( unlikely(oldmfn != mfn_x(INVALID_MFN)) )
+ put_page(mfn_to_page(_mfn(oldmfn)));
+
+@@ -319,6 +332,9 @@ static bool_t ept_split_super_page(struc
+ break;
+ }
+
++ if ( iommu_hap_pt_share )
++ iommu_sync_cache(table, EPT_PAGETABLE_ENTRIES * sizeof(ept_entry_t));
++
+ unmap_domain_page(table);
+
+ /* Even failed we should install the newly allocated ept page. */
+@@ -875,7 +894,7 @@ out:
+ need_modify_vtd_table )
+ {
+ if ( iommu_hap_pt_share )
+- rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
++ rc = iommu_flush_iotlb(d, gfn, vtd_pte_present, 1u << order);
+ else
+ {
+ if ( iommu_flags )
+--- xen/drivers/passthrough/vtd/iommu.c.orig
++++ xen/drivers/passthrough/vtd/iommu.c
+@@ -612,10 +612,8 @@ static int __must_check iommu_flush_all(
+ return rc;
+ }
+
+-static int __must_check iommu_flush_iotlb(struct domain *d,
+- unsigned long gfn,
+- bool_t dma_old_pte_present,
+- unsigned int page_count)
++int iommu_flush_iotlb(struct domain *d, unsigned long gfn,
++ bool dma_old_pte_present, unsigned int page_count)
+ {
+ struct domain_iommu *hd = dom_iommu(d);
+ struct acpi_drhd_unit *drhd;
+@@ -1880,53 +1878,6 @@ static int __must_check intel_iommu_unma
+ return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
+ }
+
+-int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+- int order, int present)
+-{
+- struct acpi_drhd_unit *drhd;
+- struct iommu *iommu = NULL;
+- struct domain_iommu *hd = dom_iommu(d);
+- bool_t flush_dev_iotlb;
+- int iommu_domid;
+- int rc = 0;
+-
+- iommu_sync_cache(pte, sizeof(struct dma_pte));
+-
+- for_each_drhd_unit ( drhd )
+- {
+- iommu = drhd->iommu;
+- if ( !test_bit(iommu->index, &hd->arch.iommu_bitmap) )
+- continue;
+-
+- flush_dev_iotlb = !!find_ats_dev_drhd(iommu);
+- iommu_domid= domain_iommu_domid(d, iommu);
+- if ( iommu_domid == -1 )
+- continue;
+-
+- rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
+- (paddr_t)gfn << PAGE_SHIFT_4K,
+- order, !present, flush_dev_iotlb);
+- if ( rc > 0 )
+- {
+- iommu_flush_write_buffer(iommu);
+- rc = 0;
+- }
+- }
+-
+- if ( unlikely(rc) )
+- {
+- if ( !d->is_shutting_down && printk_ratelimit() )
+- printk(XENLOG_ERR VTDPREFIX
+- " d%d: IOMMU pages flush failed: %d\n",
+- d->domain_id, rc);
+-
+- if ( !is_hardware_domain(d) )
+- domain_crash(d);
+- }
+-
+- return rc;
+-}
+-
+ static int __init vtd_ept_page_compatible(struct iommu *iommu)
+ {
+ u64 ept_cap, vtd_cap = iommu->cap;
+--- xen/include/asm-x86/iommu.h.orig
++++ xen/include/asm-x86/iommu.h
+@@ -87,8 +87,9 @@ int iommu_setup_hpet_msi(struct msi_desc
+
+ /* While VT-d specific, this must get declared in a generic header. */
+ int adjust_vtd_irq_affinities(void);
+-int __must_check iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+- int order, int present);
++int __must_check iommu_flush_iotlb(struct domain *d, unsigned long gfn,
++ bool dma_old_pte_present,
++ unsigned int page_count);
+ bool_t iommu_supports_eim(void);
+ int iommu_enable_x2apic_IR(void);
+ void iommu_disable_x2apic_IR(void);
Index: pkgsrc/sysutils/xenkernel411/patches/patch-XSA328
diff -u /dev/null pkgsrc/sysutils/xenkernel411/patches/patch-XSA328:1.1
--- /dev/null Thu Jul 16 09:57:17 2020
+++ pkgsrc/sysutils/xenkernel411/patches/patch-XSA328 Thu Jul 16 09:57:17 2020
@@ -0,0 +1,213 @@
+$NetBSD: patch-XSA328,v 1.1 2020/07/16 09:57:17 bouyer Exp $
+
+From: Jan Beulich <jbeulich%suse.com@localhost>
+Subject: x86/EPT: ept_set_middle_entry() related adjustments
+
+ept_split_super_page() wants to further modify the newly allocated
+table, so have ept_set_middle_entry() return the mapped pointer rather
+than tearing it down and then getting re-established right again.
+
+Similarly ept_next_level() wants to hand back a mapped pointer of
+the next level page, so re-use the one established by
+ept_set_middle_entry() in case that path was taken.
+
+Pull the setting of suppress_ve ahead of insertion into the higher level
+table, and don't have ept_split_super_page() set the field a 2nd time.
+
+This is part of XSA-328.
+
+Signed-off-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/arch/x86/mm/p2m-ept.c.orig
++++ xen/arch/x86/mm/p2m-ept.c
+@@ -228,8 +228,9 @@ static void ept_p2m_type_to_flags(struct
+ #define GUEST_TABLE_SUPER_PAGE 2
+ #define GUEST_TABLE_POD_PAGE 3
+
+-/* Fill in middle levels of ept table */
+-static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
++/* Fill in middle level of ept table; return pointer to mapped new table. */
++static ept_entry_t *ept_set_middle_entry(struct p2m_domain *p2m,
++ ept_entry_t *ept_entry)
+ {
+ mfn_t mfn;
+ ept_entry_t *table;
+@@ -237,7 +238,12 @@ static int ept_set_middle_entry(struct p
+
+ mfn = p2m_alloc_ptp(p2m, 0);
+ if ( mfn_eq(mfn, INVALID_MFN) )
+- return 0;
++ return NULL;
++
++ table = map_domain_page(mfn);
++
++ for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
++ table[i].suppress_ve = 1;
+
+ ept_entry->epte = 0;
+ ept_entry->mfn = mfn_x(mfn);
+@@ -249,14 +255,7 @@ static int ept_set_middle_entry(struct p
+
+ ept_entry->suppress_ve = 1;
+
+- table = map_domain_page(mfn);
+-
+- for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+- table[i].suppress_ve = 1;
+-
+- unmap_domain_page(table);
+-
+- return 1;
++ return table;
+ }
+
+ /* free ept sub tree behind an entry */
+@@ -294,10 +293,10 @@ static bool_t ept_split_super_page(struc
+
+ ASSERT(is_epte_superpage(ept_entry));
+
+- if ( !ept_set_middle_entry(p2m, &new_ept) )
++ table = ept_set_middle_entry(p2m, &new_ept);
++ if ( !table )
+ return 0;
+
+- table = map_domain_page(_mfn(new_ept.mfn));
+ trunk = 1UL << ((level - 1) * EPT_TABLE_ORDER);
+
+ for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+@@ -308,7 +307,6 @@ static bool_t ept_split_super_page(struc
+ epte->sp = (level > 1);
+ epte->mfn += i * trunk;
+ epte->snp = (iommu_enabled && iommu_snoop);
+- epte->suppress_ve = 1;
+
+ ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
+
+@@ -347,8 +345,7 @@ static int ept_next_level(struct p2m_dom
+ ept_entry_t **table, unsigned long *gfn_remainder,
+ int next_level)
+ {
+- unsigned long mfn;
+- ept_entry_t *ept_entry, e;
++ ept_entry_t *ept_entry, *next = NULL, e;
+ u32 shift, index;
+
+ shift = next_level * EPT_TABLE_ORDER;
+@@ -373,19 +370,17 @@ static int ept_next_level(struct p2m_dom
+ if ( read_only )
+ return GUEST_TABLE_MAP_FAILED;
+
+- if ( !ept_set_middle_entry(p2m, ept_entry) )
++ next = ept_set_middle_entry(p2m, ept_entry);
++ if ( !next )
+ return GUEST_TABLE_MAP_FAILED;
+- else
+- e = atomic_read_ept_entry(ept_entry); /* Refresh */
++ /* e is now stale and hence may not be used anymore below. */
+ }
+-
+ /* The only time sp would be set here is if we had hit a superpage */
+- if ( is_epte_superpage(&e) )
++ else if ( is_epte_superpage(&e) )
+ return GUEST_TABLE_SUPER_PAGE;
+
+- mfn = e.mfn;
+ unmap_domain_page(*table);
+- *table = map_domain_page(_mfn(mfn));
++ *table = next ?: map_domain_page(_mfn(e.mfn));
+ *gfn_remainder &= (1UL << shift) - 1;
+ return GUEST_TABLE_NORMAL_PAGE;
+ }
+From: <security%xenproject.org@localhost>
+Subject: x86/ept: atomically modify entries in ept_next_level
+
+ept_next_level was passing a live PTE pointer to ept_set_middle_entry,
+which was then modified without taking into account that the PTE could
+be part of a live EPT table. This wasn't a security issue because the
+pages returned by p2m_alloc_ptp are zeroed, so adding such an entry
+before actually initializing it didn't allow a guest to access
+physical memory addresses it wasn't supposed to access.
+
+This is part of XSA-328.
+
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/arch/x86/mm/p2m-ept.c.orig
++++ xen/arch/x86/mm/p2m-ept.c
+@@ -348,6 +348,8 @@ static int ept_next_level(struct p2m_dom
+ ept_entry_t *ept_entry, *next = NULL, e;
+ u32 shift, index;
+
++ ASSERT(next_level);
++
+ shift = next_level * EPT_TABLE_ORDER;
+
+ index = *gfn_remainder >> shift;
+@@ -364,16 +366,20 @@ static int ept_next_level(struct p2m_dom
+
+ if ( !is_epte_present(&e) )
+ {
++ int rc;
++
+ if ( e.sa_p2mt == p2m_populate_on_demand )
+ return GUEST_TABLE_POD_PAGE;
+
+ if ( read_only )
+ return GUEST_TABLE_MAP_FAILED;
+
+- next = ept_set_middle_entry(p2m, ept_entry);
++ next = ept_set_middle_entry(p2m, &e);
+ if ( !next )
+ return GUEST_TABLE_MAP_FAILED;
+- /* e is now stale and hence may not be used anymore below. */
++
++ rc = atomic_write_ept_entry(ept_entry, e, next_level);
++ ASSERT(rc == 0);
+ }
+ /* The only time sp would be set here is if we had hit a superpage */
+ else if ( is_epte_superpage(&e) )
+
+this has to be applied after patch-XSA328
+
+From: <security%xenproject.org@localhost>
+Subject: x86/ept: flush cache when modifying PTEs and sharing page tables
+
+Modifications made to the page tables by EPT code need to be written
+to memory when the page tables are shared with the IOMMU, as Intel
+IOMMUs can be non-coherent and thus require changes to be written to
+memory in order to be visible to the IOMMU.
+
+In order to achieve this make sure data is written back to memory
+after writing an EPT entry when the recalc bit is not set in
+atomic_write_ept_entry. If such bit is set, the entry will be
+adjusted and atomic_write_ept_entry will be called a second time
+without the recalc bit set. Note that when splitting a super page the
+new tables resulting of the split should also be written back.
+
+Failure to do so can allow devices behind the IOMMU access to the
+stale super page, or cause coherency issues as changes made by the
+processor to the page tables are not visible to the IOMMU.
+
+This allows to remove the VT-d specific iommu_pte_flush helper, since
+the cache write back is now performed by atomic_write_ept_entry, and
+hence iommu_iotlb_flush can be used to flush the IOMMU TLB. The newly
+used method (iommu_iotlb_flush) can result in less flushes, since it
+might sometimes be called rightly with 0 flags, in which case it
+becomes a no-op.
+
+This is part of XSA-321.
+
+Reviewed-by: Jan Beulich <jbeulich%suse.com@localhost>
+
+--- xen/arch/x86/mm/p2m-ept.c.orig
++++ xen/arch/x86/mm/p2m-ept.c
+@@ -394,6 +394,9 @@
+ if ( !next )
+ return GUEST_TABLE_MAP_FAILED;
+
++ if ( iommu_hap_pt_share )
++ iommu_sync_cache(next, EPT_PAGETABLE_ENTRIES * sizeof(ept_entry_t));
++
+ rc = atomic_write_ept_entry(ept_entry, e, next_level);
+ ASSERT(rc == 0);
+ }
Home |
Main Index |
Thread Index |
Old Index