On Wed, Nov 11, 2020 at 11:29:37PM +0100, Alexander Graf wrote:
On 11.11.20 23:15, Joel Fernandes wrote:
On Wed, Nov 11, 2020 at 5:13 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
On Wed, Nov 11, 2020 at 5:00 PM Alexander Graf <graf@xxxxxxxxxx> wrote:
On 11.11.20 22:14, Joel Fernandes wrote:
Some hardware, such as certain AMD variants, does not have cross-HT MDS/L1TF
issues. Detect this and don't enable core scheduling, as it can
needlessly slow the device down.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index dece79e4d1e9..0e6e61e49b23 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -152,6 +152,14 @@ void __init check_bugs(void)
#endif
}
+/*
+ * Do not need core scheduling if CPU does not have MDS/L1TF vulnerability.
+ */
+int arch_allow_core_sched(void)
+{
+ return boot_cpu_has_bug(X86_BUG_MDS) || boot_cpu_has_bug(X86_BUG_L1TF);
Can we make this more generic and user settable, similar to the L1 cache
flushing modes in KVM?
I am not 100% convinced that there are no other thread sibling attacks
possible without MDS and L1TF. If I'm paranoid, I want to still be able
to force enable core scheduling.
In addition, we are also using core scheduling as a poor man's mechanism
to give customers consistent performance for virtual machine thread
siblings. This is important irrespective of CPU bugs. In such a
scenario, I want to force enable core scheduling.
Ok, I can make it a new kernel command line option with:
coresched=on
coresched=secure (only if HW has MDS/L1TF)
coresched=off
Also, I would keep "secure" as the default. (And we should probably
modify the informational messages in sysfs to reflect this.)
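For illustration only, the three proposed values could be parsed roughly as follows. This is a userspace sketch, not the kernel patch: the enum and the function name coresched_parse() are hypothetical, and only the option strings and the "secure" default come from the proposal above.

```c
#include <string.h>

/* Hypothetical mode names; the thread only specifies the option strings. */
enum coresched_mode {
	CORESCHED_OFF,
	CORESCHED_SECURE,	/* enable only if HW has MDS/L1TF */
	CORESCHED_ON,
};

/*
 * Parse the value part of "coresched=<value>".
 * Returns 0 and stores the mode on success, -1 on an unknown value.
 * A NULL argument (option not given) selects the proposed default, "secure".
 */
int coresched_parse(const char *arg, enum coresched_mode *mode)
{
	if (!arg || !strcmp(arg, "secure"))
		*mode = CORESCHED_SECURE;
	else if (!strcmp(arg, "on"))
		*mode = CORESCHED_ON;
	else if (!strcmp(arg, "off"))
		*mode = CORESCHED_OFF;
	else
		return -1;

	return 0;
}
```

An unknown value is rejected rather than silently mapped to a mode, so a typo on the command line would not change the security posture unnoticed; whether the real option should behave that way is an open choice.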
I agree that "secure" should be the default.
Ok.
Can we also integrate into the "mitigations" kernel command line[1] for this?
Sure, the integration into [1] sounds conceptually fine to me, however it is
not super straightforward. Like: what if the user wants to force-enable
core scheduling for the use case you mention, but still wants the cross-HT
mitigation because they are only tagging VMs (as in your use case) and not
other tasks? Idk.
The best thing to do could be to keep the "auto disable HT" controls and
logic separate from the "coresched=on" logic and let the user choose. The
exception is that coresched=secure means that, on HW that does not have
the vulnerability, we will not activate core scheduling.
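To make the intended behaviour concrete, here is a rough userspace sketch of the decision described above. The enum, the function name coresched_should_enable() and the boolean parameters are hypothetical; in the kernel the two flags would presumably come from boot_cpu_has_bug(X86_BUG_MDS) and boot_cpu_has_bug(X86_BUG_L1TF), as in the patch.

```c
#include <stdbool.h>

/* Hypothetical mode names mirroring coresched=off|secure|on. */
enum coresched_mode {
	CORESCHED_OFF,
	CORESCHED_SECURE,
	CORESCHED_ON,
};

/*
 * Decide whether core scheduling should be activated.
 *  - "on":     always, regardless of CPU bugs (force enable).
 *  - "secure": only when the CPU is affected by MDS or L1TF.
 *  - "off":    never.
 * This is deliberately independent of any "auto disable HT" control.
 */
bool coresched_should_enable(enum coresched_mode mode,
			     bool has_mds, bool has_l1tf)
{
	switch (mode) {
	case CORESCHED_ON:
		return true;
	case CORESCHED_SECURE:
		return has_mds || has_l1tf;
	case CORESCHED_OFF:
	default:
		return false;
	}
}
```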