Re: [PATCH v3 64/75] x86/sev-es: Cache CPUID results for improved performance

From: Tom Lendacky
Date: Wed May 06 2020 - 19:03:09 EST




On 5/6/20 1:08 PM, Mike Stunes wrote:


On Apr 28, 2020, at 8:17 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:

From: Mike Stunes <mstunes@xxxxxxxxxx>

To avoid a future VMEXIT for a subsequent CPUID function, cache the
results returned by CPUID into an xarray.

[tl: coding standard changes, register zero extension]

Signed-off-by: Mike Stunes <mstunes@xxxxxxxxxx>
Signed-off-by: Tom Lendacky <thomas.lendacky@xxxxxxx>
[ jroedel@xxxxxxx: - Wrapped cache handling into vc_handle_cpuid_cached()
- Used lower_32_bits() where applicable
- Moved cache_index out of struct es_em_ctxt ]
Co-developed-by: Joerg Roedel <jroedel@xxxxxxx>
Signed-off-by: Joerg Roedel <jroedel@xxxxxxx>
---
arch/x86/kernel/sev-es-shared.c | 12 ++--
arch/x86/kernel/sev-es.c | 119 +++++++++++++++++++++++++++++++-
2 files changed, 124 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 03095bc7b563..0303834d4811 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -744,6 +758,91 @@ static enum es_result vc_handle_mmio(struct ghcb *ghcb,
return ret;
}

+static unsigned long sev_es_get_cpuid_cache_index(struct es_em_ctxt *ctxt)
+{
+ unsigned long hi, lo;
+
+ /* Don't attempt to cache until the xarray is initialized */
+ if (!sev_es_cpuid_cache_initialized)
+ return ULONG_MAX;
+
+ lo = lower_32_bits(ctxt->regs->ax);
+
+ /*
+ * CPUID 0x0000000d requires both RCX and XCR0, so it can't be
+ * cached.
+ */
+ if (lo == 0x0000000d)
+ return ULONG_MAX;
+
+ /*
+ * Some callers of CPUID don't always set RCX to zero for CPUID
+ * functions that don't require RCX, which can result in excessive
+ * cached values, so RCX needs to be manually zeroed for use as part
+ * of the cache index. Future CPUID values may need RCX, but since
+ * they can't be known, they must not be cached.
+ */
+ if (lo > 0x80000020)
+ return ULONG_MAX;

If the cache is shared across CPUs, do we also need to exclude function 0x1 because it contains the LAPIC ID? (Or is the cache per-CPU?)

It's currently not a per-CPU cache, but given what you pointed out, it should be if we're going to keep function 0x1 in there. The question will be how often is that CPUID issued, as that would determine if (not) caching it matters. Or even how often CPUID is issued and whether the xarray lock could be under contention if the cache is not per-CPU.

Thanks,
Tom