Re: [PATCH v2 4/8] x86/sev: Enable PVALIDATE for PFNs without a valid virtual address

From: Edgecombe, Rick P
Date: Tue Nov 28 2023 - 13:59:40 EST


On Tue, 2023-11-28 at 18:08 +0000, Michael Kelley wrote:
> >
> > Sort of separately, if those vmalloc objections can't be worked
> > through, did you consider doing something like text_poke() does
> > (create the temporary mapping in a temporary MM) for pvalidate
> > purposes? I don't know enough about what kind of special exceptions
> > might pop up during that operation though, might be playing with
> > fire...
>
> Interesting idea.  But from a quick glance at the text_poke() code,
> such an approach seems somewhat complex, and I suspect it will have
> the same perf issues (or worse) as creating a new vmalloc area for
> each PVALIDATE invocation.

Using new vmalloc areas will eventually result in a kernel TLB
shootdown, but usually involves no flushes. text_poke() will always
result in a local-only flush. So at least whatever slowdown there is
would only affect the calling thread.

As for complexity, I think it might be simple to implement actually.
What kind of special exceptions could come out of pvalidate, I'm not so
sure. But the kernel terminates the VM on failure anyway, so maybe it's
not an issue?

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 73be3931e4f0..a13293564eeb 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1905,6 +1905,16 @@ void *text_poke(void *addr, const void *opcode, size_t len)
 	return __text_poke(text_poke_memcpy, addr, opcode, len);
 }
 
+static void text_poke_pvalidate(void *dst, const void *src, size_t len)
+{
+	pvalidate(dst, len, true); // if fail, terminate
+}
+
+void *pvalidated_poke(void *addr)
+{
+	return __text_poke(text_poke_pvalidate, addr, NULL, PAGE_SIZE);
+}
+
 /**
  * text_poke_kgdb - Update instructions on a live kernel by kgdb
  * @addr: address to modify



>
> At this point, the complexity of creating the temp mapping for
> PVALIDATE is seeming excessive.  On balance it seems simpler to
> revert to an approach where the use of set_memory_np() and
> set_memory_p() is conditional.  It would be necessary when #VC
> and #VE exceptions are directed to a paravisor.  (This assumes the
> paravisor interface in the hypervisor callbacks does the natural
> thing of working with physical addresses, so there's no need for a
> temp mapping.)
>
> Optionally, the set_memory_np()/set_memory_p() approach could
> be used in other cases where the hypervisor callbacks work with
> physical addresses.  But it can't be used with cases where the
> hypervisor callbacks need valid virtual addresses.
>
> So on net, set_memory_np()/set_memory_p() would be used in
> the Hyper-V cases of TDX and SEV-SNP with a paravisor.   It could
> optionally be used with TDX with no paravisor, but my sense is
> that Kirill wants to keep TDX "as is" and let the exception handlers
> do the load_unaligned_zeropad() fixup.
>
> It could not be used with SEV-SNP with no paravisor.  Additional
> fixes may be needed on the SEV-SNP side to properly fixup
> load_unaligned_zeropad() accesses to a page that's in transition
> between encrypted and decrypted.
>
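
Just to check my reading of the cases above, here they are as a tiny
standalone C sketch. This is a userspace illustration only, not kernel
code, and the names (cvm_kind, np_policy, np_policy_for) are made up
for this mail; nothing like them exists in the tree as far as I know.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum cvm_kind { CVM_TDX, CVM_SEV_SNP };

enum np_policy {
	NP_REQUIRED,	/* set_memory_np()/set_memory_p() would be used */
	NP_OPTIONAL,	/* could be used: callbacks take physical addresses */
	NP_UNUSABLE,	/* can't be used: callbacks need a valid virtual address */
};

/*
 * Encode the cases from the quoted proposal:
 *  - paravisor (Hyper-V TDX or SEV-SNP): set_memory_np()/_p() is used
 *  - TDX, no paravisor: optional
 *  - SEV-SNP, no paravisor: not usable
 */
static enum np_policy np_policy_for(enum cvm_kind kind, bool paravisor)
{
	if (paravisor)
		return NP_REQUIRED;
	if (kind == CVM_TDX)
		return NP_OPTIONAL;
	return NP_UNUSABLE;
}
```

If I have that right, only the last case (SEV-SNP with no paravisor)
is left needing a valid virtual address for PVALIDATE.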

Yeah, I don't know about this paravisor/exception stuff.