Re: [PATCH v2 2/5] x86/kexec: do unconditional WBINVD in relocate_kernel()

From: Tom Lendacky
Date: Thu Mar 21 2024 - 17:02:31 EST


On 3/20/24 18:10, Kirill A. Shutemov wrote:
On Thu, Mar 21, 2024 at 09:48:28AM +1300, Huang, Kai wrote:

Hi Tom,

I am not aware of the kexec() support status for SEV-ES/SEV-SNP guests.
Does patch 1 break them?

SNP guests can kexec with some patches that are currently in progress
around shared-to-private memory conversions. ES guests can only kexec
with a single vCPU; there was a recent patch series to add support for
multiple vCPUs.

Patch #1 doesn't break either ES or SNP because we still have an IDT and
traditional kernel addressing in place, so the #VC can be handled.

How about plain SEV guest?


Whereas patch #2 has switched to identity mapping and removed the IDT,
so a #VC causes a triple fault.
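
For reference, the WBINVD on the relocate_kernel() path is currently gated on
host memory encryption, so CoCo guests never reach it today. Roughly (a
simplified sketch only -- kexec_flush_caches() is an illustrative name, not an
upstream function, though cc_platform_has()/CC_ATTR_HOST_MEM_ENCRYPT and
native_wbinvd() are the real helpers):

/*
 * Illustrative sketch of the pre-series behaviour on the relocate_kernel()
 * path: the cache flush only happens when host memory encryption (SME) is
 * active, which is never the case inside a guest.  Making the WBINVD
 * unconditional is what raises the #VC/#VE question discussed here.
 */
static inline void kexec_flush_caches(bool host_mem_enc_active)
{
	/* host_mem_enc_active comes from cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT) */
	if (host_mem_enc_active)
		native_wbinvd();	/* intercepted -> #VC/#VE inside a CoCo guest */
}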

That makes sense. Thanks.

Hi Kirill,

Does a TDX guest have similar behaviour -- that the WBINVD in stop_this_cpu()
can be handled although it causes a #VE, while the WBINVD in relocate_kernel()
will just triple fault the guest?

No. We never handle a WBINVD #VE. The guest cannot handle WBINVD itself,
and the only option is to ask the host to do it. We cannot guarantee the
host will do anything useful with the request.

Is the WBINVD performed or ignored in that case?

I guess it can be a potential attack vector if the host strategically
ignores the WBINVD to induce bad guest behaviour.
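
For reference, the kernel-mode #VE dispatch looks roughly like the below (a
simplified sketch loosely based on virt_exception_kernel() in
arch/x86/coco/tdx/tdx.c, with several handled exit reasons elided); there is
no EXIT_REASON_WBINVD case, so a WBINVD #VE falls through to the default
branch and is treated as unexpected:

static int virt_exception_kernel(struct pt_regs *regs, struct ve_info *ve)
{
	switch (ve->exit_reason) {
	case EXIT_REASON_HLT:
		return handle_halt(ve);
	case EXIT_REASON_MSR_READ:
		return read_msr(regs, ve);
	case EXIT_REASON_MSR_WRITE:
		return write_msr(regs, ve);
	case EXIT_REASON_CPUID:
		return handle_cpuid(regs, ve);
	case EXIT_REASON_IO_INSTRUCTION:
		return handle_io(regs, ve);
	/* ... MMIO/EPT-violation handling elided ... */
	default:
		/* A WBINVD #VE lands here: unsupported, the guest dies */
		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
		return -EIO;
	}
}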

With SNP, memory is coherent, so there isn't a need for a WBINVD within the guest, and issuing it should not be an issue whether the hypervisor performs the operation or not. I don't know what would happen in the case where, say, a non-coherent TDISP device is attached, but that would be very unusual/unlikely.


And it is not good from the host's PoV either. If it does WBINVD on every
guest request, we get a guest->host DoS attack possibility.

Yeah, that can happen today, regardless of the type of VM running.


Tom, I am curious, how do you deal with these problems?

If the WBINVD is being intercepted, then it will generate a #VC, and we use the GHCB protocol to communicate that back to the hypervisor to handle.
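
For completeness, in current kernels that handler is tiny -- roughly the
following (a sketch based on vc_handle_wbinvd() in
arch/x86/kernel/sev-shared.c; the exact signature of sev_es_ghcb_hv_call()
has shifted across releases, so treat this as illustrative):

static enum es_result vc_handle_wbinvd(struct ghcb *ghcb,
				       struct es_em_ctxt *ctxt)
{
	/* Nothing to marshal beyond the exit code itself */
	return sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_WBINVD, 0, 0);
}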

Thanks,
Tom