Re: [PATCH Part2 RFC v2 10/37] x86/fault: Add support to handle the RMP fault for kernel address
From: Brijesh Singh
Date: Mon May 03 2021 - 11:37:46 EST
Hi Dave,
On 5/3/21 9:44 AM, Dave Hansen wrote:
> On 4/30/21 5:37 AM, Brijesh Singh wrote:
>> When SEV-SNP is enabled globally, a write from the host goes through the
>> RMP check. When the host writes to pages, hardware checks the following
>> conditions at the end of page walk:
>>
>> 1. Assigned bit in the RMP table is zero (i.e page is shared).
>> 2. If the page table entry that gives the sPA indicates that the target
>> page size is a large page, then all RMP entries for the 4KB
>> constituting pages of the target must have the assigned bit 0.
>> 3. Immutable bit in the RMP table is not zero.
>>
>> The hardware will raise page fault if one of the above conditions is not
>> met. A host should not encounter the RMP fault in normal execution, but
>> a malicious guest could trick the hypervisor into it. e.g., a guest does
>> not make the GHCB page shared, on #VMGEXIT, the hypervisor will attempt
>> to write to GHCB page.
> Is that the only case which is left? If so, why don't you simply split
> the direct map for GHCB pages before giving them to the guest? Or, map
> them with vmap() so that the mapping is always 4k?
The GHCB was just an example. Another example is a vfio driver accessing a
shared page. If those pages are not marked shared, then a kernel access
will cause an RMP fault. Ideally we should never run into this situation,
but if we do, I am trying to see how best we can avoid crashing the host.
Another reason for having this is to catch hypervisor bugs. During SNP
guest creation, KVM allocates a few backing pages and sets the assigned
bit for them (examples are the VMSA and the firmware context page). If the
hypervisor accidentally frees these pages without clearing the assigned
bit in the RMP table, a later host write to them will result in an RMP
fault and thus a kernel crash.
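
To make the failure mode concrete, here is a rough userspace model of the
check (not kernel code, and not the hardware's actual data layout). It only
models the assigned-bit conditions (1 and 2) from the patch description;
every name in it (model_rmp, model_host_write_ok, and so on) is made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES    1024  /* hypothetical size of the modelled RMP */
#define MODEL_PTRS_PER_2M 512   /* number of 4KB pages in one 2MB mapping */

/* Hypothetical, simplified RMP entry: only the assigned bit is modelled. */
struct model_rmp_entry {
	bool assigned;
};

static struct model_rmp_entry model_rmp[MODEL_NR_PAGES];

/* Stand-in for handing a page to the guest (assigned bit set). */
void model_assign_page(size_t pfn)
{
	assert(pfn < MODEL_NR_PAGES);
	model_rmp[pfn].assigned = true;
}

/* Stand-in for reclaiming a page (assigned bit cleared) before freeing it. */
void model_reclaim_page(size_t pfn)
{
	assert(pfn < MODEL_NR_PAGES);
	model_rmp[pfn].assigned = false;
}

/*
 * Returns true if a host write through a mapping that covers 'pfn' passes
 * the modelled check. For a large (2MB) mapping, every 4KB constituent
 * entry must have assigned == 0, so one assigned neighbour is enough to
 * make the write fail.
 */
bool model_host_write_ok(size_t pfn, bool large_mapping)
{
	size_t start = pfn;
	size_t end = pfn + 1;

	assert(pfn < MODEL_NR_PAGES);

	if (large_mapping) {
		start = pfn - (pfn % MODEL_PTRS_PER_2M);
		end = start + MODEL_PTRS_PER_2M;
	}

	for (size_t i = start; i < end; i++) {
		if (model_rmp[i].assigned)
			return false;	/* modelled RMP fault */
	}
	return true;
}
```

In this model, a page that is freed without model_reclaim_page() keeps
failing model_host_write_ok(), and with a large mapping it also takes its
511 neighbours down with it.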
>
> Or, worst case, you could use exception tables and something like
> copy_to_user() to write to the GHCB. That way, the thread doing the
> write can safely recover from the fault without the instruction actually
> ever finishing execution.
>
> BTW, I went looking through the spec. I didn't see anything about the
> guest being able to write the "Assigned" RMP bit. Did I miss that?
> Which of the above three conditions is triggered by the guest failing to
> make the GHCB page shared?
The GHCB spec section "Page State Change" provides an interface for the
guest to request a page state change. During boot, the guest uses the Page
State Change VMGEXIT to ask the hypervisor to make the page shared. The
hypervisor then uses the RMPUPDATE instruction to write the "assigned" bit
in the RMP table.
On VMGEXIT, the very first thing the vmgexit handler does is map the GHCB
page for access; later it uses copy_to_user() to sync the GHCB updates
from the hypervisor to the guest. The copy_to_user() will cause an RMP
fault if the GHCB is not mapped shared. As I explained above, the GHCB
page was just an example; vfio or others may also get into this situation.
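
As a rough illustration of the flow, here is a small userspace sketch (again
not kernel code). It assumes the fault can be turned into an error return,
the same way copy_to_user() reports the number of bytes it could not copy.
The ghcb_model_* names and the -1 return are made up for illustration; real
code would return -EFAULT and go through the actual RMPUPDATE path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GHCB_MODEL_SIZE 4096

/* Hypothetical GHCB page plus its modelled RMP assigned bit. */
struct ghcb_model_page {
	bool assigned;	/* true: guest-private, false: shared */
	unsigned char data[GHCB_MODEL_SIZE];
};

/*
 * Stand-in for the hypervisor handling a Page State Change request to
 * make the page shared (the real path uses RMPUPDATE).
 */
void ghcb_model_psc_make_shared(struct ghcb_model_page *page)
{
	page->assigned = false;
}

/*
 * Returns the number of bytes NOT copied, mirroring the copy_to_user()
 * convention. A write to an assigned page is treated as a recoverable
 * fault: nothing is written and the full length is reported as uncopied.
 */
size_t ghcb_model_copy_to(struct ghcb_model_page *dst, const void *src,
			  size_t len)
{
	assert(len <= GHCB_MODEL_SIZE);

	if (dst->assigned)
		return len;	/* modelled RMP fault */

	memcpy(dst->data, src, len);
	return 0;
}

/* Sync hypervisor updates into the GHCB; 0 on success, -1 on fault. */
int ghcb_model_sync(struct ghcb_model_page *ghcb, const void *update,
		    size_t len)
{
	if (ghcb_model_copy_to(ghcb, update, len))
		return -1;	/* kernel code would return -EFAULT here */
	return 0;
}
```

If the guest never issued the Page State Change request, ghcb_model_sync()
fails and the caller gets to decide what to do with the guest, instead of
the host taking an unhandled fault.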
-Brijesh