Re: [PATCH Part2 v5 08/45] x86/fault: Add support to handle the RMP fault for user address

From: Tom Lendacky
Date: Wed Aug 25 2021 - 09:50:23 EST


On 8/25/21 4:16 AM, Vlastimil Babka wrote:
> On 8/24/21 18:42, Joerg Roedel wrote:
>> On Mon, Aug 23, 2021 at 07:50:22AM -0700, Dave Hansen wrote:
>>> It *has* to be done in KVM, IMNHO.
>>>
>>> The core kernel really doesn't know much about SEV. It *really* doesn't
>>> know when its memory is being exposed to a virtualization architecture
>>> that doesn't know how to split TLBs like every single one before it.
>>>
>>> This essentially *must* be done at the time that the KVM code realizes
>>> that it's being asked to shove a non-splittable page mapping into the
>>> SEV hardware structures.
>>>
>>> The only other alternative is raising a signal from the fault handler
>>> when the page can't be split. That's a *LOT* nastier because it's so
>>> much later in the process.
>>>
>>> It's either that, or figure out a way to split hugetlbfs (and DAX)
>>> mappings in a failsafe way.
>>
>> Yes, I agree with that. KVM needs a check to disallow HugeTLB pages in
>> SEV-SNP guests, at least as a temporary workaround. When HugeTLBfs
>> mappings can be split into smaller pages the check can be removed.
>
> FTR, this is Sean's reply with concerns in v4:
> https://lore.kernel.org/linux-coco/YPCuTiNET%2FhJHqOY@google.com/
>
> I think there are two main arguments there:
> - it's not KVM business to decide
> - guest may do all page state changes with 2MB granularity so it might be fine
> with hugetlb
>
> The latter might become true, but I think it's more probable that sooner
> hugetlbfs will learn to split the mappings to base pages - I know people plan to
> work on that. At that point qemu will have to recognize if the host kernel is
> the new one that can do this splitting vs older one that can't. Preferably
> without relying on kernel version number, as backports exist. Thus, trying to
> register a hugetlbfs range that either is rejected (kernel can't split) or
> passes (kernel can split) seems like a straightforward way. So I'm also in favor
> of adding that, hopefully temporary, check.

If that's the direction taken, I think we'd be able to use a KVM_CAP_
value that can be queried by the VMM to make the determination.
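
As a rough sketch of what the VMM side of such a query might look like,
assuming the usual KVM_CHECK_EXTENSION ioctl is used: the capability
name and number below are hypothetical placeholders (nothing has been
defined in this series), and the helper names are made up for
illustration only.

```c
#include <stdbool.h>
#include <sys/ioctl.h>

/*
 * KVM_CHECK_EXTENSION is _IO(KVMIO, 0x03) with KVMIO == 0xAE in
 * <linux/kvm.h>; it is spelled out here only so the sketch builds
 * without the kernel UAPI headers installed.
 */
#ifndef KVM_CHECK_EXTENSION
#define KVM_CHECK_EXTENSION 0xAE03
#endif

/*
 * HYPOTHETICAL: no such capability exists. Both the name and the
 * number are placeholders standing in for whatever KVM_CAP_ value
 * would eventually be allocated.
 */
#define KVM_CAP_HYPOTHETICAL_SNP_HUGETLB 0x7fff0001L

/*
 * KVM_CHECK_EXTENSION returns 0 when a capability is unsupported and a
 * positive value when it is supported. A negative return (e.g. bad fd)
 * is treated as "not supported" here.
 */
static bool kvm_has_cap(int kvm_fd, long cap)
{
	return ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap) > 0;
}

/*
 * Assumed VMM policy: only back an SEV-SNP guest with hugetlbfs when
 * the host kernel advertises that it can cope with such mappings.
 */
static bool vmm_may_use_hugetlbfs_for_snp(int kvm_fd)
{
	return kvm_has_cap(kvm_fd, KVM_CAP_HYPOTHETICAL_SNP_HUGETLB);
}
```

The VMM would call this on its /dev/kvm file descriptor before
registering the memory range, which avoids relying on the kernel
version number and so keeps working with backports.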

Thanks,
Tom

>
> Vlastimil
>
>> Regards,
>>
>> Joerg
>>
>