Re: [RFC V1 5/5] x86: CVMs: Ensure that memory conversions happen at 2M alignment

From: Jeremi Piotrowski
Date: Fri Feb 02 2024 - 03:01:38 EST


On 02/02/2024 06:08, Vishal Annapurve wrote:
> On Thu, Feb 1, 2024 at 5:32 PM Jeremi Piotrowski
> <jpiotrowski@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>> On 01/02/2024 04:46, Vishal Annapurve wrote:
>>> On Wed, Jan 31, 2024 at 10:03 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>>>>
>>>> On 1/11/24 21:52, Vishal Annapurve wrote:
>>>>> @@ -2133,8 +2133,10 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>>>>>  	int ret;
>>>>>
>>>>>  	/* Should not be working on unaligned addresses */
>>>>> -	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
>>>>> -		addr &= PAGE_MASK;
>>>>> +	if (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
>>>>> +	    || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
>>>>> +			 "misaligned numpages: %#x\n", numpages))
>>>>> +		return -EINVAL;
>>>>
>>>> This series is talking about swiotlb and DMA, then this applies a
>>>> restriction to what I *thought* was a much more generic function:
>>>> __set_memory_enc_pgtable(). What prevents this function from getting
>>>> used on 4k mappings?
>>>>
>>>>
>>>
>>> The end goal here is to limit the conversion granularity to hugepage
>>> sizes. SWIOTLB allocations are the major source of unaligned
>>> allocations (and hence conversions) that need to be fixed before
>>> this goal can be achieved.
>>>
>>> This change will ensure that conversion fails for unaligned ranges, as
>>> I don't foresee a need for 4K-aligned conversions apart from DMA
>>> allocations.
>>
>> Hi Vishal,
>>
>> This assumption is wrong. set_memory_decrypted() is called from various
>> parts of the kernel: kexec, sev-guest, kvmclock, hyperv code. These
>> conversions are for non-DMA allocations that need to be done at 4KB
>> granularity because the data structures in question are page-sized.
>>
>> Thanks,
>> Jeremi
>
> Thanks Jeremi for pointing out these usecases.
>
> My brief analysis of these call sites:
> 1) machine_kexec_64.c, realmode/init.c, kvm/mmu/mmu.c - shared memory
> allocation/conversion happens when host-side memory encryption
> (CC_ATTR_HOST_MEM_ENCRYPT) is enabled.
> 2) kernel/kvmclock.c - shared memory allocation can be made 2M-aligned
> even if less memory is needed.
> 3) drivers/virt/coco/sev-guest/sev-guest.c,
> drivers/virt/coco/tdx-guest/tdx-guest.c - shared memory allocation can
> be made 2M-aligned even if less memory is needed.
>
> I admit I haven't analyzed the hyperv code in the context of these
> changes, but I will take a closer look to see whether its memory
> conversion calls fit the category of "shared memory allocation can be
> made 2M-aligned even if less memory is needed".
>
> Agreed that this patch should be modified to look something like the
> following (subject to more changes at the call sites):

No, this patch is still built on the wrong assumptions. You're trying
to alter a generic function in the guest for the constraints of a very
specific hypervisor + host userspace + memory backend combination.
That's not right.

Is the numpages check supposed to ensure that the guest *only* toggles
visibility in chunks of 2MB? Then you're exposing more memory to the host
than the guest intends: sharing a single 4K page would now decrypt 512
pages, 511 of which were never meant to be shared.
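
Here is roughly what that forces on a page-sized user such as kvmclock
or sev-guest - a sketch of the implied call-site pattern, not code from
this series:

	/* One 4K structure now costs a 2M allocation, all of it decrypted. */
	struct page *pg = alloc_pages(GFP_KERNEL | __GFP_ZERO,
				      get_order(PMD_SIZE));
	int ret;

	if (!pg)
		return -ENOMEM;
	/* an order-9 allocation is naturally 2M-aligned */
	ret = set_memory_decrypted((unsigned long)page_address(pg),
				   PMD_SIZE >> PAGE_SHIFT);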

If you must, focus on getting swiotlb conversions to happen at the desired
granularity, but don't try to force every single conversion to be >4K.
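
For swiotlb itself that could be as simple as rounding the pool up at
setup time and converting it once. A rough sketch (nslabs/vstart stand
in for whatever the pool setup path uses, and the pool start would need
to be guaranteed 2M-aligned as well):

	/* Sketch: pay the 2M alignment cost once, for the whole pool. */
	unsigned long bytes = ALIGN(nslabs << IO_TLB_SHIFT, PMD_SIZE);
	int ret;

	ret = set_memory_decrypted((unsigned long)vstart,
				   bytes >> PAGE_SHIFT);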

Thanks,
Jeremi


>
> =============
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index e9b448d1b1b7..8c608d6913c4 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2132,10 +2132,16 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>  	struct cpa_data cpa;
>  	int ret;
>
>  	/* Should not be working on unaligned addresses */
>  	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
>  		addr &= PAGE_MASK;
>
> +	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
> +	    (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
> +	     || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
> +			  "misaligned numpages: %#x\n", numpages)))
> +		return -EINVAL;
> +
>  	memset(&cpa, 0, sizeof(cpa));
>  	cpa.vaddr = &addr;
>  	cpa.numpages = numpages;