Re: [RFC PATCH 8/8] kvm: gmem: Allow restricted userspace mappings

From: Patrick Roy
Date: Wed Jul 10 2024 - 05:52:02 EST




On 7/9/24 22:13, David Hildenbrand wrote:
> On 09.07.24 16:48, Fuad Tabba wrote:
>> Hi Patrick,
>>
>> On Tue, Jul 9, 2024 at 2:21 PM Patrick Roy <roypat@xxxxxxxxxxxx> wrote:
>>>
>>> Allow mapping guest_memfd into userspace. Since AS_INACCESSIBLE is set
>>> on the underlying address_space struct, no GUP of guest_memfd will be
>>> possible.
>>
>> This patch allows mapping guest_memfd() unconditionally. Even if it's
>> not guppable, there are other reasons why you wouldn't want to allow
>> this. Maybe a config flag to gate it? e.g.,
>
>
> As discussed with Jason, maybe not the direction we want to take with
> guest_memfd.
> If it's private memory, it shall not be mapped. Also not via magic
> config options.
>
> We'll likely discuss some of that in the MM meeting tomorrow, I guess
> (having both shared and private memory in guest_memfd).

Oh, nice. I'm assuming you mean this meeting:
https://lore.kernel.org/linux-mm/197a2f19-c71c-fbde-a62a-213dede1f4fd@xxxxxxxxxx/T/?
Would it be okay if I also attend? I see it also mentions huge pages,
which is another thing we are interested in, actually :)

> Note that just from staring at this commit, I don't understand the
> motivation *why* we would want to do that.

Fair - I admittedly didn't get into that as much as I probably should
have. In our use case, we do not have anything that pKVM would (I think)
call "guest-private" memory. I think our memory is better described as
guest-owned, but always shared with the VMM (i.e. userspace) and
ideally never shared with the host kernel. This model lets us make a
lot of simplifying assumptions: things like I/O can be handled in
userspace without the guest explicitly sharing I/O buffers (which is
not exactly what we would want long-term anyway, as sharing in the
guest_memfd context means sharing with the host kernel), and we can
easily do VM snapshotting without needing things like TDX's
TDH.EXPORT.MEM APIs, etc.

> --
> Cheers,
>
> David / dhildenb
>