On Wed, Mar 13, 2024, Christian König wrote:
> On 13.03.24 at 05:55, David Stevens wrote:
> > On Thu, Feb 29, 2024 at 10:36 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > On Thu, Feb 29, 2024 at 11:57:51AM +0900, David Stevens wrote:
> > > > Our use case is virtio-gpu blob resources [1], which directly map host
> > > > graphics buffers into the guest as "vram" for the virtio-gpu device.
> > > > This feature currently does not work on systems using the amdgpu driver,
> > > > as that driver allocates non-compound higher order pages via
> > > > ttm_pool_alloc_page().
> > >
> > > .. and just as last time around that is still the problem that needs
> > > to be fixed instead of creating a monster like this to map
> > > non-refcounted pages.
> >
> > Patches to amdgpu have been NAKed [1] with the justification that
> > using non-refcounted pages is working as intended and KVM is in the
> > wrong for wanting to take references to pages mapped with VM_PFNMAP
> > [2].
> >
> > The existence of the VM_PFNMAP implies that the existence of
> > non-refcounted pages is working as designed. We can argue about
> > whether or not VM_PFNMAP should exist, but until VM_PFNMAP is removed,
> > KVM should be able to handle it. Also note that this is not adding a
> > new source of non-refcounted pages, so it doesn't make removing
> > non-refcounted pages more difficult, if the kernel does decide to go
> > in that direction.
>
> Well, the meaning of VM_PFNMAP is that you should not touch the underlying
> struct page the PTE is pointing to. As far as I can see this includes
> grabbing a reference count.
>
> But that isn't really the problem here. The issue is rather that KVM assumes
> that by grabbing a reference count to the page that the driver won't change
> the PTE to point somewhere else.. And that is simply not true.

No, KVM doesn't assume that.

> So what KVM needs to do is to either have an MMU notifier installed so that
> updates to the PTEs on the host side are reflected immediately to the PTEs
> on the guest side.

KVM already has an MMU notifier and reacts accordingly.

> Or (even better) you use hardware functionality like nested page tables so
> that we don't actually need to update the guest PTEs when the host PTEs
> change.

That's not how stage-2 page tables work.

> And when you have either of those two functionalities the requirement to add
> a long term reference to the struct page goes away completely. So when this
> is done right you don't need to grab a reference in the first place.

The KVM issue that this series is solving isn't that KVM grabs a reference, it's
that KVM assumes that any non-reserved pfn that is backed by "struct page" is
refcounted.

What Christoph is objecting to is that, in this series, KVM is explicitly adding
support for mapping non-compound (huge)pages into KVM guests. David is arguing
that Christoph's objection to _KVM_ adding support is unfair, because the real
problem is that the kernel already maps such pages into host userspace. I.e. if
the userspace mapping ceases to exist, then there are no mappings for KVM to follow
and propagate to KVM's stage-2 page tables.
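
To make the refcounting assumption concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel or KVM code: struct toy_page and every toy_* function are hypothetical names invented for illustration. It assumes, as the thread describes for ttm_pool_alloc_page(), that a non-compound higher-order allocation leaves only the first page holding a reference, so the remaining pages have a struct page and are not reserved, yet are not refcounted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct page; NOT the real layout. */
struct toy_page {
    bool reserved;  /* models the "reserved" page flag */
    int  refcount;  /* models the page reference count */
};

/*
 * Hypothetical model of a non-compound higher-order allocation:
 * only the first page carries a reference, the rest stay at zero.
 */
static void toy_alloc_non_compound(struct toy_page *pages, size_t nr)
{
    for (size_t i = 0; i < nr; i++) {
        pages[i].reserved = false;
        pages[i].refcount = (i == 0) ? 1 : 0;
    }
}

/*
 * The assumption under discussion: a pfn that is backed by a
 * struct page and is not reserved is treated as refcounted.
 */
static bool toy_assumed_refcounted(const struct toy_page *page)
{
    return page != NULL && !page->reserved;
}

/*
 * Models a "take a reference unless the count is zero" operation:
 * it fails on pages that were never refcounted in the first place.
 */
static bool toy_try_get_page(struct toy_page *page)
{
    if (page->refcount == 0)
        return false;
    page->refcount++;
    return true;
}
```

In this model, toy_assumed_refcounted() returns true for every page of such an allocation, while toy_try_get_page() succeeds only for the first one. That gap between "has a non-reserved struct page" and "is actually refcounted" is the mismatch the reply above points at.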