Re: [PATCH 1/2] KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved

From: Dan Williams
Date: Thu Nov 07 2019 - 10:36:59 EST


On Thu, Nov 7, 2019 at 3:12 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 07/11/19 06:48, Dan Williams wrote:
> >> How do mmu notifiers get held off by page references and does that
> >> machinery work with ZONE_DEVICE? Why is this not a concern for the
> >> VM_IO and VM_PFNMAP case?
> > Put another way, I see no protection against truncate/invalidate
> > afforded by a page pin. If you need guarantees that the page remains
> > valid in the VMA until KVM can install a mmu notifier that needs to
> > happen under the mmap_sem as far as I can see. Otherwise gup just
> > weakly asserts "this pinned page was valid in this vma at one point in
> > time".
>
> The MMU notifier is installed before gup, so any invalidation will be
> preceded by a call to the MMU notifier. In turn,
> invalidate_range_start/end is called with mmap_sem held so there should
> be no race.
>
> However, as Sean mentioned, early put_page of ZONE_DEVICE pages would be
> racy, because we need to keep the reference between the gup and the last
> time we use the corresponding struct page.

If KVM is establishing the mmu_notifier before gup, then there is
nothing left to do with that ZONE_DEVICE page, so I'm struggling to
see what further qualification of kvm_is_reserved_pfn() buys the
implementation.

However, if you're attracted to the explicitness of Sean's approach,
can I at least ask for comments asserting that KVM knows it already
holds a reference on that page so the is_zone_device_page() usage is
safe?
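
To illustrate the kind of comment being asked for, here is a rough
user-space sketch, not kernel code. The struct page layout, the mock
memory map, and the stubbed pfn_valid() / pfn_to_page() / page_count() /
is_zone_device_page() helpers are all stand-ins invented for this
example, and kvm_is_zone_device_pfn() is a hypothetical helper name. It
only shows where the "caller already holds a reference" assumption
could be documented and checked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; NOT the real kernel definitions. */
typedef unsigned long kvm_pfn_t;

struct page {
	int refcount;      /* stand-in for the kernel page refcount */
	bool zone_device;  /* stand-in for the page's zone membership */
};

#define NR_MOCK_PAGES 4

/* Mock memory map: pfn 1 is a pinned ZONE_DEVICE page, pfn 2 is unpinned. */
static struct page mock_mem_map[NR_MOCK_PAGES] = {
	{ .refcount = 1, .zone_device = false },
	{ .refcount = 1, .zone_device = true  },
	{ .refcount = 0, .zone_device = true  },
	{ .refcount = 0, .zone_device = false },
};

/* Stubs named after the kernel helpers they imitate. */
static bool pfn_valid(kvm_pfn_t pfn) { return pfn < NR_MOCK_PAGES; }
static struct page *pfn_to_page(kvm_pfn_t pfn) { return &mock_mem_map[pfn]; }
static int page_count(const struct page *page) { return page->refcount; }
static bool is_zone_device_page(const struct page *page) { return page->zone_device; }

/* Hypothetical helper: is this pfn backed by a ZONE_DEVICE page? */
static bool kvm_is_zone_device_pfn(kvm_pfn_t pfn)
{
	struct page *page;

	if (!pfn_valid(pfn))
		return false;

	page = pfn_to_page(pfn);

	/*
	 * Assumption being documented: the caller already holds a
	 * reference on this page (e.g. taken via gup), so the answer
	 * from is_zone_device_page() cannot go stale underneath us.
	 * A zero refcount means the caller broke that contract; real
	 * kernel code might warn here rather than fail silently.
	 */
	if (page_count(page) == 0)
		return false;

	return is_zone_device_page(page);
}
```

With the mock map above, only pfn 1 (referenced and ZONE_DEVICE) is
reported as a ZONE_DEVICE pfn; the unreferenced ZONE_DEVICE page at
pfn 2 is rejected because the documented assumption does not hold.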

David and I are otherwise trying to reduce is_zone_device_page() usage
to easy-to-audit "obviously safe" cases and to convert the others with
additional synchronization.