Re: [PATCH v7 3/8] KVM: Make __kvm_follow_pfn not imply FOLL_GET

From: David Stevens
Date: Thu Jul 06 2023 - 02:49:56 EST


On Wed, Jul 5, 2023 at 10:19 PM Zhi Wang <zhi.wang.linux@xxxxxxxxx> wrote:
>
> On Tue, 4 Jul 2023 16:50:48 +0900
> David Stevens <stevensd@xxxxxxxxxxxx> wrote:
>
> > From: David Stevens <stevensd@xxxxxxxxxxxx>
> >
> > Make it so that __kvm_follow_pfn does not imply FOLL_GET. This allows
> > callers to resolve a gfn when the associated pfn has a valid struct page
> > that isn't being actively refcounted (e.g. tail pages of non-compound
> > higher order pages). For a caller to safely omit FOLL_GET, all usages of
> > the returned pfn must be guarded by a mmu notifier.
> >
> > This also adds a is_refcounted_page out parameter to kvm_follow_pfn that
> > is set when the returned pfn has an associated struct page with a valid
> > refcount. Callers that don't pass FOLL_GET should remember this value
> > and use it to avoid places like kvm_is_ad_tracked_page that assume a
> > non-zero refcount.
> >
> > Signed-off-by: David Stevens <stevensd@xxxxxxxxxxxx>
> > ---
> > include/linux/kvm_host.h | 10 ++++++
> > virt/kvm/kvm_main.c | 67 +++++++++++++++++++++-------------------
> > virt/kvm/pfncache.c | 2 +-
> > 3 files changed, 47 insertions(+), 32 deletions(-)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index ef2763c2b12e..a45308c7d2d9 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1157,6 +1157,9 @@ unsigned long gfn_to_hva_memslot_prot(struct kvm_memory_slot *slot, gfn_t gfn,
> > void kvm_release_page_clean(struct page *page);
> > void kvm_release_page_dirty(struct page *page);
> >
> > +void kvm_set_page_accessed(struct page *page);
> > +void kvm_set_page_dirty(struct page *page);
> > +
> > struct kvm_follow_pfn {
> > const struct kvm_memory_slot *slot;
> > gfn_t gfn;
> > @@ -1164,10 +1167,17 @@ struct kvm_follow_pfn {
> > bool atomic;
> > /* Allow a read fault to create a writeable mapping. */
> > bool allow_write_mapping;
> > + /*
> > + * Usage of the returned pfn will be guared by a mmu notifier. Must
> ^guarded
> > + * be true if FOLL_GET is not set.
> > + */
> > + bool guarded_by_mmu_notifier;
> >
> It seems no one sets the guraded_by_mmu_notifier in this patch. Is
> guarded_by_mmu_notifier always equal to !foll->FOLL_GET and set by the
> caller of __kvm_follow_pfn()?

Yes, this is the case.

> If yes, do we have to use FOLL_GET to resolve GFN associated with a tail page?
> It seems gup can tolerate gup_flags without FOLL_GET, but it is more like a
> temporary solution. I don't think it is a good idea to play tricks with
> a temporary solution, more like we are abusing the toleration.

I'm not sure I understand what you're getting at. This series never
calls gup without FOLL_GET.

This series aims to provide kvm_follow_pfn as a unified API on top of
gup+follow_pte. Since one of the major clients of this API uses an mmu
notifier, it makes sense to support returning a pfn without taking a
reference. And we indeed need to do that for certain types of memory.

> Is a flag like guarded_by_mmu_notifier (perhaps a better name) enough to
> indicate a tail page?

What do you mean by to indicate a tail page? Do you mean to indicate
that the returned pfn refers to non-refcounted page? That's specified
by is_refcounted_page.

-David