Re: [PATCH v3 1/1] KVM: s390: pv: fix race when making a page secure

From: Claudio Imbrenda
Date: Tue Mar 04 2025 - 04:22:13 EST


On Fri, 28 Feb 2025 22:15:04 +0100
David Hildenbrand <david@xxxxxxxxxx> wrote:

> On 27.02.25 14:09, Claudio Imbrenda wrote:
> > Holding the pte lock for the page that is being converted to secure is
> > needed to avoid races. A previous commit removed the locking, which
> > caused issues. Fix by locking the pte again.
> >
> > Fixes: 5cbe24350b7d ("KVM: s390: move pv gmap functions into kvm")
> > Reported-by: David Hildenbrand <david@xxxxxxxxxx>
> > Signed-off-by: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
>
> Tested with shmem / memory-backend-memfd that ends up using large folios
> / THPs.
>
> Tested-by: David Hildenbrand <david@xxxxxxxxxx>
> Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
>
> Two comments below.

I will need to send a v4; unfortunately, there are other issues with
this patch (as you have probably noticed by now as well).

>
> [...]
>
> > +
> > +int make_hva_secure(struct mm_struct *mm, unsigned long hva, struct uv_cb_header *uvcb)
> > +{
> > +	struct folio *folio;
> > +	spinlock_t *ptelock;
> > +	pte_t *ptep;
> > +	int rc;
> > +
> > +	ptep = get_locked_valid_pte(mm, hva, &ptelock);
> > +	if (!ptep)
> > +		return -ENXIO;
> > +
> > +	folio = page_folio(pte_page(*ptep));
> > +	folio_get(folio);
>
> Grabbing a folio reference is only required if you want to keep using
> the folio after the pte_unmap_unlock. While the PTL is locked it cannot
> vanish.
>
> So consider grabbing a reference only right before dropping the PTL,
> and only if you intend to call kvm_s390_wiggle_split_folio(). Then, you
> would effectively not require these two atomics on the expected hot path.
>
> (I recall that the old code did that)

This code will hopefully go away in the next merge window anyway
(unless I get sick *again*).
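
For illustration only, the suggestion quoted above can be modeled in
userspace roughly as follows. This is a sketch, not kernel code and not
the planned v4: struct folio, the ptl_held flag, the two stubs and
make_secure_model() are all hypothetical stand-ins, and the -EAGAIN
returned by the split stub is an assumption made for the model. It only
shows the ordering: take the reference while the lock is still held, and
only on the path that keeps using the folio after the unlock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in, NOT the kernel's struct folio. */
struct folio {
	int refcount;	/* models the folio reference count */
	int ref_ops;	/* number of get/put calls (atomics in the kernel) */
	bool large;	/* models folio_test_large() */
	bool busy;	/* makes the stubbed conversion fail with -EBUSY */
};

static bool ptl_held;	/* models the pte lock; a plain flag in this sketch */

static void folio_get(struct folio *f) { f->refcount++; f->ref_ops++; }
static void folio_put(struct folio *f) { f->refcount--; f->ref_ops++; }

/* Stub for the conversion; must run with the (modeled) lock held. */
static int make_folio_secure_stub(struct folio *f)
{
	assert(ptl_held);
	return f->busy ? -EBUSY : 0;
}

/* Stub for the split; runs unlocked, so it needs the extra reference. */
static int split_folio_stub(struct folio *f)
{
	assert(!ptl_held);
	assert(f->refcount >= 2);
	f->large = false;
	return -EAGAIN;	/* assumption: tell the caller to retry */
}

static int make_secure_model(struct folio *folio)
{
	int rc;

	ptl_held = true;	/* while held, the folio cannot vanish */
	if (folio->large)
		rc = -E2BIG;
	else
		rc = make_folio_secure_stub(folio);

	if (rc != -E2BIG && rc != -EBUSY) {
		/* Expected hot path: no reference taken, no atomics. */
		ptl_held = false;
		return rc;
	}

	/* Slow path: grab the reference before dropping the lock. */
	folio_get(folio);
	ptl_held = false;
	rc = split_folio_stub(folio);
	folio_put(folio);
	return rc;
}
```

In this model a small, non-busy folio goes through without touching the
reference count at all; only the -E2BIG / -EBUSY path pays for the
get/put pair.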

>
> > +	/*
> > +	 * Secure pages cannot be huge and userspace should not combine both.
> > +	 * In case userspace does it anyway this will result in an -EFAULT for
> > +	 * the unpack. The guest is thus never reaching secure mode.
> > +	 * If userspace plays dirty tricks and decides to map huge pages at a
> > +	 * later point in time, it will receive a segmentation fault or
> > +	 * KVM_RUN will return -EFAULT.
> > +	 */
> > +	if (folio_test_hugetlb(folio))
> > +		rc = -EFAULT;
> > +	else if (folio_test_large(folio))
> > +		rc = -E2BIG;
> > +	else if (!pte_write(*ptep))
> > +		rc = -ENXIO;
> > +	else
> > +		rc = make_folio_secure(mm, folio, uvcb);
> > +	pte_unmap_unlock(ptep, ptelock);
> > +
> > +	if (rc == -E2BIG || rc == -EBUSY)
> > +		rc = kvm_s390_wiggle_split_folio(mm, folio, rc == -E2BIG);
> > +	folio_put(folio);
> > +
> > +	return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(make_hva_secure);
> >
> > /*
> > * To be called with the folio locked or with an extra reference! This will
> > diff --git a/arch/s390/kvm/gmap.c b/arch/s390/kvm/gmap.c
> > index 02adf151d4de..c08950b4301c 100644
>
>
> There is one remaining reference to __gmap_make_secure, which you remove:
>
> $ git grep __gmap_make_secure
> arch/s390/kvm/gmap.c: * Return: 0 on success, < 0 in case of error (see __gmap_make_secure()).

Will fix.

>
>
>