Re: [PATCH] mm/userfaultfd: selftests: Fix memory corruption with thp enabled

From: Peter Xu
Date: Mon Sep 27 2021 - 16:37:22 EST

On Mon, Sep 27, 2021 at 10:49:39AM -0700, Axel Rasmussen wrote:
> One small comment:
> I'd prefer to keep the "uffd_test_ops->release_pages(area_src);"
> above, to ensure the src region is empty. It's not immediately obvious
> to me that we overwrite *all* of the bytes in src when we initialize
> it. (I'd have to go look at the definition of area_count and read the
> loop carefully.) It may not be technically needed, but it makes the
> guarantee that we're starting with a clean slate, free from any
> changes from previous test cases, very clear + explicit.

I think there're only two fields used, area_mutex and area_count. I'm not sure
what's the initial idea from Andrea when the test case is merged, but IMHO it
can be written as a struct too instead of using the long macros; struct could
make it easier to undertand.

And note again that we have your uffd_test_ctx_clear() called which contains
munmap() of all the buffers before the release_pages() calls. It means at
least for anon and shmem the pages won't be there 100% sure to me. hugetlbfs
is the only one that may still keep the pages as the fs should hold another
refcount on the inode, however as all the two fields got re-written anyway, so
I think it'll be still very safe to drop the two release_pages().

> Moving the release_pages(area_dst) down as you've done seems correct to me.
> Either way you can take:
> Reviewed-by: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
> >
> > It's just that after the weekend when I look back I still don't see a 100%
> > clean way to fix it yet. Mapping 4K PROT_NONE before/after each allocation is
> > the most ideal but still looks tricky to me.
> >
> > Would you have time on looking for a better solution, so as to (see it a way
> > to) complete what commit 8ba6e8640844 whats to do afterwards?
> Sure, it seems related to the other THP investigations we talked about
> in the other thread, so I'm happy to look into it.
> Just to set expectations, progress may be slightly slow as I'm
> balancing other work my employer wants done, and some upcoming time
> off. But, I think with your patch the test is at least stable (not
> flaky) enough that there is no *urgent* need for this, so it should be
> fine.

Thanks a lot on both reviewing the patch and willing to look into it. As long
as we don't get any report for khugepaged (if it happens, I'll provide the
PROT_NONE hack instead - that'll work 100% I believe but less clean; but for
now IMHO we don't need to bother) then we don't need to rush on that.

Peter Xu