Re: [PATCH v5 00/26] userfaultfd-wp: Support shmem and hugetlbfs

From: Peter Xu
Date: Thu Jul 22 2021 - 14:30:46 EST


On Thu, Jul 15, 2021 at 04:13:56PM -0400, Peter Xu wrote:
> About Swap Special PTE
> ======================

I've got some more feedback regarding this series, either within review comment
or from other threads.

Hugh shared his concern on using such type of pte level operation could make
things even worse:

https://lore.kernel.org/linux-mm/796cbb7-5a1c-1ba0-dde5-479aba8224f2@xxxxxxxxxx/

Since most context is irrelevant, only quotting the p.s. section:

p.s. Peter, unrelated to this particular bug, and should not divert from
fixing it: but looking again at those swap encodings, and particularly the
soft_dirty manipulations: they look very fragile. I think uffd_wp was wrong
to follow that bad example, and your upcoming new encoding (that I have
previously called elegant) takes it a worse step further.

Alistair shared his preference on keep using swp_entry to store these extra
information:

https://lore.kernel.org/linux-mm/5071185.SEdLSG93TQ@nvdebian/

So I'm trying to do some self introspection to see maybe I was just too bold to
try introducing that pte idea, either I'm not the "suitable one" to introduce
it as it's indeed challenging, or maybe it's as simple as we don't really need
to worry using up swap address space yet, at least for now (currently worst
case MAX_SWAPFILES=32-4-2-1=25).

I don't yet have plan to think about Hugh's idea on further dropping the usage
of per-arch bits in swap ptes, e.g. _PAGE_SWP_SOFT_DIRTY or _PAGE_SWP_UFFD_WP.
I need more thoughts there. But what I can still do is think about whether we
can still go back to swap entry ptes for this series.

Originally I was afraid of wasting a whole type of swp entry just for one
single pte, so we came up with the idea (thanks again for Andrea and Hugh on
proposing and discussions around it!). But did we just worry too much on that
while it comes from nothing?

So as time passes, there're indeed some more similar requirements coming that
has issues that look like what uffd-wp file-backed wanted to solve on pagemap,
they're:

- PM_SWAP info missing when shmem page swapped out
- PM_SOFT_DIRTY lost when shmem page swapped out

The 1st issue might be solved by other way and there're still discussed here:

https://lore.kernel.org/linux-mm/YPmX7ZyDFRCuLXrh@t490s/

I don't see a good way to solve the 2nd issue (if we would like to solve it
first, though; I don't know whether that's intended to not be fixed for some
reason), if without similar solution like what we will like to apply to
maintain the uffd-wp bit, because they're all potentially issues around
persisting pte information for file-backed memories.

These requirements at least show that even if we introduce a new swp type
(maybe let's just call it SWP_PTE_MARKER) then uffd-wp won't be the only user,
so there're already potential users of more bit out of the entry.

In summary, I'm considering whether I should switch the special swap pte idea
back to the swp entry idea (safer, according to Hugh, also arch-independent,
according to Alistair). Before working on that, any early comment would be
greatly welcomed.

Thanks.

--
Peter Xu