Re: [PATCH v3 04/27] mm/userfaultfd: Introduce special pte for unmapped file-backed mem

From: Alistair Popple
Date: Tue Jun 08 2021 - 09:18:48 EST


On Saturday, 5 June 2021 2:01:59 AM AEST Peter Xu wrote:
> On Fri, Jun 04, 2021 at 04:16:30PM +1000, Alistair Popple wrote:
> > > My understanding is that it does *not* use an additional arch-dependent
> > > bit, but puts the _PAGE_UFFD_WP bit (already set aside by any architecture
> > > implementing UFFD WP) to an additional use. That's why I called this
> > > design (from Andrea) more elegant than mine (swap type business).
> >
> > Oh my bad, I had somehow missed this was reusing an *existing* arch-dependent
> > swap bit (_PAGE_SWP_UFFD_WP, although the same argument could apply) even
> > though it's in the commit message. Obviously I should have read that more
> > carefully, apologies for the noise but thanks for the clarification.
>
> Right, as Hugh mentioned, what this series wants to use is one explicit pte
> value that no one else should ever be using, so ideally that is the most
> economical option from an address-space point of view.
>
> Meanwhile I think that pte does not actually need to be related to
> _PAGE_UFFD_WP at all; as long as it's a specific pte value it will serve the
> same goal (even if I were to use a new swp type, I'd probably only use one pte
> for it and leave the rest for other use; but who knows who will start to use
> the rest!).
>
> I kept using it because that's suggested by Andrea (it actually has
> type==off==0 as Hugh suggested too - so it keeps a suggestion of both!) and
> it's a good idea to use it since (1) it's never used by anyone before, and (2)
> it is _somehow_ related to uffd-wp itself already by having that specific bit
> set in the special pte, while that's also the only bit set for the u64 field.
>
> It looks very nice when debugging too, because when I dump the ptes it reads
> 0x4 on x86, so the pte value is even easy to read as a number. :)
>
> However I can see that it is less easy to follow than the swap type solution.
> In all cases it's still something worth thinking about before using up the swap
> types - it's not so rich there, and we keep shrinking MAX_SWAPFILES.. so let's
> see whether uffd-wp could be the 1st one to open a new field for unused
> "invalid/swap pte" address space.

Agreed, that matches what I was thinking as well. If we do end up with more
swap types like this one, which don't need to store much information in the
swap pte itself, we could define a special swap type (e.g. this bit) for
them.
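
For reference, the property being relied on here (one fixed, otherwise unused
non-present pte value) can be modelled in a few lines of userspace C. This is
only a sketch to make the discussion concrete: the DEMO_* names are made up
for this mail and are not the identifiers used in the series, and the bit
position assumes x86-64, inferred from the 0x4 value Peter mentions above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model only: the DEMO_* names are made up for this mail.
 * Assumption: x86-64, where the swap-pte uffd-wp bit is bit 2, which is
 * why the marker reads 0x4 when dumped.
 */
#define DEMO_PAGE_PRESENT		(UINT64_C(1) << 0)
#define DEMO_PAGE_SWP_UFFD_WP		(UINT64_C(1) << 2)

/* Non-present pte, swap type == 0, swap offset == 0, only the uffd-wp bit set. */
#define DEMO_UFFD_WP_SPECIAL_PTE	DEMO_PAGE_SWP_UFFD_WP

static bool demo_pte_none(uint64_t pte)
{
	return pte == 0;
}

static bool demo_pte_present(uint64_t pte)
{
	return (pte & DEMO_PAGE_PRESENT) != 0;
}

/* Exact match on the whole value: the marker is one pte, not a class of ptes. */
static bool demo_pte_is_uffd_wp_special(uint64_t pte)
{
	return pte == DEMO_UFFD_WP_SPECIAL_PTE;
}

/* Sanity checks on the properties the marker relies on. */
static void demo_check_marker(void)
{
	assert(!demo_pte_none(DEMO_UFFD_WP_SPECIAL_PTE));
	assert(!demo_pte_present(DEMO_UFFD_WP_SPECIAL_PTE));
	assert(demo_pte_is_uffd_wp_special(DEMO_UFFD_WP_SPECIAL_PTE));
}
```

The point of the exact-match test is that the marker consumes a single pte
value rather than a whole swap type, so MAX_SWAPFILES is left alone.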

> Meanwhile, I did have a look at ARM on supporting uffd-wp in general, starting
> from anonymous pages. I doubt whether it can be done for old arms (uffd-wp not
> even supported on 32bit x86 after all), but for ARM64 I see it has:
>
> For normal ptes:
>
> /*
>  * Level 3 descriptor (PTE).
>  */
> #define PTE_VALID (_AT(pteval_t, 1) << 0)
> #define PTE_TYPE_MASK (_AT(pteval_t, 3) << 0)
> #define PTE_TYPE_PAGE (_AT(pteval_t, 3) << 0)
> #define PTE_TABLE_BIT (_AT(pteval_t, 1) << 1)
> #define PTE_USER (_AT(pteval_t, 1) << 6) /* AP[1] */
> #define PTE_RDONLY (_AT(pteval_t, 1) << 7) /* AP[2] */
> #define PTE_SHARED (_AT(pteval_t, 3) << 8) /* SH[1:0], inner shareable */
> #define PTE_AF (_AT(pteval_t, 1) << 10) /* Access Flag */
> #define PTE_NG (_AT(pteval_t, 1) << 11) /* nG */
> #define PTE_GP (_AT(pteval_t, 1) << 50) /* BTI guarded */
> #define PTE_DBM (_AT(pteval_t, 1) << 51) /* Dirty Bit Management */
> #define PTE_CONT (_AT(pteval_t, 1) << 52) /* Contiguous range */
> #define PTE_PXN (_AT(pteval_t, 1) << 53) /* Privileged XN */
> #define PTE_UXN (_AT(pteval_t, 1) << 54) /* User XN */
>
> For swap ptes:
>
> /*
>  * Encode and decode a swap entry:
>  *	bits 0-1:	present (must be zero)
>  *	bits 2-7:	swap type
>  *	bits 8-57:	swap offset
>  *	bit  58:	PTE_PROT_NONE (must be zero)
>  */
>
> So I feel like we still have chance there at least for 64bit ARM? As both
> normal/swap ptes have some bits free (bits 2-5,9 for normal ptes; bits 59-63
> for swap ptes). But as I know little on ARM64, I hope I looked at the right
> things..

I don't claim to be an expert there either. Given there's already a bit
defined for x86 anyway (which is what I missed), I now think the special
swap idea is OK, although I still need to look at the rest of the series.
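
As a quick cross-check of the arm64 swap-entry layout quoted above, the
following userspace sketch adds up the documented fields and reports which
bits are left over. It works purely from the comment as quoted (bits 0-1,
2-7, 8-57 and 58); the demo_* names are hypothetical, and it says nothing
about whether the hardware or other software uses actually leave those bits
available.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits lo..hi inclusive (0 <= lo <= hi <= 63). */
static uint64_t demo_bit_range(unsigned int lo, unsigned int hi)
{
	assert(lo <= hi && hi <= 63);

	unsigned int width = hi - lo + 1;
	/* Avoid shifting by 64, which is undefined for a 64-bit type. */
	uint64_t mask = (width == 64) ? ~UINT64_C(0)
				      : ((UINT64_C(1) << width) - 1);

	return mask << lo;
}

/* Bits claimed by the arm64 swap-entry layout in the quoted comment. */
static uint64_t demo_arm64_swp_used_mask(void)
{
	uint64_t present   = demo_bit_range(0, 1);	/* must be zero */
	uint64_t type      = demo_bit_range(2, 7);	/* swap type */
	uint64_t offset    = demo_bit_range(8, 57);	/* swap offset */
	uint64_t prot_none = demo_bit_range(58, 58);	/* PTE_PROT_NONE */

	/* The documented fields must not overlap. */
	assert((present & type) == 0);
	assert(((present | type) & offset) == 0);
	assert(((present | type | offset) & prot_none) == 0);

	return present | type | offset | prot_none;
}

/* Bits not claimed by any documented swap-entry field. */
static uint64_t demo_arm64_swp_free_mask(void)
{
	return ~demo_arm64_swp_used_mask();
}
```

With the layout as quoted, demo_arm64_swp_free_mask() comes out as bits
59-63, which agrees with Peter's reading of the swap pte side.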

> Thanks,
>
> --
> Peter Xu
>