Re: [PATCH v8 01/23] mm: Introduce PTE_MARKER swap entry

From: Peter Xu
Date: Wed Apr 13 2022 - 09:44:34 EST


On Wed, Apr 13, 2022 at 10:30:33AM +1000, Alistair Popple wrote:
> Peter Xu <peterx@xxxxxxxxxx> writes:
>
> > On Tue, Apr 12, 2022 at 11:07:56AM +1000, Alistair Popple wrote:
> >> Hi Peter,
> >
> > Hi, Alistair,
> >
> >>
> >> I noticed this while reviewing the next patch in the series. I think you need to
> >> add CONFIG_PTE_MARKER to the below as well:
> >>
> >> #if defined(CONFIG_MEMORY_FAILURE) || defined(CONFIG_MIGRATION) || \
> >> defined(CONFIG_DEVICE_PRIVATE)
> >> static inline int non_swap_entry(swp_entry_t entry)
> >> {
> >> return swp_type(entry) >= MAX_SWAPFILES;
> >> }
> >> #else
> >> static inline int non_swap_entry(swp_entry_t entry)
> >> {
> >> return 0;
> >> }
> >> #endif
> >>
> >> Otherwise marker entries will be treated as swap entries, which is wrong for
> >> example in swapin_walk_pmd_entry() as marker entries are no longer considered
> >> pte_none().
> >
> > Thanks for the comment, that makes sense.
> >
> > Instead of adding PTE_MARKER into this equation, I'm going backward and
> > wondering purely on why we need to bother with non_swap_entry() at all if
> > MAX_SWAPFILES is already defined with proper knowledges of all these bits.
>
> I was going to suggest it was to help the compiler optimise the non-swap entry
> code away. But I just tested and it makes no difference in .text section size
> either way so I think your suggestion is good unless that isn't true for other
> architecture/compiler combinations (I only tried gcc-10.2.1 and x86_64).
>
> That's a possibility because the optimisation isn't obvious to me at least.
>
> non_swap_entry() is equivalent to:
>
> (entry.val >> SWP_TYPE_SHIFT) >= MAX_SWAPFILES;
> (entry.val >> (BITS_PER_XA_VALUE - MAX_SWAPFILES_SHIFT)) >= (1<<5);
> (entry.val >> (BITS_PER_LONG - 1 - 5)) >= (1<<5);
> (entry.val >> 58) >= (1<<5);
>
> Where entry.val is a long. So from that alone it's not obvious this could be
> optimised away, because nothing there implies entry.val != (1<<63) which would
> make the conditional true. But there's a lot of inlining going on in the
> creation of swap entries which I didn't trace, so something must end up implying
> entry.val < (1<<63).

I think my point was that we check non_swap_entry() with a pre-assumption
that it's a swap entry, then it means it's the slow path already after
we've parsed the pte entry and be aware it's not present.

So I'm doubting how much the optimization (even if at last, applicable)
could help in reality, not to mention that it'll only have an effect when
all of the configs are not set, so it's a micro optimization on slow path
in a rare config setup.

For any sane modern hosts I'd expect CONFIG_MIGRATION should at least be
set.. then it invalidates any potential optimization we're discussing here.

Let me post a patch for this and move the discussion there. Thanks a lot
for looking into it.

--
Peter Xu