Re: [PATCH 3/5] mm: Rework return value for copy_one_pte()

From: Peter Xu
Date: Tue Sep 22 2020 - 11:29:39 EST


On Tue, Sep 22, 2020 at 12:11:29AM -0700, John Hubbard wrote:
> On 9/21/20 2:17 PM, Peter Xu wrote:
> > There's one special path for copy_one_pte() with swap entries, in which
> > add_swap_count_continuation(GFP_ATOMIC) might fail. In that case we'll return
>
> I might be looking at the wrong place, but the existing code seems to call
> add_swap_count_continuation(GFP_KERNEL), not with GFP_ATOMIC?

Ah, I wanted to reference the one in swap_duplicate().

>
> > the swp_entry_t so that the caller will release the locks and redo the same
> > thing with GFP_KERNEL.
> >
> > It's confusing when copy_one_pte() must return a swp_entry_t (even if all the
> > ptes are non-swap entries). More importantly, we face other requirement to
> > extend this "we need to do something else, but without the locks held" case.
> >
> > Rework the return value into something easier to understand, as defined in enum
> > copy_mm_ret. We'll pass the swp_entry_t back using the newly introduced union
>
> I like the documentation here, but it doesn't match what you did in the patch.
> Actually, the documentation had the right idea (enum, rather than #define, for
> COPY_MM_* items). Below...

Yeah actually my very initial version has it as an enum, then I changed it to
macros because I started to want it return negative as errors. However funnily
in the current version copy_one_pte() won't return an error anymore... So
probably, yes, it should be a good idea to get the enum back.

Also we should be able to drop the negative handling too with copy_ret, though
it should be in the next patch.

>
> > copy_mm_data parameter.
> >
> > Another trivial change is to move the reset of the "progress" counter into the
> > retry path, so that we'll reset it for other reasons too.
> >
> > This should prepare us with adding new return codes, very soon.
> >
> > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> > ---
> > mm/memory.c | 42 +++++++++++++++++++++++++++++-------------
> > 1 file changed, 29 insertions(+), 13 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 7525147908c4..1530bb1070f4 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -689,16 +689,24 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
> > }
> > #endif
> > +#define COPY_MM_DONE 0
> > +#define COPY_MM_SWAP_CONT 1
>
> Those should be enums, so as to get a little type safety and other goodness from
> using non-macro items.
>
> ...
> > @@ -866,13 +877,18 @@ static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
> > pte_unmap_unlock(orig_dst_pte, dst_ptl);
> > cond_resched();
> > - if (entry.val) {
> > - if (add_swap_count_continuation(entry, GFP_KERNEL) < 0)
> > + switch (copy_ret) {
> > + case COPY_MM_SWAP_CONT:
> > + if (add_swap_count_continuation(data.entry, GFP_KERNEL) < 0)
> > return -ENOMEM;
> > - progress = 0;
>
> Yes. Definitely a little cleaner to reset this above, instead of here.
>
> > + break;
> > + default:
> > + break;
>
> I assume this no-op noise is to placate the compiler and/or static checkers. :)

This is (so far) for COPY_MM_DONE. I normally will cover all cases in a
"switch()" and here "default" is for it. Even if I covered all the
possibilities, I may still tend to keep one "default" and a WARN_ON_ONCE(1) to
make sure nothing I've missed. Not sure whether that's the ideal way, though.

Thanks,

--
Peter Xu