Re: [PATCH v1] mm/gup: adjust stale comment for RCU GUP-fast

From: Yang Shi
Date: Thu Sep 01 2022 - 13:51:08 EST


On Thu, Sep 1, 2022 at 9:46 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 01.09.22 18:40, Peter Xu wrote:
> > On Thu, Sep 01, 2022 at 06:34:41PM +0200, David Hildenbrand wrote:
> >> On 01.09.22 18:28, Peter Xu wrote:
> >>> On Thu, Sep 01, 2022 at 09:21:19AM +0200, David Hildenbrand wrote:
> >>>> commit 4b471e8898c3 ("mm, thp: remove infrastructure for handling splitting
> >>>> PMDs") didn't remove all details about the THP split requirements for
> >>>> RCU GUP-fast.
> >>>>
> >>>> IPI broeadcasts on THP split are no longer required.
> >>>>
> >>>> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> >>>> Cc: Sasha Levin <sasha.levin@xxxxxxxxxx>
> >>>> Cc: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
> >>>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> >>>> Cc: Jerome Marchand <jmarchan@xxxxxxxxxx>
> >>>> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> >>>> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> >>>> Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
> >>>> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> >>>> Cc: Peter Xu <peterx@xxxxxxxxxx>
> >>>> Cc: Yang Shi <shy828301@xxxxxxxxx>
> >>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> >>>> ---
> >>>> mm/gup.c | 5 ++---
> >>>> 1 file changed, 2 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/mm/gup.c b/mm/gup.c
> >>>> index 5abdaf487460..cfe71f422787 100644
> >>>> --- a/mm/gup.c
> >>>> +++ b/mm/gup.c
> >>>> @@ -2309,9 +2309,8 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
> >>>> *
> >>>> * Another way to achieve this is to batch up page table containing pages
> >>>> * belonging to more than one mm_user, then rcu_sched a callback to free those
> >>>> - * pages. Disabling interrupts will allow the fast_gup walker to both block
> >>>> - * the rcu_sched callback, and an IPI that we broadcast for splitting THPs
> >>>> - * (which is a relatively rare event). The code below adopts this strategy.
> >>>> + * pages. Disabling interrupts will allow the fast_gup walker to block the
> >>>> + * rcu_sched callback.
> >>>
> >>> This is the comment for fast-gup in general but not only for thp split.
> >>
> >> "an IPI that we broadcast for splitting THP" is about splitting THP.
> >
> > Ah OK. Shall we still keep some "IPI broadcast" information here if we're
> > modifying it? Otherwise it gives a feeling that none needs the IPIs.
>
> I guess that's the end goal -- and we forgot about the PMD collapse case.
>
> Are we aware of any other case that needs an IPI? I'd rather avoid
> documenting something that's no longer true.
>
> >
> > It can be dropped later if you want to rework the thp collapse side and
> > finally remove IPI dependency on fast-gup, but so far it seems to me it's
> > still needed. Or just drop this patch until that rework happens?
>
> The doc as is is obviously stale, why drop this patch?
>
> We should see a fix for the THP collapse issue very soon I guess. Most
> probably this patch will go upstream after that fix.

I will be working on the fix.

>
> >
> >>
> >>>
> >>> I can understand that we don't need IPI for thp split, but isn't the IPIs
> >>> still needed for thp collapse (aka pmdp_collapse_flush)?
> >>
> >> That was, unfortunately, never documented -- and as discussed in the
> >> other thread, arm64 doesn't do that IPI before collapse and might need
> >> fixing. We'll most probably end up getting rid of that
> >> (undocumented/forgotten) IPI requirement and fix it in GUP-fast by
> >> re-rechecking if the PMD changed.
> >
> > Yeah from an initial thought that looks valid to me. It'll also allow
> > pmdp_collapse_flush() to be dropped too, am I right?
>
> I think the magic about pmdp_collapse_flush() is not only the IPIs, but
> that we don't perform an ordinary PMD flush but we logically flush "all
> PTEs in that range".

Yeah, because THP collapse does copy the data before clearing pte. If
we want to remove pmdp_collapse_flush() by just clearing pmd, we
should clear *AND* flush pte before copying the data IIRC.

>
> Apparently, that's a difference on some architectures.
>
>
> --
> Thanks,
>
> David / dhildenb
>