Re: [PATCH v5 7/7] mm: Remove the now-unnecessary mmget_still_valid() hack
From: Hugh Dickins
Date: Mon Aug 31 2020 - 16:36:36 EST
On Mon, 31 Aug 2020, Jann Horn wrote:
> On Mon, Aug 31, 2020 at 8:07 AM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > On Thu, 27 Aug 2020, Jann Horn wrote:
> >
> > > The preceding patches have ensured that core dumping properly takes the
> > > mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all
> > > its users.
> >
> > Hi Jann, while the only tears to be shed over losing mmget_still_valid()
> > will be tears of joy, I think you need to explain why you believe it's
> > safe to remove the instance in mm/khugepaged.c: which you'll have found
> > I moved just recently, to cover an extra case (sorry for not Cc'ing you).
> >
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -431,7 +431,7 @@ static void insert_to_mm_slots_hash(struct mm_struct *mm,
> > >
> > > static inline int khugepaged_test_exit(struct mm_struct *mm)
> > > {
> > > - return atomic_read(&mm->mm_users) == 0 || !mmget_still_valid(mm);
> > > + return atomic_read(&mm->mm_users) == 0;
> > > }
> > >
> > > static bool hugepage_vma_check(struct vm_area_struct *vma,
> >
> > The movement (which you have correctly followed) was in
> > bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()")
> > but the "pmd .. physical page 0" issue is explained better in its parent
> > 18e77600f7a1 ("khugepaged: retract_page_tables() remember to test exit")
> >
> > I think your core dumping is still reading the page tables without
> > holding mmap_lock
>
> Where? get_dump_page() takes mmap_lock now:
> <https://lore.kernel.org/lkml/20200827114932.3572699-7-jannh@xxxxxxxxxx/>
Right, sorry for the noise, that's precisely what 6/7 is all about,
and properly declares itself there in its Subject - I plead that I
got distracted by the vma snapshot part of the series, and paid too
little attention before bleating.
Looks good to me - thanks.
>
> I don't think there should be any paths into __get_user_pages() left
> that don't hold the mmap_lock. Actually, we should probably try
> sticking mmap_assert_locked() in there now as a follow-up?
Maybe: I haven't given it thought, to be honest.
Hugh