Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

From: Michal Hocko
Date: Thu Aug 02 2012 - 09:53:54 EST


On Thu 02-08-12 14:33:10, Mel Gorman wrote:
> On Thu, Aug 02, 2012 at 02:36:58PM +0200, Michal Hocko wrote:
> > On Thu 02-08-12 08:37:57, Mel Gorman wrote:
> > > On Thu, Aug 02, 2012 at 09:19:34AM +0200, Michal Hocko wrote:
> > [...]
> > > > On the other hand, mine is more coupled with the sharing code so it
> > > > makes the code easier to follow and also makes the sharing more
> > > > effective because racing processes see pmd populated when checking for
> > > > shareable mappings.
> > > >
> > >
> > > It could do with a small comment above huge_pmd_share() explaining that
> > > calling pmd_alloc() under the i_mmap_mutex is necessary to prevent two
> > > parallel faults missing a sharing opportunity with each other but it's
> > > not mandatory.
> >
> > Sure, that's a good idea. What about the following:
> >
> > diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> > index 40b2500..51839d1 100644
> > --- a/arch/x86/mm/hugetlbpage.c
> > +++ b/arch/x86/mm/hugetlbpage.c
> > @@ -56,7 +56,13 @@ static int vma_shareable(struct vm_area_struct *vma, unsigned long addr)
> > }
> >
> > /*
> > - * search for a shareable pmd page for hugetlb.
> > + * search for a shareable pmd page for hugetlb. In any case calls
> > + * pmd_alloc and returns the corresponding pte. While this is not necessary
> > + * for the !shared pmd case because we can allocate the pmd later as
> > + * well it makes the code much cleaner. pmd allocation is essential for
> > + * the shared case though because pud has to be populated inside the
> > + * same i_mmap_mutex section otherwise racing tasks could either miss
> > + * the sharing (see huge_pte_offset) or select a bad pmd for sharing.
> > */
> > static pte_t*
> > huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
> >
>
> Looks reasonable to me.

OK, added to the patch. I will send it to Andrew now.
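
For illustration only, the locking rule stated in the comment above can be
sketched in user space. This is not kernel code and not part of the patch:
mapping_lock stands in for i_mmap_mutex, published stands in for the
populated pud that other tasks can find, and every name below
(shared_table, lookup_or_share, fault_task, run_demo) is hypothetical. The
point it tries to show is that the lookup and the publication happen inside
one critical section, so a racing task either finds the published table or
publishes its own before anyone else looks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NTASKS 8

/* Hypothetical stand-in for a shareable pmd page. */
struct shared_table {
	int id;
};

/* Plays the role of i_mmap_mutex in this sketch. */
static pthread_mutex_t mapping_lock = PTHREAD_MUTEX_INITIALIZER;
/* What racing tasks can see, analogous to an already populated pud. */
static struct shared_table *published;
/* Number of tables allocated so far; only touched under mapping_lock. */
static int allocations;

/*
 * Look for an existing table and, if there is none, allocate and publish
 * one before dropping the lock. Publishing after the unlock would open a
 * window in which a second task sees nothing to share and allocates its
 * own table.
 */
static struct shared_table *lookup_or_share(void)
{
	struct shared_table *t;

	pthread_mutex_lock(&mapping_lock);
	t = published;
	if (!t) {
		t = malloc(sizeof(*t));
		if (t) {
			t->id = ++allocations;
			published = t;	/* still inside the critical section */
		}
	}
	pthread_mutex_unlock(&mapping_lock);
	return t;
}

/* Each thread models one task faulting on the same shareable range. */
static void *fault_task(void *arg)
{
	(void)arg;
	return lookup_or_share();
}

/* Returns how many tables were allocated; 1 means every task shared. */
int run_demo(void)
{
	pthread_t tasks[NTASKS];
	void *seen[NTASKS];
	int i;

	allocations = 0;
	for (i = 0; i < NTASKS; i++) {
		int rc = pthread_create(&tasks[i], NULL, fault_task, NULL);
		assert(rc == 0);
	}
	for (i = 0; i < NTASKS; i++) {
		pthread_join(tasks[i], &seen[i]);
		assert(seen[i] == seen[0]);	/* all tasks got the same table */
	}
	free(published);
	published = NULL;
	return allocations;
}
```

Built with -pthread and called from a main(), run_demo() should report a
single allocation when malloc succeeds, however the threads interleave.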

Thanks a lot!
--
Michal Hocko
SUSE Labs