Re: [PATCH 02/17] mm: mmu_gather rework

From: Peter Zijlstra
Date: Wed Mar 16 2011 - 14:57:44 EST


On Thu, 2011-03-10 at 15:50 +0000, Mel Gorman wrote:

> > +static inline void
> > +tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, unsigned int full_mm_flush)
> > {
>
> checkpatch will bitch about line length.

I did a s/full_mm_flush/fullmm/, which puts the line length at 81, at
which point I'll ignore it ;-)

> > - struct mmu_gather *tlb = &get_cpu_var(mmu_gathers);
> > -
> > tlb->mm = mm;
> >
> > - /* Use fast mode if only one CPU is online */
> > - tlb->nr = num_online_cpus() > 1 ? 0U : ~0U;
> > + tlb->max = ARRAY_SIZE(tlb->local);
> > + tlb->pages = tlb->local;
> > +
> > + if (num_online_cpus() > 1) {
> > + tlb->nr = 0;
> > + __tlb_alloc_page(tlb);
> > + } else /* Use fast mode if only one CPU is online */
> > + tlb->nr = ~0U;
> >
> > tlb->fullmm = full_mm_flush;
> >
> > - return tlb;
> > +#ifdef HAVE_ARCH_MMU_GATHER
> > + tlb->arch = ARCH_MMU_GATHER_INIT;
> > +#endif
> > }
> >
> > static inline void
> > -tlb_flush_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
> > +tlb_flush_mmu(struct mmu_gather *tlb)
>
> Removing start/end here is a harmless, but unrelated cleanup. Is it
> worth keeping start/end on the rough off-chance the information is ever
> used to limit what portion of the TLB is flushed?

I've got another patch that adds full range tracking to
asm-generic/tlb.h; it uses tlb_remove_tlb_entry()/p.._free_tlb() to
track the range of the things actually removed.
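For the curious, the idea can be sketched in plain userspace C (the
names tlb_range_init/tlb_range_track are invented for illustration;
the real asm-generic/tlb.h code differs):

```c
#include <assert.h>
#include <limits.h>

#define PAGE_SIZE 4096UL

/* Minimal stand-in for an mmu_gather that tracks the flushed range. */
struct mmu_gather {
	unsigned long start;	/* lowest address of a removed entry  */
	unsigned long end;	/* one past the highest removed entry */
};

static void tlb_range_init(struct mmu_gather *tlb)
{
	tlb->start = ULONG_MAX;	/* empty range */
	tlb->end = 0;
}

/* Called for each entry actually removed; grows the tracked range. */
static void tlb_range_track(struct mmu_gather *tlb, unsigned long addr)
{
	if (addr < tlb->start)
		tlb->start = addr;
	if (addr + PAGE_SIZE > tlb->end)
		tlb->end = addr + PAGE_SIZE;
}
```

At flush time an architecture can then invalidate only [start, end)
instead of nuking the whole TLB.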

> > {
> > if (!tlb->need_flush)
> > return;
> > @@ -75,6 +95,8 @@ tlb_flush_mmu(struct mmu_gather *tlb, un
> > if (!tlb_fast_mode(tlb)) {
> > free_pages_and_swap_cache(tlb->pages, tlb->nr);
> > tlb->nr = 0;
> > + if (tlb->pages == tlb->local)
> > + __tlb_alloc_page(tlb);
> > }
>
> That needs a comment. Something like
>
> /*
> * If we are using the local on-stack array of pages for MMU gather,
> * try allocation again as we have recently freed pages
> */

Fair enough, done.
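The resulting fallback-and-retry dance looks roughly like this in
plain userspace C (struct and function names are made up for
illustration, not the kernel's; freeing and TLB flushing are reduced
to a comment):

```c
#include <assert.h>
#include <stdlib.h>

#define LOCAL_NR 8	/* on-stack batch size (illustrative) */
#define BATCH_NR 512	/* heap batch size (illustrative) */

struct gather {
	void **pages;		/* current batch array */
	unsigned int nr, max;
	void *local[LOCAL_NR];	/* fallback when allocation fails */
};

static void gather_alloc(struct gather *g)
{
	void **p = malloc(BATCH_NR * sizeof(void *));
	if (p) {		/* on failure, keep whatever we have */
		g->pages = p;
		g->max = BATCH_NR;
	}
}

static void gather_init(struct gather *g)
{
	g->pages = g->local;	/* start on the stack, always works */
	g->max = LOCAL_NR;
	g->nr = 0;
	gather_alloc(g);	/* upgrade to a heap batch if possible */
}

static void gather_flush(struct gather *g)
{
	/* ... free g->pages[0..nr) and flush the TLB here ... */
	g->nr = 0;
	/*
	 * If we are still using the local on-stack array, try the
	 * allocation again: we have just freed pages, so memory may
	 * now be available.
	 */
	if (g->pages == g->local)
		gather_alloc(g);
}
```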

> > }
> >

> > @@ -98,16 +121,24 @@ tlb_finish_mmu(struct mmu_gather *tlb, u
> > * handling the additional races in SMP caused by other CPUs caching valid
> > * mappings in their TLBs.
> > */
> > -static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> > +static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> > {
>
> What does this return value mean?

As you surmise below, it means we need to call tlb_flush_mmu() before
calling __tlb_remove_page() again.

> Looking at the function, its obvious that 1 is returned when pages[] is full
> and needs to be freed, TLB flushed, etc. However, callers refer the return
> value as "need_flush" where as this function sets tlb->need_flush but the
> two values have different meaning: retval need_flush means the array is full
> and must be emptied where as tlb->need_flush just says there are some pages
> that need to be freed.
>
> It's a nit-pick but how about having it return the number of array slots
> that are still available like what pagevec_add does? It would allow you
> to get rid of the slighty-different need_flush variable in mm/memory.c

That might work, let me do so.
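A userspace sketch of what that pagevec_add-style interface could look
like (names invented for illustration, array shrunk to four slots):

```c
#include <assert.h>

#define MAX_SLOTS 4	/* tiny batch, just for illustration */

struct gather {
	void *pages[MAX_SLOTS];
	unsigned int nr;
};

/*
 * pagevec_add-style: queue the page and return the number of slots
 * still free. A return of 0 tells the caller the array is full and
 * it must flush before queueing anything else -- no separate
 * need_flush variable required.
 */
static unsigned int gather_add(struct gather *g, void *page)
{
	g->pages[g->nr++] = page;
	return MAX_SLOTS - g->nr;
}
```

The caller then reads naturally as `if (!gather_add(tlb, page))
flush(...);`.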

> > tlb->need_flush = 1;
> > if (tlb_fast_mode(tlb)) {
> > free_page_and_swap_cache(page);
> > - return;
> > + return 0;
> > }
> > tlb->pages[tlb->nr++] = page;
> > - if (tlb->nr >= FREE_PTE_NR)
> > - tlb_flush_mmu(tlb, 0, 0);
> > + if (tlb->nr >= tlb->max)
> > + return 1;
> > +
>
> Use == and VM_BUG_ON(tlb->nr > tlb->max) ?

Paranoia, I like ;-)

> > + return 0;
> > +}
> > +

> > @@ -974,7 +975,7 @@ static unsigned long zap_pte_range(struc
> > page_remove_rmap(page);
> > if (unlikely(page_mapcount(page) < 0))
> > print_bad_pte(vma, addr, ptent, page);
> > - tlb_remove_page(tlb, page);
> > + need_flush = __tlb_remove_page(tlb, page);
> > continue;
>
> So, if __tlb_remove_page() returns 1 (should be bool for true/false) the
> caller is expected to call tlb_flush_mmu(). We call continue and as a
> side-effect break out of the loop unlocking various bits and pieces and
> restarted.
>
> It'd be a hell of a lot clearer to just say
>
> if (__tlb_remove_page(tlb, page))
> break;
>
> and not check !need_flush on each iteration.

Uhm, right :-), /me wonders why he wrote it like that.

> > }
> > /*
> > @@ -995,12 +996,20 @@ static unsigned long zap_pte_range(struc
> > print_bad_pte(vma, addr, ptent, NULL);
> > }
> > pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
> > - } while (pte++, addr += PAGE_SIZE, (addr != end && *zap_work > 0));
> > + } while (pte++, addr += PAGE_SIZE,
> > + (addr != end && *zap_work > 0 && !need_flush));
> >
> > add_mm_rss_vec(mm, rss);
> > arch_leave_lazy_mmu_mode();
> > pte_unmap_unlock(pte - 1, ptl);
> >
> > + if (need_flush) {
> > + need_flush = 0;
> > + tlb_flush_mmu(tlb);
> > + if (addr != end)
> > + goto again;
> > + }
>
> So, I think the reasoning here is to update counters and release locks
> regularly while tearing down pagetables. If this is true, it could do with
> a comment explaining that's the intention. You can also obviate the need
> for the local need_flush here with just if (tlb->need_flush), right?

I'll add a comment. tlb->need_flush is not quite the same: it's set as
soon as there's a single page queued, whereas our local need_flush
means there's no space left. I should have spotted this confusion
before.
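A userspace sketch of the resulting batch-and-restart shape (all names
invented; locking reduced to comments; the final partial batch is left
for the equivalent of tlb_finish_mmu()):

```c
#include <assert.h>

#define PAGE_SIZE 4096UL
#define MAX_SLOTS 4	/* tiny batch, just for illustration */

struct gather {
	unsigned long pages[MAX_SLOTS];
	unsigned int nr;
	unsigned int flushes;	/* counts restarts, for illustration */
};

static void gather_flush(struct gather *g)
{
	/* ... free the queued pages and flush the TLB here ... */
	g->nr = 0;
	g->flushes++;
}

/*
 * Restart pattern: fill the batch under the "lock", then drop the
 * lock, flush, and go again. This bounds both the lock hold time and
 * the number of pages held unfreed at any moment.
 */
static void zap_range(struct gather *g, unsigned long start,
		      unsigned long end)
{
	unsigned long addr = start;
again:
	/* lock(); */
	while (addr != end) {
		g->pages[g->nr++] = addr;
		addr += PAGE_SIZE;
		if (g->nr == MAX_SLOTS)
			break;	/* batch full: flush outside the lock */
	}
	/* unlock(); */
	if (g->nr == MAX_SLOTS) {
		gather_flush(g);
		if (addr != end)
			goto again;
	}
}
```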


>
> Functionally I didn't see any problems. Comments are more about form
> than function. Whether you apply them or not
>
> Acked-by: Mel Gorman <mel@xxxxxxxxx>

Thanks!

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/