Re: [PATCH] .text page fault SMP scalability optimization
From: Benjamin Herrenschmidt
Date: Sun Oct 30 2005 - 01:48:10 EST
On Wed, 2005-10-19 at 11:06 +0200, Andrea Arcangeli wrote:
> On Wed, Oct 19, 2005 at 01:14:20AM -0700, Andrew Morton wrote:
> > How strange. Do you mena that all CPUs were entering the pagefault handler
> > on behalf of the same pte all the time? That they're staying in lockstep?
>
> Yes.
Nice catch Andrea !
> > If so, there must be a bunch of page_table_lock contention too?
>
> Not really as much. Note also that with latest mainline the ppc64 kernel
> was going well even without this patch, it was some older codebase
> falling apart, primarly because it was still doing pte_establish there
> see:
>
> young_entry = pte_mkyoung(entry);
> if (!pte_same(young_entry, entry)) {
> ptep_establish(vma, address, pte, young_entry);
> update_mmu_cache(vma, address, young_entry);
> lazy_mmu_prot_update(young_entry);
> }
Yes, that's one reason why I introduced ptep_set_access_flags() to be
used when relaxing access permissions to a PTE.
> So those flush actions were a bottleneck when executed by all cpus at
> the same time. But some archs still implement it like with the old
> codebase, and the patch is quite an obvious optimization that can
> clearly avoid useless tlb flushes (and that fixed the problem completely
> with the older codebase still dong ptep_establish even on ppc64).
Note that we should really pass more than just "write_access" from the
arch code. We could make good use of "execute" in some cases as well,
also knowing wether this is a real fault or the result of
get_user_pages() (in some case, the former could use more agressive TLB
pre-loading, not the later). Finally, those infos should be passed to
update_mmu_cache().
Ben.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/