Re: [PATCH 07/13] powerpc: Preemptible mmu_gather

From: Benjamin Herrenschmidt
Date: Mon Apr 12 2010 - 21:58:02 EST


On Fri, 2010-04-09 at 14:07 +1000, Nick Piggin wrote:
> > PPC has an extra batching queue to RCU free the actual pagetable
> > allocations, use the ARCH extensions for that for now.
> >
> > For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
> > hardware hash-table, keep using per-cpu arrays but flush on context
> > switch and use a TIF bit to track the lazy_mmu state.
>
> Hm. Pity powerpc can't just use tlb flush gathering for this batching,
> (which is what it was designed for). Then it could avoid these tricks.
> What's preventing this? Adding a tlb gather for COW case in
> copy_page_range?

We must flush before the pte_lock is released. If not, we end up with
this funny situation:

- PTE is read-only, hash contains a translation for it
- PTE gets cleared & added to the batch, hash not flushed yet
- PTE lock released, maybe even VMA fully removed
- Other CPU takes a write fault, puts in a new PTE
- Hash ends up with duplicates of the vaddr -> arch violation
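
To make the sequence above concrete, here is a rough user-space toy model
of it. This is only a sketch, not kernel code: every name in it (toy_hash,
toy_batch, toy_replay, ...) is made up for illustration, and it hashes the
new PTE immediately on the write fault, which simplifies what the real
hash fault path does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HASH_SLOTS 8
#define BATCH_MAX  8

/* Toy stand-in for the hardware hash table (hypothetical layout). */
struct toy_hash {
	unsigned long vaddr[HASH_SLOTS];
	bool valid[HASH_SLOTS];
};

/* Toy stand-in for a batch of vaddrs waiting to be unhashed. */
struct toy_batch {
	unsigned long vaddr[BATCH_MAX];
	size_t nr;
};

/* Insert a translation into the first free slot. */
static void toy_hash_insert(struct toy_hash *h, unsigned long vaddr)
{
	for (size_t i = 0; i < HASH_SLOTS; i++) {
		if (!h->valid[i]) {
			h->vaddr[i] = vaddr;
			h->valid[i] = true;
			return;
		}
	}
	assert(!"toy hash table full");
}

/* Count valid hash entries translating the given vaddr. */
static size_t toy_hash_count(const struct toy_hash *h, unsigned long vaddr)
{
	size_t n = 0;

	for (size_t i = 0; i < HASH_SLOTS; i++)
		if (h->valid[i] && h->vaddr[i] == vaddr)
			n++;
	return n;
}

/* Queue a vaddr for a later unhash. */
static void toy_batch_add(struct toy_batch *b, unsigned long vaddr)
{
	assert(b->nr < BATCH_MAX);
	b->vaddr[b->nr++] = vaddr;
}

/* Invalidate every hash entry named in the batch, then empty the batch. */
static void toy_batch_flush(struct toy_hash *h, struct toy_batch *b)
{
	for (size_t i = 0; i < b->nr; i++)
		for (size_t j = 0; j < HASH_SLOTS; j++)
			if (h->valid[j] && h->vaddr[j] == b->vaddr[i])
				h->valid[j] = false;
	b->nr = 0;
}

/*
 * Replay the sequence and return how many hash entries exist for the
 * vaddr once the other CPU has put in its new PTE.
 */
static size_t toy_replay(bool flush_before_unlock)
{
	struct toy_hash hash = {0};
	struct toy_batch batch = {0};
	const unsigned long vaddr = 0x1000;

	toy_hash_insert(&hash, vaddr);	/* read-only PTE, already hashed */
	toy_batch_add(&batch, vaddr);	/* PTE cleared, unhash only queued */
	if (flush_before_unlock)
		toy_batch_flush(&hash, &batch);
	/* the PTE lock would be released at this point */
	toy_hash_insert(&hash, vaddr);	/* other CPU's write fault, new PTE */
	return toy_hash_count(&hash, vaddr);
}
```

In this model toy_replay(false) leaves two entries for the same vaddr,
while toy_replay(true) leaves one, which is the reason for flushing
before the lock is dropped.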

Now we could get out of that one, I suppose, if we had some kind of way
to force flush any batch pertaining to a given mm before a new valid PTE
can be written, but that doesn't sound like such a trivial thing to do.
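
For what it's worth, the "force flush per mm" idea might look something
like the toy model below. Again this is only a user-space sketch under
made-up names (toy_state, toy_flush_mm, toy_set_pte are all hypothetical,
as is tagging batch entries with an mm id), and it ignores locking
entirely. Note that the flush has to walk every CPU's batch, which is
part of why this isn't trivial in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS    2
#define TOY_BATCH_MAX  4
#define TOY_HASH_SLOTS 8

/* One translation, tagged with the mm it belongs to (hypothetical). */
struct toy_entry {
	int mm_id;
	unsigned long vaddr;
	bool valid;
};

/* Per-CPU queue of translations waiting to be unhashed. */
struct toy_cpu_batch {
	struct toy_entry pending[TOY_BATCH_MAX];
	size_t nr;
};

struct toy_state {
	struct toy_entry hash[TOY_HASH_SLOTS];
	struct toy_cpu_batch batch[TOY_NR_CPUS];
};

static void toy_hash_add(struct toy_state *s, int mm_id, unsigned long vaddr)
{
	for (size_t i = 0; i < TOY_HASH_SLOTS; i++) {
		if (!s->hash[i].valid) {
			s->hash[i] = (struct toy_entry){ mm_id, vaddr, true };
			return;
		}
	}
	assert(!"toy hash table full");
}

static void toy_hash_del(struct toy_state *s, int mm_id, unsigned long vaddr)
{
	for (size_t i = 0; i < TOY_HASH_SLOTS; i++)
		if (s->hash[i].valid && s->hash[i].mm_id == mm_id &&
		    s->hash[i].vaddr == vaddr)
			s->hash[i].valid = false;
}

static size_t toy_hash_count(const struct toy_state *s, int mm_id,
			     unsigned long vaddr)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_HASH_SLOTS; i++)
		if (s->hash[i].valid && s->hash[i].mm_id == mm_id &&
		    s->hash[i].vaddr == vaddr)
			n++;
	return n;
}

/* PTE cleared on 'cpu': queue the unhash instead of doing it now. */
static void toy_queue_unhash(struct toy_state *s, int cpu, int mm_id,
			     unsigned long vaddr)
{
	struct toy_cpu_batch *b = &s->batch[cpu];

	assert(b->nr < TOY_BATCH_MAX);
	b->pending[b->nr++] = (struct toy_entry){ mm_id, vaddr, true };
}

/* Flush pending unhashes for one mm on every CPU; keep the others queued. */
static void toy_flush_mm(struct toy_state *s, int mm_id)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		struct toy_cpu_batch *b = &s->batch[cpu];
		size_t kept = 0;

		for (size_t i = 0; i < b->nr; i++) {
			if (b->pending[i].mm_id == mm_id)
				toy_hash_del(s, mm_id, b->pending[i].vaddr);
			else
				b->pending[kept++] = b->pending[i];
		}
		b->nr = kept;
	}
}

/* Writing a new valid PTE first forces out any stale batch for that mm. */
static void toy_set_pte(struct toy_state *s, int mm_id, unsigned long vaddr)
{
	toy_flush_mm(s, mm_id);
	toy_hash_add(s, mm_id, vaddr);
}

/* CPU 0 batches unhashes for mm 1 and mm 2, then mm 1 gets a new PTE. */
static void toy_demo(struct toy_state *s)
{
	toy_hash_add(s, 1, 0x1000);
	toy_hash_add(s, 2, 0x1000);
	toy_queue_unhash(s, 0, 1, 0x1000);
	toy_queue_unhash(s, 0, 2, 0x1000);
	toy_set_pte(s, 1, 0x1000);
}
```

After toy_demo() on a zeroed toy_state, mm 1 has a single hash entry for
the vaddr and the unrelated mm 2 unhash is still sitting in CPU 0's batch.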

Any better ideas?

Cheers,
Ben.
