Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

From: Christoph Lameter
Date: Mon Mar 03 2008 - 14:28:38 EST


On Mon, 3 Mar 2008, Nick Piggin wrote:

> Your skeleton is just registering notifiers and saying
>
> /* you fill the hard part in */
>
> If somebody needs a skeleton in order just to register the notifiers,
> then almost by definition they are unqualified to write the hard
> part ;)

Its also providing a locking scheme.

> OK, there are ways to solve it or hack around it. But this is exactly
> why I think the implementations should be kept seperate. Andrea's
> notifiers are coherent, work on all types of mappings, and will
> hopefully match closely the regular TLB invalidation sequence in the
> Linux VM (at the moment it is quite close, but I hope to make it a
> bit closer) so that it requires almost no changes to the mm.

Then put it into the arch code for TLB invalidation. Paravirt ops gives
good examples on how to do that.

> What about a completely different approach... XPmem runs over NUMAlink,
> right? Why not provide some non-sleeping way to basically IPI remote
> nodes over the NUMAlink where they can process the invalidation? If you
> intra-node cache coherency has to run over this link anyway, then
> presumably it is capable.

There is another Linux instance at the remote end that first has to
remove its own ptes. Also would not work for Inifiniband and other
solutions. All the approaches that require evictions in an atomic context
are limiting the approach and do not allow the generic functionality that
we want in order to not add alternate APIs for this.

> Or another idea, why don't you LD_PRELOAD in the MPT library to also
> intercept munmap, mprotect, mremap etc as well as just fork()? That
> would give you similarly "good enough" coherency as the mmu notifier
> patches except that you can't swap (which Robin said was not a big
> problem).

The good enough solution right now is to pin pages by elevating
refcounts.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/