Re: [patch] mm: reduce pagetable-freeing latencies
From: Benjamin Herrenschmidt
Date: Tue Jul 24 2007 - 17:29:58 EST
On Tue, 2007-07-24 at 14:13 +0200, Andi Kleen wrote:
> Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx> writes:
>
> > > What a truly putrid patch. I am suspecting that this was a quick
> > > get-you-out-of-trouble thing, which then got forgotten about.
> > >
> > > We have two months to do the "right fix". Please?
> >
> > Working on it...
>
> Ideally the patch would DTRT even on non preemptible kernels,
> aka do cond_resched()s when needed.
The first step is to rework the batch structure to make it more
manageable. That is, patch #1 will keep the page list per-cpu (and thus
non-preemptible), but the batch "head" will be on the stack.
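
A rough userspace model of that split, for illustration only. All names
and sizes here (page_list, batch_head, head_begin, MODEL_LIST_MAX, ...)
are made up for the sketch and are not taken from any actual patch. The
page list lives in a per-CPU array, while the small "head" that borrows
it sits on the caller's stack:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS  4   /* hypothetical CPU count for the model */
#define MODEL_LIST_MAX 64  /* hypothetical per-CPU list capacity */

/* Per-CPU part: the page list stays here. */
struct page_list {
	size_t nr;
	void *pages[MODEL_LIST_MAX];
};

static struct page_list percpu_lists[MODEL_NR_CPUS];

/* On-stack part: the batch "head" only borrows a per-CPU list. */
struct batch_head {
	struct page_list *list;	/* valid only while "pinned" to the CPU */
	size_t freed;		/* pages released so far (model bookkeeping) */
};

/* Models get_cpu(): pin to a CPU and borrow its list. */
static void head_begin(struct batch_head *h, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	h->list = &percpu_lists[cpu];
	h->list->nr = 0;
	h->freed = 0;
}

/* Release everything gathered so far; real code would flush the TLB first. */
static void head_flush(struct batch_head *h)
{
	h->freed += h->list->nr;
	h->list->nr = 0;
}

/* Queue one page, flushing first when the per-CPU list is full. */
static void head_add(struct batch_head *h, void *page)
{
	if (h->list->nr == MODEL_LIST_MAX)
		head_flush(h);
	h->list->pages[h->list->nr++] = page;
}

/* Models put_cpu(): final flush, then drop the borrowed list. */
static void head_end(struct batch_head *h)
{
	head_flush(h);
	h->list = NULL;
}
```

Because the head still points at a per-CPU list between head_begin()
and head_end(), this layout alone does not remove the need to stay on
one CPU for the duration.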
Now, there are two approaches regarding getting rid of the
get_cpu/put_cpu:
- One is to have a small number of entries for the page list in the
batch structure on the stack, and attempt to gfp a page for more. If
that fails, we can still free, though with less batching, using only the
few entries in the batch struct itself. That's Hugh's initial approach,
iirc.
- Another is to hook up with those folks who've been asking for a
notifier that we are being preempted/scheduled out. In this case, I can
happily access the per-cpu list, and just trigger a batch flush if we
happen to be scheduled out.
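
A minimal userspace sketch of the first approach, under stated
assumptions: malloc() stands in for a page allocation that is allowed
to fail, the alloc_ok field is a knob that exists only to simulate that
failure, and every name and size (struct batch, BATCH_INLINE,
BATCH_EXT, ...) is hypothetical. The batch starts with a few inline
slots on the stack, tries to grow once they fill up, and falls back to
flushing the inline slots if it cannot:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BATCH_INLINE 8    /* hypothetical: entries embedded in the batch */
#define BATCH_EXT    512  /* hypothetical: entries in one extra page */

struct batch {
	void **pages;		/* points at inline_pages[] or at ext */
	size_t nr, max;
	void **ext;		/* extra array, NULL unless allocation succeeded */
	int alloc_ok;		/* model knob: 0 makes every allocation fail */
	size_t flushes, freed;	/* bookkeeping for illustration only */
	void *inline_pages[BATCH_INLINE];
};

static void batch_init(struct batch *b, int alloc_ok)
{
	memset(b, 0, sizeof(*b));
	b->pages = b->inline_pages;
	b->max = BATCH_INLINE;
	b->alloc_ok = alloc_ok;
}

/* Try once to switch from the inline slots to a larger array. */
static int batch_grow(struct batch *b)
{
	if (b->ext || !b->alloc_ok)
		return 0;
	b->ext = malloc(BATCH_EXT * sizeof(void *));
	if (!b->ext)
		return 0;
	memcpy(b->ext, b->inline_pages, b->nr * sizeof(void *));
	b->pages = b->ext;
	b->max = BATCH_EXT;
	return 1;
}

/* Release the gathered pages; real code would flush the TLB first. */
static void batch_flush(struct batch *b)
{
	if (!b->nr)
		return;
	b->freed += b->nr;
	b->nr = 0;
	b->flushes++;
}

/* Queue a page; when full, grow if possible, otherwise flush. */
static void batch_add(struct batch *b, void *page)
{
	if (b->nr == b->max && !batch_grow(b))
		batch_flush(b);
	b->pages[b->nr++] = page;
}

static void batch_finish(struct batch *b)
{
	batch_flush(b);
	free(b->ext);
	b->ext = NULL;
	b->pages = b->inline_pages;
	b->max = BATCH_INLINE;
}
```

With these sizes, queueing 20 pages costs one flush when the extra
array is available and three flushes when it is not, which is the
"less batching" fallback described above.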
I tend to prefer the former solution, though: gfp should be fast, and
there is no need to force a flush if we get scheduled out. It would be
rare to hit the worst case scenario of falling back to the few page
heads in the batch itself. On the other hand, that solution has the
problem of bloating the stack a bit (with the few page pointers), even
in the cases where I plan to use the extended batch outside of zap_*,
such as fork, mprotect, ...
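
For comparison, the notifier-based alternative can be modelled roughly
as follows. This is only a userspace sketch: ctx_sched_out() is a
hypothetical stand-in for whatever callback a "being scheduled out"
notifier would invoke, a single static list models one CPU, and none of
these names correspond to an existing kernel interface:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_LIST_MAX 64  /* hypothetical per-CPU list capacity */

struct cpu_list {
	size_t nr;
	void *pages[MODEL_LIST_MAX];
};

static struct cpu_list this_cpu_list;	/* models one CPU's page list */

struct batch_ctx {
	size_t freed;		/* pages released so far */
	size_t forced_flushes;	/* flushes caused by being scheduled out */
};

/* Release the gathered pages; real code would flush the TLB first. */
static void ctx_flush(struct batch_ctx *c)
{
	c->freed += this_cpu_list.nr;
	this_cpu_list.nr = 0;
}

/* Queue a page on the per-CPU list, flushing first if it is full. */
static void ctx_add(struct batch_ctx *c, void *page)
{
	if (this_cpu_list.nr == MODEL_LIST_MAX)
		ctx_flush(c);
	this_cpu_list.pages[this_cpu_list.nr++] = page;
}

/* Hypothetical hook: empty the list before another task can use it. */
static void ctx_sched_out(struct batch_ctx *c)
{
	if (this_cpu_list.nr) {
		ctx_flush(c);
		c->forced_flushes++;
	}
}
```

Nothing extra lives on the stack here, but every schedule-out with
pending pages forces a flush, which is the cost weighed above.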
So I'll first do patch #1, which will not fix the problem but will make
the fix easier to fit in. In the meantime, please provide feedback on
which of the 2 solutions above you prefer for avoiding the get/put_cpu,
unless you find a good 3rd one.
Cheers,
Ben.