Hi,
I'm embarrassed to release this patch in such a state, but I'm doing so
because a) I won't have much time to work on it in the short term;
b) it would take a lot of work to polish, so I'd like to see what
people think before going too far; and c) I'd like something other than
boring lockless pagecache to talk about at Ottawa.
The basic idea is this: replace the heavyweight per-CPU mmu_gather
structure with a lightweight stack-based one which is missing the
big page vector. Instead of the vector, use Linux pagetables to
store the pages-to-be-freed. Pages and pagetables are first unmapped,
then tlbs are flushed, then pages and pagetables are freed.
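
As a rough illustration of that ordering, here is a userspace toy, not
code from the patch. Every toy_* name, the single-level table and the
malloc'ed "pages" are assumptions made up for the example; the real
thing walks the multi-level pagetables and does an actual tlb flush.
The point is only that the first pass clears the present bit but leaves
the page pointer in the entry, so the table itself remembers what to
free once the flush has happened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NR_PTES 8
#define TOY_PRESENT 0x1UL

/* Hypothetical stand-in for a pagetable entry. */
struct toy_pte {
	void *page;		/* backing page, kept in place after unmap */
	unsigned long flags;	/* TOY_PRESENT while the page is mapped */
};

/* Hypothetical lightweight gather: no page vector, fits on the stack. */
struct toy_gather {
	int need_flush;		/* set once any entry has been unmapped */
	size_t nr_freed;	/* pages released by the second walk */
};

/* Install a fake page at slot i. */
void toy_map(struct toy_pte *pt, size_t i)
{
	pt[i].page = malloc(64);
	pt[i].flags = pt[i].page ? TOY_PRESENT : 0;
}

/* Walk 1: unmap, but leave the page pointer in the entry. */
void toy_unmap_range(struct toy_gather *tlb, struct toy_pte *pt,
		     size_t start, size_t end)
{
	for (size_t i = start; i < end; i++) {
		if (pt[i].flags & TOY_PRESENT) {
			pt[i].flags &= ~TOY_PRESENT;
			tlb->need_flush = 1;
		}
	}
}

/* Stand-in for the tlb flush; a real one would hit the hardware. */
void toy_flush_tlb(struct toy_gather *tlb)
{
	tlb->need_flush = 0;
}

/* Walk 2: free whatever walk 1 left behind. Must follow the flush. */
void toy_free_range(struct toy_gather *tlb, struct toy_pte *pt,
		    size_t start, size_t end)
{
	assert(!tlb->need_flush);	/* freeing before flushing is a bug */

	for (size_t i = start; i < end; i++) {
		if (pt[i].page && !(pt[i].flags & TOY_PRESENT)) {
			free(pt[i].page);
			pt[i].page = NULL;
			tlb->nr_freed++;
		}
	}
}
```

A caller would declare a struct toy_gather on its stack, then call
toy_unmap_range(), toy_flush_tlb() and toy_free_range() in that order
over the same range. Nothing in the gather grows with the number of
pages, which is what would let it be preempted between steps.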
There is a downside: walking the page table can be anywhere from
slightly to a lot less efficient than walking the vector, depending
on density, and this adds a 2nd pagetable walk to unmapping (but
removes the vector walk, of course).
Upsides: mmu_gather is preemptible, the horrible mmu_gather breaking
code can be removed, and the artificial disparity between PREEMPT and
non-PREEMPT tlb flush batching disappears (PREEMPT can now have
good performance and non-PREEMPT can have good latency). tlb flush
batching is possibly much closer to perfect, though on non-PREEMPT
that may not be noticeable (for PREEMPT, it appears to be spending
5x less time in tlb flushing on kbuild).
Caveats:
- nonlinear mappings don't work yet
- hugepages don't work yet
- i386 only