Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm

From: Luming Yu
Date: Tue May 15 2012 - 08:58:53 EST

On Tue, May 15, 2012 at 5:17 PM, Nick Piggin <npiggin@xxxxxxxxx> wrote:
> On 15 May 2012 19:15, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>> So this should go to linux-arch...
>> On 15 May 2012 18:55, Alex Shi <alex.shi@xxxxxxxxx> wrote:
>>> Not every flush_tlb_mm call really needs to evict all TLB entries.
>>> In munmap, for example, a few 'invlpg' instructions are better for
>>> overall process performance, since they leave most TLB entries in
>>> place for later accesses.
> Did you have microbenchmarks for this like your mprotect numbers,
> by the way? Test munmap numbers and see how that looks. Also,

This might be off topic, but I just spent a few minutes testing the
difference between a CR3 write and invlpg on a pretty old but still
reliable P4 desktop, using the simple hardware latency and bandwidth
test tool I posted as an RFC on LKML several weeks ago.

Both __native_flush_tlb() and __native_flush_tlb_single(...) added
roughly 1 ns of latency to TSC sampling executed in
stop_machine_context on two logical CPUs.
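
For anyone following along, the choice under discussion (a few
single-page invalidations versus one full flush) can be sketched as a
small user-space model. This is only an illustration and not the code
from the patch: plan_flush(), struct flush_plan, SKETCH_PAGE_SIZE and
the max_single_pages cutoff are all hypothetical names, and the cutoff
value is an assumption that would have to come from measurement.

```c
#include <stddef.h>

/* Hypothetical model of the flush decision; not kernel code. */
#define SKETCH_PAGE_SIZE 4096UL	/* assumed 4 KiB pages */

enum flush_kind {
	FLUSH_SINGLE_PAGES,	/* would issue one invlpg per page */
	FLUSH_ALL		/* would reload CR3 and drop everything */
};

struct flush_plan {
	enum flush_kind kind;
	size_t invlpg_count;	/* pages to invalidate one by one, 0 for FLUSH_ALL */
};

/*
 * Decide how to flush the virtual range [start, end).
 * max_single_pages is an assumed tunable: above it, a full flush is
 * taken to be cheaper than a long run of single-page invalidations.
 * The sketch ignores address wraparound at the top of the range.
 */
static struct flush_plan plan_flush(unsigned long start, unsigned long end,
				    size_t max_single_pages)
{
	struct flush_plan plan = { FLUSH_SINGLE_PAGES, 0 };
	unsigned long first, last;
	size_t npages;

	if (end <= start)
		return plan;	/* empty range, nothing to invalidate */

	/* Widen the range to whole pages. */
	first = start & ~(SKETCH_PAGE_SIZE - 1);
	last = (end + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	npages = (last - first) / SKETCH_PAGE_SIZE;

	if (npages <= max_single_pages) {
		plan.invlpg_count = npages;
	} else {
		plan.kind = FLUSH_ALL;
	}
	return plan;
}
```

With a cutoff of 8, plan_flush(0x1000, 0x3000, 8) plans two single-page
invalidations, while a 64-page range falls back to a full flush. If the
two primitives really cost about the same per call, as the numbers
above suggest for this P4, any gain from the single-page path would
have to come from the TLB entries that survive it.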

Just to fuel the discussion. :-)
