Re: [PATCH v5 6/7] x86/tlb: optimizing flush_tlb_mm

From: Luming Yu
Date: Tue May 15 2012 - 09:08:03 EST

On Tue, May 15, 2012 at 8:58 PM, Luming Yu <luming.yu@xxxxxxxxx> wrote:
> On Tue, May 15, 2012 at 5:17 PM, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>> On 15 May 2012 19:15, Nick Piggin <npiggin@xxxxxxxxx> wrote:
>>> So this should go to linux-arch...
>>> On 15 May 2012 18:55, Alex Shi <alex.shi@xxxxxxxxx> wrote:
>>>> Not every flush_tlb_mm execution really needs to evacuate all
>>>> TLB entries; in munmap, a few 'invlpg' instructions are better for
>>>> overall process performance, since they leave most TLB entries in
>>>> place for later accesses.
>> Did you have microbenchmarks for this like your mprotect numbers,
>> by the way? Test munmap numbers and see how that looks. Also,
> Might be off topic, but I just spent a few minutes testing the difference
> between a CR3 write vs. invlpg on a pretty old but still reliable P4 desktop
> with the simple hardware latency and bandwidth test tool I posted for
> RFC several weeks ago on LKML.
> Both __native_flush_tlb() and __native_flush_tlb_single(...)
> introduced roughly 1 ns of latency to TSC sampling executed in

sorry, typo: 1 us. But I should capture nanosecond-resolution data. :-(

> stop_machine context on two logical CPUs.
> Just to fuel the discussion. :-)
> Cheers,
> /l