Re: [RFC 00/10] x86 TLB flush cleanups, moving toward PCID support

From: Andy Lutomirski
Date: Tue May 09 2017 - 08:43:35 EST


On Mon, May 8, 2017 at 9:36 AM, Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>
>> On May 7, 2017, at 5:38 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>
>> As I've been working on polishing my PCID code, a major problem I've
>> encountered is that there are too many x86 TLB flushing code paths and
>> that they have too many inconsequential differences. The result was
>> that earlier versions of the PCID code were a colossal mess and very
>> difficult to understand.
>>
>> This series goes a long way toward cleaning up the mess. With all the
>> patches applied, there is a single function that contains the meat of
>> the code to flush the TLB on a given CPU, and all the TLB flushing
>> APIs call it for both local and remote CPUs.
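>>
>> As a rough sketch of the resulting shape (the names and details here
>> are illustrative, not necessarily the exact ones in the patches):
>>
>> 	/* The one place that actually flushes the TLB on this CPU. */
>> 	static void flush_tlb_func_common(const struct flush_tlb_info *f,
>> 					  bool local)
>> 	{
>> 		/* flush f->start..f->end, or everything, as appropriate */
>> 	}
>>
>> 	/* The initiating CPU calls it directly... */
>> 	static void flush_tlb_func_local(const struct flush_tlb_info *f)
>> 	{
>> 		flush_tlb_func_common(f, true);
>> 	}
>>
>> 	/* ...and remote CPUs reach it from the flush IPI handler. */
>> 	static void flush_tlb_func_remote(void *info)
>> 	{
>> 		flush_tlb_func_common(info, false);
>> 	}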
>>
>> This series should only adversely affect the kernel in a couple of
>> minor ways:
>>
>> - It makes smp_mb() unconditional when flushing TLBs. We used to
>> use the TLB flush itself to mostly avoid smp_mb() on the initiating
>> CPU. (See the ordering sketch after this list.)
>>
>> - On UP kernels, we lose the dubious optimization of inlining nerfed
>> variants of all the TLB flush APIs. This bloats the kernel a tiny
>> bit, although it should increase performance, since the SMP
>> versions were better.
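>>
>> As a minimal sketch of the ordering in question (my shorthand; the
>> helper name at the end is hypothetical):
>>
>> 	/* initiating CPU */
>> 	set_pte(ptep, new_pte);	/* change the mapping */
>> 	smp_mb();		/* order the PTE store before the
>> 				 * mm_cpumask() read; the local flush
>> 				 * used to imply this barrier */
>> 	cpus = mm_cpumask(mm);	/* decide which CPUs to IPI */
>> 	flush_local_and_send_ipis(cpus);	/* hypothetical helper */
>>
>> This pairs with switch_mm() setting the CPU's bit in mm_cpumask()
>> before it starts using the new page tables.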
>>
>> Patch 10 in here is a little bit off topic. It's a cleanup that's
>> also needed before PCID can go in, but it's not directly about
>> TLB flushing.
>>
>> Thoughts?
>
> In general I like the changes. I needed to hack the Linux TLB
> shootdown code for a research project, mostly because I could not
> work with the code otherwise. I ended up making some of the same
> changes that you have made.
>
> I just have two general comments:
>
> - You may want to consider merging the kernel mapping invalidations
> with the userspace mapping invalidations as well, since there is
> still some redundant code.
>

Hmm. The code for kernel mappings is quite short, and I'm not sure
how well it would fit in if I tried to merge it.
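
For reference, the whole kernel-range path is roughly this shape (from
memory, so the field names are approximate):

	static void do_kernel_range_flush(void *info)
	{
		struct flush_tlb_info *f = info;
		unsigned long addr;

		/* flush the range one page at a time */
		for (addr = f->start; addr < f->end; addr += PAGE_SIZE)
			__flush_tlb_one(addr);
	}

	void flush_tlb_kernel_range(unsigned long start, unsigned long end)
	{
		/* past a threshold, a full flush beats per-page invlpg */
		if (end == TLB_FLUSH_ALL ||
		    (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) {
			on_each_cpu(do_flush_tlb_all, NULL, 1);
		} else {
			struct flush_tlb_info info = {
				.start = start,
				.end = end,
			};
			on_each_cpu(do_kernel_range_flush, &info, 1);
		}
	}

There's not much there beyond the size heuristic and the on_each_cpu()
plumbing.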

> - Don't expect too much from concurrent TLB invalidations. In my
> experience, the IPI latency often dominates the overhead.
>

Fair enough.