Re: [PATCH 16/24] x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
From: Dave Hansen
Date: Tue Nov 28 2017 - 02:52:16 EST
On 11/27/2017 09:22 PM, Andy Lutomirski wrote:
> On Mon, Nov 27, 2017 at 2:49 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>> From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>>
>> Short summary: Use x86 PCID feature to avoid flushing the TLB at all
>> interrupts and syscalls. Speed them up. Makes context switches
>> and TLB flushing slower.
>
> I suspect that, if we actually did it right (by doing a deferred
> usermode flush when we switch CR3), it wouldn't make context switches
> or TLB flushing slower.
Totally agree.
>> + /*
>> + * On systems with PCIDs, but no INVPCID, the only
>> + * way to flush a PCID is a CR3 write. Note that
>> + * we use the kernel page tables with the *user*
>> + * ASID here.
>> + */
>> + unsigned long user_asid_flush_cr3;
>> + user_asid_flush_cr3 = build_cr3(pgd, user_asid(kern_asid));
>> + write_cr3(user_asid_flush_cr3);
>
> This is wrong. If we could atomically switch CR3 and switch it back
> without any speculative fills in the mean time, we might be okay, but
> that's not what's happening. We could be filling the TLB with
> usermode-tagged kernel entries, which is a big no-no.
This hunk should disappear. It was an attempt to support PCIDs without
INVPCID. But, as you point out, that needs even more infrastructure to
support perfectly. It's yet another reason not to support that
configuration for now.
Just to make sure I understand your concern (because I'm sure we'll get
back to this soon): this code is functionally correct: it will ensure
that TLB entries in the user ASID _are_ flushed.
But, I think what you're saying is that any TLB fills between this and
the reload of the kernel ASID might fill parts of the kernel page tables
into the TLB (with the user ASID), and _those_ could be used to weaken
KASLR. I agree, that's theoretically problematic.
Am I missing anything?