Re: [RFC PATCH 1/2] x86/ibpb: Skip IBPB when we switch back to same user process

From: Peter Zijlstra
Date: Thu Jan 25 2018 - 13:19:15 EST


On Thu, Jan 25, 2018 at 09:04:21AM -0800, Andy Lutomirski wrote:
> I haven't tried to fully decipher the patch, but I think the idea is
> wrong. (I think it's the same wrong idea that Rik and I both had and
> that I got into Linus' tree for a while...) The problem is that it's
> not actually correct to run indefinitely in kernel mode using stale
> cached page table data. The stale PTEs themselves are fine, but the
> stale intermediate translations can cause the CPU to speculatively
> load complete garbage into the TLB, and that's bad (and causes MCEs on
> AMD CPUs).

Urggh.. indeed :/

> I think we only really have two choices: tlb_defer_switch_to_init_mm()
> == true and tlb_defer_switch_to_init_mm() == false. The current
> heuristic is to not defer if we have PCID, because loading CR3 is
> reasonably fast.

I just _really_ _really_ hate idle drivers doing leave_mm(). I don't
suppose limiting the !IPI case to just the idle case would be correct
either, because between waking from idle and testing our 'should I have
invalidated' bit the CPU can (however unlikely) speculate into stale TLB
entries too..
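
For reference, the heuristic Andy describes above (don't defer the switch
to init_mm when PCID is available, because loading CR3 is then reasonably
fast) can be sketched in a few lines of C. This is a standalone
illustration under stated assumptions, not the kernel source: struct
cpu_caps and its has_pcid field are hypothetical stand-ins for the
kernel's CPU feature check, and the function takes them as a parameter
only so that the sketch compiles on its own.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the CPU feature test. The field name is an
 * assumption made for this sketch and is not a kernel API.
 */
struct cpu_caps {
	bool has_pcid;
};

/*
 * Sketch of the heuristic from the quoted text:
 *  - with PCID, loading CR3 is reasonably fast, so do not defer the
 *    switch to init_mm (return false);
 *  - without PCID, defer the switch (return true).
 */
static bool tlb_defer_switch_to_init_mm(const struct cpu_caps *caps)
{
	return !caps->has_pcid;
}
```

With has_pcid set the sketch returns false, so the switch to init_mm
happens right away; with it clear the sketch returns true and the switch
is deferred.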