Re: [RFC PATCH] x86/mm: Flush more aggressively in lazy TLB mode

From: Markus Trippelsdorf
Date: Tue Oct 10 2017 - 04:23:05 EST


On 2017.10.09 at 09:50 -0700, Andy Lutomirski wrote:
> Since commit 94b1b03b519b, x86's lazy TLB mode has been all the way
> lazy: when running a kernel thread (including the idle thread), the
> kernel keeps using the last user mm's page tables without attempting
> to maintain user TLB coherence at all. From a pure semantic
> perspective, this is fine -- kernel threads won't attempt to access
> user pages, so having stale TLB entries doesn't matter.
>
> Unfortunately, I forgot about a subtlety. By skipping TLB flushes,
> we also allow any paging-structure caches that may exist on the CPU
> to become incoherent. This means that we can have a
> paging-structure cache entry that references a freed page table, and
> the CPU is within its rights to do a speculative page walk starting
> at the freed page table.
>
> I can imagine this causing two different problems:
>
>  - A speculative page walk starting from a bogus page table could read
>    IO addresses. I haven't seen any reports of this causing problems.
>
>  - A speculative page walk that involves a bogus page table can install
>    garbage in the TLB. Such garbage would always be at a user VA, but
>    some AMD CPUs have logic that triggers a machine check when it notices
>    these bogus entries. I've seen a couple reports of this.
>
> Reinstate TLB coherence in lazy mode. With this patch applied, we
> do it in one of two ways. If we have PCID, we simply switch back to
> init_mm's page tables when we enter a kernel thread -- this seems to
> be quite cheap except for the cost of serializing the CPU. If we
> don't have PCID, then we set a flag and switch to init_mm the first
> time we would otherwise need to flush the TLB.
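
For anyone following along, the two strategies described above can be
modelled roughly as below. This is a toy user-space sketch, not the code
from the patch: every name in it (toy_mm, toy_cpu, toy_enter_lazy_tlb,
toy_flush_request) is hypothetical, and the has_pcid field simply stands
in for whatever feature check the kernel actually performs.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not the kernel's. */
struct toy_mm { int id; };

/* Stand-in for the kernel's init_mm (kernel-only page tables). */
static struct toy_mm toy_init_mm = { 0 };

struct toy_cpu {
    bool has_pcid;            /* assumed CPU feature bit */
    struct toy_mm *loaded_mm; /* page tables currently in use */
    bool is_lazy;             /* kernel thread running on a borrowed user mm */
    int flushes;              /* number of real TLB flushes performed */
};

/* Called when a kernel thread (or the idle thread) starts running. */
static void toy_enter_lazy_tlb(struct toy_cpu *cpu)
{
    if (cpu->has_pcid) {
        /* PCID case: leave the user page tables right away. */
        cpu->loaded_mm = &toy_init_mm;
        cpu->is_lazy = false;
    } else {
        /* No PCID: keep the user mm loaded and only set a flag. */
        cpu->is_lazy = true;
    }
}

/* Called when a TLB flush for mm is requested on this CPU. */
static void toy_flush_request(struct toy_cpu *cpu, struct toy_mm *mm)
{
    if (cpu->loaded_mm != mm)
        return; /* not using these page tables, nothing to do */

    if (cpu->is_lazy) {
        /* First flush while lazy: switch to init_mm instead of flushing. */
        cpu->loaded_mm = &toy_init_mm;
        cpu->is_lazy = false;
        return;
    }

    cpu->flushes++; /* normal, non-lazy flush */
}
```

In this model a lazy CPU without PCID stops referencing the user page
tables the first time a flush for that mm arrives, and later flush
requests for the same mm become no-ops because loaded_mm no longer
matches.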

Your patch fixes the problem. (I've stressed my AMD machine in various
ways since yesterday, with no issues so far.)
Thanks.

--
Markus