Re: [PATCH 09/10] x86/mm: enable AMD translation cache extensions

From: Peter Zijlstra
Date: Tue Dec 24 2024 - 13:25:42 EST


On Sun, Dec 22, 2024 at 10:37:01AM -0500, Rik van Riel wrote:
> On Sun, 2024-12-22 at 12:38 +0100, Peter Zijlstra wrote:
> > On Sat, Dec 21, 2024 at 11:06:41PM -0500, Rik van Riel wrote:
> > > With AMD TCE (translation cache extensions) only the intermediate
> > > mappings
> >
> > Only the leaf mappings, as written this all don't make sense,
>
> Check out page 513 of the AMD manual:
>
> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf
>
> "Translation Cache Extension (TCE) Bit. Bit 15, read/write. 
>
> Setting this bit to 1 changes how the INVLPG, INVLPGB, and INVPCID
> instructions operate on TLB entries. When this bit is 0, these
> instructions remove the target PTE from the TLB as well as all 
> upper-level table entries that are cached in the TLB, whether or 
> not they are associated with the target PTE. When this bit is set,
> these instructions will remove the target PTE and only those 
> upper-level entries that lead to the target PTE in the page table
> hierarchy, leaving unrelated upper-level entries intact. This may
> provide a performance benefit.
>
> Page table management software must be written in a way that takes 
> this behavior into account. Software that was written for a 
> processor that does not cache upper-level table entries may result 
> in stale entries being incorrectly used for translations when TCE 
> is enabled. Software that is compatible with TCE mode will operate
> in either mode.
>
> For software using INVLPGB to broadcast TLB invalidations, the
> invalidations are controlled by the EFER.TCE value on the processor
> executing the INVLPGB instruction.
>
> Before setting TCE, system software should verify that this feature
> is supported by examining the feature flag CPUID Fn8000_0001_ECX[TCE].
> See Section 3.3 “Processor Feature Identification,” on
> page 71 for information on using the CPUID instruction"

So that makes a ton more sense.

>
> This suggests that:
> 1) TCE does control the "don't make sense" behavior :)

Well, you wrote:

> > With AMD TCE (translation cache extensions) only the intermediate mappings
> > that cover the address range zapped by INVLPG / INVLPGB get invalidated,
> > rather than all intermediate mappings getting zapped at every TLB invalidation.

And I read that like it would zap only the intermediate mappings rather
than the intermediate mappings.

Reading it a wee bit more carefully, I see it's not quite as bad, but
still not very clear.

> 2) Wait, does EFER.TCE need to be set on every CPU
> in the system? Could a system run with TCE set
> on some CPUs, and cleared on another?!

I would imagine it can; I don't think they would recommend anybody do
this though.