Re: [PATCH] x86/mm/pti: in pti_clone_pgtable() don't increase addr by PUD_SIZE

From: Dave Hansen
Date: Tue Aug 20 2019 - 09:57:20 EST


On 8/20/19 12:51 AM, Song Liu wrote:
> In our x86_64 kernel, pti_clone_pgtable() fails to clone 7 PMDs because
> of this issuse, including PMD for the irq entry table. For a memcache
> like workload, this introduces about 4.5x more iTLB-load and about 2.5x
> more iTLB-load-misses on a Skylake CPU.

I was surprised that this manifests as a performance issue. Usually
messing up PTI page table manipulation means you get to experience the
jobs of debugging triple faults. But, it makes sense if its this line:

/*
* Note that this will undo _some_ of the work that
* pti_set_kernel_image_nonglobal() did to clear the
* global bit.
*/
pti_clone_pgtable(start, end_clone, PTI_LEVEL_KERNEL_IMAGE);

which is restoring the Global bit.

*But*, that shouldn't get hit on a Skylake CPU since those have PCIDs
and shouldn't have a global kernel image. Could you confirm whether
PCIDs are supported on this CPU?

> pud = pud_offset(p4d, addr);
> if (pud_none(*pud)) {
> - addr += PUD_SIZE;
> + addr += PMD_SIZE;
> continue;
> }

Did we also bugger up this code:

pmd = pmd_offset(pud, addr);
if (pmd_none(*pmd)) {
addr += PMD_SIZE;
continue;
}

if we're on 32-bit and this:

#define PTI_LEVEL_KERNEL_IMAGE PTI_CLONE_PTE

and we get a hole walking to a non-PMD-aligned address?