Re: [PATCH 00/11] [v5] Use global pages with PTI
From: Tom Lendacky
Date: Mon Apr 09 2018 - 14:04:40 EST
On 4/6/2018 3:55 PM, Dave Hansen wrote:
> Changes from v4:
> * Fix compile error reported by Tom Lendacky
This built with CONFIG_RANDOMIZE_BASE=y, but failed to boot successfully.
I think you're missing the initialization of __default_kernel_pte_mask in
kaslr.c.
Thanks,
Tom
> * Avoid setting _PAGE_GLOBAL on non-present entries
>
> Changes from v3:
> * Fix whitespace issue noticed by willy
> * Clarify comments about X86_FEATURE_PGE checks
> * Clarify commit message around the necessity of _PAGE_GLOBAL
>   filtering when CR4.PGE=0 or PGE is unsupported.
>
> Changes from v2:
>
> * Add performance numbers to changelogs
> * Fix compile error resulting from use of x86-specific
>   __default_kernel_pte_mask in arch-generic mm/early_ioremap.c
> * Delay kernel text cloning until after we are done messing
>   with it (patch 11).
> * Blacklist K8 explicitly from mapping all kernel text as
>   global (this should never happen because K8 does not use
>   pti when pti=auto, but we are on the safe side). (patch 11)
>
> --
>
> The later versions of the KAISER patches (pre-PTI) allowed the
> user/kernel shared areas to be GLOBAL. The thought was that this would
> reduce the TLB overhead of keeping two copies of these mappings.
>
> During the switch over to PTI, we seem to have lost our ability to have
> GLOBAL mappings. This adds them back.
>
> To measure the benefits of this, I took a modern Atom system without
> PCIDs and ran a microbenchmark[1] (higher is better):
>
> No Global Lines (baseline  ): 6077741 lseeks/sec
> 88 Global Lines (kern entry): 7528609 lseeks/sec (+23.9%)
> 94 Global Lines (all ktext ): 8433111 lseeks/sec (+38.8%)
>
> On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
> huge:
>
> No Global pages (baseline): 15783951 lseeks/sec
> 28 Global pages (this set): 16054688 lseeks/sec
>                              +270737 lseeks/sec (+1.71%)
>
> I also double-checked with a kernel compile on the Skylake system (lower
> is better):
>
> No Global pages (baseline): 186.951 seconds time elapsed  ( +- 0.35% )
> 28 Global pages (this set): 185.756 seconds time elapsed  ( +- 0.09% )
>                              -1.195 seconds (-0.64%)
>
> 1. https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c
>
> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Juergen Gross <jgross@xxxxxxxx>
> Cc: x86@xxxxxxxxxx
> Cc: Nadav Amit <namit@xxxxxxxxxx>
>