Re: [PATCH v3 00/60] arm64: Add support for LPA2 at stage1 and WXN

From: Ryan Roberts
Date: Tue Mar 07 2023 - 11:28:30 EST


Hi Ard,

Just to say that I plan to work my way through this lot over the next couple of
weeks. I hope you can tolerate comments dribbling in as I go?

I'll also try integrating this with my latest revision for the KVM side of
things and re-run all my tests. I'll report back in due course.

Thanks,
Ryan


On 07/03/2023 14:04, Ard Biesheuvel wrote:
> This is a followup to [0], which was a lot smaller. Thanks to Ryan for
> feedback and review. This series is independent from Ryan's work on
> adding support for LPA2 to KVM - the only potential source of conflict
> should be the patch "arm64: kvm: Limit HYP VA and host S2 range to 48
> bits when LPA2 is in effect", which could simply be dropped in favour of
> the KVM changes to make it support LPA2.
>
> The first ~15 patches of this series rework how the kernel VA space is
> organized, so that the vmemmap region does not take up more space than
> necessary, and so that most of it can be reclaimed when running a build
> capable of 52-bit virtual addressing on hardware that is not. This is
> needed because the vmemmap region will take up a substantial part of the
> upper VA region that it shares with the kernel, modules and
> vmalloc/vmap mappings once we enable LPA2 with 4k pages.
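
As a rough illustration of why the vmemmap region grows so much with 52-bit
addressing, here is a back-of-the-envelope sketch in C. It is not taken from
the series: vmemmap_bytes() is a hypothetical helper, and the 64-byte struct
page and the linear map sizes mentioned below are assumptions chosen only to
make the arithmetic concrete.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: estimate how many bytes of
 * vmemmap are needed to hold one struct page for every page frame
 * covered by a linear map of 2^linear_map_bits bytes.
 *
 * struct_page_size is an assumption supplied by the caller (64 bytes
 * is a common value, but it depends on the kernel configuration).
 */
static uint64_t vmemmap_bytes(unsigned int linear_map_bits,
			      unsigned int page_shift,
			      uint64_t struct_page_size)
{
	/* Number of page frames the linear map can describe. */
	uint64_t nr_pages = UINT64_C(1) << (linear_map_bits - page_shift);

	/* One struct page per page frame. */
	return nr_pages * struct_page_size;
}
```

Under those assumptions, vmemmap_bytes(51, 12, 64) comes to 2^45 bytes
(32 TiB), versus 2^41 bytes (2 TiB) for vmemmap_bytes(47, 12, 64), so the
region is 16 times larger for four extra bits of linear map.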
>
> The next ~30 patches rework the early init code, reimplementing most of
> the page table and relocation handling in C code. There are several
> reasons why this is beneficial:
> - we generally prefer C code over asm for these things, and the macros
> that currently exist in head.S for creating the kernel page tables
> are a good example why;
> - we no longer need to create the kernel mapping in two passes, which
> means we can remove the logic that copies parts of the fixmap and the
> KAsan shadow from one set of page tables to the other; this is
> especially advantageous for KAsan with LPA2, which needs more
> elaborate shadow handling across multiple levels, since the KAsan
> region cannot be placed on exact pgd_t boundaries in that case;
> - we can read the ID registers and parse command line overrides before
> creating the page tables, which simplifies the LPA2 case, as flicking
> the global TCR_EL1.DS bit at a later stage would require elaborate
> repainting of all page table descriptors, some of it with the MMU
> disabled;
> - we can use more elaborate logic to create the mappings, which means we
> can use more precise mappings for code and data sections even when
> using 2 MiB granularity, and this is a prerequisite for running with
> WXN.
>
> As part of the ID map changes, we decouple the ID map size from the
> kernel VA size, and switch to a 48-bit VA map for all configurations.
>
> The next 18 patches rework the existing LVA support as a CPU feature,
> which simplifies some code and gets rid of the vabits_actual variable.
> Then, LPA2 support is implemented in the same vein. This requires adding
> support for 5 level paging as well, given that LPA2 introduces a new
> paging level '-1' when using 4k pages.
>
> Combined with the vmemmap changes at the start of the series, the
> resulting LPA2/4k pages configuration will have the exact same VA space
> layout as the ordinary 4k/4 levels configuration, and so LPA2 support
> can reasonably be enabled by default, as the fallback is seamless on
> non-LPA2 hardware.
>
> In the 16k/LPA2 case, the fallback also reduces the number of paging
> levels, resulting in a 47-bit VA space. This is based on the assumption
> that hybrid LPA2/non-LPA2 16k pages kernels in production use would
> prefer not to take the performance hit of 4 level paging to gain only a
> single additional bit of VA space. (Note that generic Android kernels
> use only 3 levels of paging today.) Bespoke 16k configurations can still
> configure 48-bit virtual addressing as before.
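
The level counts quoted above (5 levels for 52-bit VAs with 4k pages, 4 levels
for 52-bit or 48-bit VAs with 16k pages, 3 levels for 47-bit VAs with 16k
pages) follow from simple arithmetic, sketched below in C. This is only an
illustration, not code from the series: pgtable_levels() is a hypothetical
helper, and it assumes 8-byte descriptors, so that each full table page
resolves (page_shift - 3) bits of the virtual address.

```c
/*
 * Hypothetical helper, not kernel code: number of translation levels
 * needed to resolve va_bits of virtual address for a given page size.
 *
 * Assumes 8-byte descriptors, so a table occupying one page holds
 * 2^(page_shift - 3) entries and resolves (page_shift - 3) bits.
 * The low page_shift bits are the offset within the page; the
 * remaining bits are split across levels, rounding up because the
 * top-level table may be only partially populated.
 */
static unsigned int pgtable_levels(unsigned int va_bits,
				   unsigned int page_shift)
{
	unsigned int bits_per_level = page_shift - 3;
	unsigned int bits_to_resolve = va_bits - page_shift;

	return (bits_to_resolve + bits_per_level - 1) / bits_per_level;
}
```

With 4k pages (page_shift 12), 48 bits need 4 levels and 52 bits need 5, which
is where the extra level '-1' comes from. With 16k pages (page_shift 14), 47
bits fit exactly in 3 levels, while both 48 and 52 bits need 4, so dropping
from 52 to 47 bits on non-LPA2 hardware saves a level where dropping to 48
would not.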
>
> Finally, the last two patches enable support for running with the WXN
> control enabled. This was previously part of a separate series [1], but
> given that the delta is tiny, it is included here as well.
>
> [0] https://lore.kernel.org/all/20221124123932.2648991-1-ardb@xxxxxxxxxx/
> [1] https://lore.kernel.org/all/20221111171201.2088501-1-ardb@xxxxxxxxxx/
>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>
> Anshuman Khandual (2):
> arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field
> arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
>
> Ard Biesheuvel (57):
>
> // KASLR / vmemmap reorg
> arm64: kernel: Disable latent_entropy GCC plugin in early C runtime
> arm64: mm: Take potential load offset into account when KASLR is off
> arm64: mm: get rid of kimage_vaddr global variable
> arm64: mm: Move PCI I/O emulation region above the vmemmap region
> arm64: mm: Move fixmap region above vmemmap region
> arm64: ptdump: Allow VMALLOC_END to be defined at boot
> arm64: ptdump: Discover start of vmemmap region at runtime
> arm64: vmemmap: Avoid base2 order of struct page size to dimension
> region
> arm64: mm: Reclaim unused vmemmap region for vmalloc use
> arm64: kaslr: Adjust randomization range dynamically
> arm64: kaslr: drop special case for ThunderX in kaslr_requires_kpti()
> arm64: kvm: honour 'nokaslr' command line option for the HYP VA space
>
> // Reimplement page table creation code in C
> arm64: kernel: Manage absolute relocations in code built under pi/
> arm64: kernel: Don't rely on objcopy to make code under pi/ __init
> arm64: head: move relocation handling to C code
> arm64: idreg-override: Omit non-NULL checks for override pointer
> arm64: idreg-override: Prepare for place relative reloc patching
> arm64: idreg-override: Avoid parameq() and parameqn()
> arm64: idreg-override: avoid strlen() to check for empty strings
> arm64: idreg-override: Avoid sprintf() for simple string concatenation
> arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit
> arm64: idreg-override: Move to early mini C runtime
> arm64: kernel: Remove early fdt remap code
> arm64: head: Clear BSS and the kernel page tables in one go
> arm64: Move feature overrides into the BSS section
> arm64: head: Run feature override detection before mapping the kernel
> arm64: head: move dynamic shadow call stack patching into early C
> runtime
> arm64: kaslr: Use feature override instead of parsing the cmdline
> again
> arm64: idreg-override: Create a pseudo feature for rodata=off
> arm64: Add helpers to probe local CPU for PAC/BTI/E0PD support
> arm64: head: allocate more pages for the kernel mapping
> arm64: head: move memstart_offset_seed handling to C code
> arm64: head: Move early kernel mapping routines into C code
> arm64: mm: Use 48-bit virtual addressing for the permanent ID map
> arm64: pgtable: Decouple PGDIR size macros from PGD/PUD/PMD levels
> arm64: kernel: Create initial ID map from C code
> arm64: mm: avoid fixmap for early swapper_pg_dir updates
> arm64: mm: omit redundant remap of kernel image
> arm64: Revert "mm: provide idmap pointer to cpu_replace_ttbr1()"
>
> // Implement LPA2 support
> arm64: mm: Handle LVA support as a CPU feature
> arm64: mm: Add feature override support for LVA
> arm64: mm: Wire up TCR.DS bit to PTE shareability fields
> arm64: mm: Add LPA2 support to phys<->pte conversion routines
> arm64: mm: Add definitions to support 5 levels of paging
> arm64: mm: add LPA2 and 5 level paging support to G-to-nG conversion
> arm64: Enable LPA2 at boot if supported by the system
> arm64: mm: Add 5 level paging support to fixmap and swapper handling
> arm64: kasan: Reduce minimum shadow alignment and enable 5 level
> paging
> arm64: mm: Add support for folding PUDs at runtime
> arm64: ptdump: Disregard unaddressable VA space
> arm64: ptdump: Deal with translation levels folded at runtime
> arm64: kvm: avoid CONFIG_PGTABLE_LEVELS for runtime levels
> arm64: kvm: Limit HYP VA and host S2 range to 48 bits when LPA2 is in
> effect
> arm64: Enable 52-bit virtual addressing for 4k and 16k granule configs
> arm64: defconfig: Enable LPA2 support
>
> // Allow WXN control to be enabled at boot
> mm: add arch hook to validate mmap() prot flags
> arm64: mm: add support for WXN memory translation attribute
>
> Marc Zyngier (1):
> arm64: Turn kaslr_feature_override into a generic SW feature override
>
> arch/arm64/Kconfig | 34 +-
> arch/arm64/configs/defconfig | 2 +-
> arch/arm64/include/asm/assembler.h | 55 +--
> arch/arm64/include/asm/cpufeature.h | 102 +++++
> arch/arm64/include/asm/fixmap.h | 1 +
> arch/arm64/include/asm/kasan.h | 2 -
> arch/arm64/include/asm/kernel-pgtable.h | 104 ++---
> arch/arm64/include/asm/memory.h | 50 +--
> arch/arm64/include/asm/mman.h | 36 ++
> arch/arm64/include/asm/mmu.h | 26 +-
> arch/arm64/include/asm/mmu_context.h | 49 ++-
> arch/arm64/include/asm/pgalloc.h | 53 ++-
> arch/arm64/include/asm/pgtable-hwdef.h | 33 +-
> arch/arm64/include/asm/pgtable-prot.h | 18 +-
> arch/arm64/include/asm/pgtable-types.h | 6 +
> arch/arm64/include/asm/pgtable.h | 229 +++++++++-
> arch/arm64/include/asm/scs.h | 34 +-
> arch/arm64/include/asm/setup.h | 3 -
> arch/arm64/include/asm/sysreg.h | 2 +
> arch/arm64/include/asm/tlb.h | 3 +-
> arch/arm64/kernel/Makefile | 7 +-
> arch/arm64/kernel/cpu_errata.c | 2 +-
> arch/arm64/kernel/cpufeature.c | 90 ++--
> arch/arm64/kernel/head.S | 465 ++------------------
> arch/arm64/kernel/idreg-override.c | 322 --------------
> arch/arm64/kernel/image-vars.h | 32 ++
> arch/arm64/kernel/kaslr.c | 4 +-
> arch/arm64/kernel/module.c | 2 +-
> arch/arm64/kernel/pi/Makefile | 28 +-
> arch/arm64/kernel/pi/idreg-override.c | 396 +++++++++++++++++
> arch/arm64/kernel/pi/kaslr_early.c | 78 +---
> arch/arm64/kernel/pi/map_kernel.c | 284 ++++++++++++
> arch/arm64/kernel/pi/map_range.c | 104 +++++
> arch/arm64/kernel/{ => pi}/patch-scs.c | 36 +-
> arch/arm64/kernel/pi/pi.h | 30 ++
> arch/arm64/kernel/pi/relacheck.c | 130 ++++++
> arch/arm64/kernel/pi/relocate.c | 64 +++
> arch/arm64/kernel/setup.c | 22 -
> arch/arm64/kernel/sleep.S | 3 -
> arch/arm64/kernel/suspend.c | 2 +-
> arch/arm64/kernel/vmlinux.lds.S | 14 +-
> arch/arm64/kvm/hyp/nvhe/mem_protect.c | 2 +
> arch/arm64/kvm/mmu.c | 22 +-
> arch/arm64/kvm/va_layout.c | 10 +-
> arch/arm64/mm/init.c | 2 +-
> arch/arm64/mm/kasan_init.c | 154 +++++--
> arch/arm64/mm/mmap.c | 4 +
> arch/arm64/mm/mmu.c | 268 ++++++-----
> arch/arm64/mm/pgd.c | 17 +-
> arch/arm64/mm/proc.S | 106 ++++-
> arch/arm64/mm/ptdump.c | 43 +-
> arch/arm64/tools/cpucaps | 1 +
> include/linux/mman.h | 15 +
> mm/mmap.c | 3 +
> 54 files changed, 2259 insertions(+), 1345 deletions(-)
> delete mode 100644 arch/arm64/kernel/idreg-override.c
> create mode 100644 arch/arm64/kernel/pi/idreg-override.c
> create mode 100644 arch/arm64/kernel/pi/map_kernel.c
> create mode 100644 arch/arm64/kernel/pi/map_range.c
> rename arch/arm64/kernel/{ => pi}/patch-scs.c (89%)
> create mode 100644 arch/arm64/kernel/pi/pi.h
> create mode 100644 arch/arm64/kernel/pi/relacheck.c
> create mode 100644 arch/arm64/kernel/pi/relocate.c
>