[tip: x86/mm] x86/mm: Increase pgt_buf size for 5-level page tables

From: tip-bot2 for Lorenzo Stoakes
Date: Mon Jan 04 2021 - 13:30:21 EST


The following commit has been merged into the x86/mm branch of tip:

Commit-ID: 167dcfc08b0b1f964ea95d410aa496fd78adf475
Gitweb: https://git.kernel.org/tip/167dcfc08b0b1f964ea95d410aa496fd78adf475
Author: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
AuthorDate: Tue, 15 Dec 2020 20:56:41
Committer: Borislav Petkov <bp@xxxxxxx>
CommitterDate: Mon, 04 Jan 2021 18:07:50 +01:00

x86/mm: Increase pgt_buf size for 5-level page tables

pgt_buf is used to allocate page tables on initial direct page mapping
which bootstraps the kernel into being able to allocate these before the
direct mapping makes further pages available.

INIT_PGD_PAGE_COUNT is set to 6 pages (doubled for KASLR): 3 (PUD, PMD,
PTE) for the 1 MiB ISA mapping and 3 more for the first direct mapping
assignment, in each case providing 2 MiB of address space.

This has not been updated for 5-level page tables, which have an
additional P4D page table level above PUD.

In most instances, this will not have a material impact as the first
4 page levels allocated for the ISA mapping will provide sufficient
address space to encompass all further address mappings.

If the first direct mapping is within 512 GiB of the ISA mapping, only
a PMD and a PTE need to be added when the kernel is using 4 KiB pages
(e.g. CONFIG_DEBUG_PAGEALLOC is enabled), and only a PMD if the kernel
can use 2 MiB pages (the first allocation is limited to PMD_SIZE so a
GiB page cannot be used there).

However, if the machine has more than 512 GiB of RAM and the kernel is
mapping with 4 KiB pages, 3 further page tables (PUD, PMD and PTE) are
required.

If the machine has more than 256 TiB of RAM, 3 or 4 further page tables
are required at 2 MiB or 4 KiB page size respectively.
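
The table counts in the cases above can be reproduced with a small
sketch. It assumes 5-level paging, that the ISA mapping at address 0 has
already populated one table page at every level below the PGD, and that
an existing table page can be reused only if the new address falls inside
the span that page covers. The helper name extra_tables_needed() and the
shift table are illustrative only and do not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * log2 of the address span covered by one table page at each level below
 * the PGD with 5-level paging (512 entries per table, 4 KiB base pages).
 */
static const unsigned int table_span_shift[] = {
	48,	/* one P4D page spans 256 TiB */
	39,	/* one PUD page spans 512 GiB */
	30,	/* one PMD page spans 1 GiB */
	21,	/* one PTE page spans 2 MiB */
};

/*
 * Hypothetical helper: count the new table pages needed to map @addr,
 * assuming tables covering address 0 already exist at every level.
 * With 2 MiB pages the PMD entry is the leaf, so no PTE page is needed.
 */
static unsigned int extra_tables_needed(uint64_t addr, int use_2m_pages)
{
	unsigned int levels = use_2m_pages ? 3 : 4;
	unsigned int i, count = 0;

	/* x86 physical addresses are at most 52 bits wide. */
	assert(addr < (1ULL << 52));

	for (i = 0; i < levels; i++) {
		/* A non-zero index means @addr lies outside the existing table. */
		if (addr >> table_span_shift[i])
			count++;
	}
	return count;
}
```

Under these assumptions, an address at 2 GiB needs 2 tables with 4 KiB
pages and 1 with 2 MiB pages, an address at 600 GiB needs 3 and 2, and an
address at 300 TiB needs 4 and 3.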

Update INIT_PGD_PAGE_COUNT to reflect this.
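
As a rough illustration of the resulting sizes, the following sketch
mirrors the arithmetic of the updated macros as plain functions. The
function names and SKETCH_PAGE_SIZE are hypothetical; the kernel selects
these values at build time via CONFIG_X86_5LEVEL and
CONFIG_RANDOMIZE_MEMORY rather than at run time.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4 KiB base page size */

/* Table pages below the PGD: PUD, PMD, PTE, plus P4D with 5-level paging. */
static unsigned int init_pgd_page_tables(int five_level)
{
	return five_level ? 4 : 3;
}

/*
 * One set of tables for 0-ISA_END_ADDRESS and one for the initial
 * PMD_SIZE mapping; twice that when memory randomization is enabled.
 */
static unsigned int init_pgd_page_count(int five_level, int randomize_memory)
{
	unsigned int sets = randomize_memory ? 4 : 2;

	return sets * init_pgd_page_tables(five_level);
}

/* Size in bytes of the early page-table buffer for a given configuration. */
static size_t init_pgt_buf_size(int five_level, int randomize_memory)
{
	return (size_t)init_pgd_page_count(five_level, randomize_memory) *
	       SKETCH_PAGE_SIZE;
}

/* With 4-level paging the results match the old hard-coded 6 and 12. */
static void check_four_level_unchanged(void)
{
	assert(init_pgd_page_count(0, 0) == 6);
	assert(init_pgd_page_count(0, 1) == 12);
}
```

With 5-level paging this yields 8 pages (32 KiB) without memory
randomization and 16 pages (64 KiB) with it.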

[ bp: Sanitize text into passive voice without ambiguous personal pronouns. ]

Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Acked-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Link: https://lkml.kernel.org/r/20201215205641.34096-1-lstoakes@xxxxxxxxx
---
arch/x86/mm/init.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index e26f5c5..dd694fb 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -157,16 +157,25 @@ __ref void *alloc_low_pages(unsigned int num)
}

/*
- * By default need 3 4k for initial PMD_SIZE, 3 4k for 0-ISA_END_ADDRESS.
- * With KASLR memory randomization, depending on the machine e820 memory
- * and the PUD alignment. We may need twice more pages when KASLR memory
+ * By default need to be able to allocate page tables below PGD firstly for
+ * the 0-ISA_END_ADDRESS range and secondly for the initial PMD_SIZE mapping.
+ * With KASLR memory randomization, depending on the machine e820 memory and the
+ * PUD alignment, twice that many pages may be needed when KASLR memory
* randomization is enabled.
*/
+
+#ifndef CONFIG_X86_5LEVEL
+#define INIT_PGD_PAGE_TABLES 3
+#else
+#define INIT_PGD_PAGE_TABLES 4
+#endif
+
#ifndef CONFIG_RANDOMIZE_MEMORY
-#define INIT_PGD_PAGE_COUNT 6
+#define INIT_PGD_PAGE_COUNT (2 * INIT_PGD_PAGE_TABLES)
#else
-#define INIT_PGD_PAGE_COUNT 12
+#define INIT_PGD_PAGE_COUNT (4 * INIT_PGD_PAGE_TABLES)
#endif
+
#define INIT_PGT_BUF_SIZE (INIT_PGD_PAGE_COUNT * PAGE_SIZE)
RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
void __init early_alloc_pgt_buf(void)