Re: [PATCH 04/21] x86/mm/asi: set up asi_nonsensitive_pgd

From: Dave Hansen
Date: Wed Oct 01 2025 - 16:28:24 EST


On 9/24/25 07:59, Brendan Jackman wrote:
> Create the initial shared pagetable to hold all the mappings that will
> be shared among ASI domains.
>
> Mirror the physmap into the ASI pagetables, but with a maximum
> granularity that's guaranteed to allow changing pageblock sensitivity
> without having to allocate pagetables, and with everything as
> non-present.

Could you also talk about what this granularity _actually_ is and why it
has the property of never requiring page table allocations?

...
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index e98e85cf15f42db669696ba8195d8fc633351b26..7e0471d46767c63ceade479ae0d1bf738f14904a 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -7,6 +7,7 @@
> * Copyright (C) 2002,2003 Andi Kleen <ak@xxxxxxx>
> */
>
> +#include <linux/asi.h>
> #include <linux/signal.h>
> #include <linux/sched.h>
> #include <linux/kernel.h>
> @@ -746,7 +747,8 @@ phys_pgd_init(pgd_t *pgd_page, unsigned long paddr_start, unsigned long paddr_en
> {
> unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next, paddr_last;
>
> - *pgd_changed = false;
> + if (pgd_changed)
> + *pgd_changed = false;

This 'pgd_changed' hunk isn't mentioned in the changelog.

...
> @@ -797,6 +800,24 @@ __kernel_physical_mapping_init(unsigned long paddr_start,
>
> paddr_last = phys_pgd_init(init_mm.pgd, paddr_start, paddr_end, page_size_mask,
> prot, init, &pgd_changed);
> +
> + /*
> + * Set up ASI's unrestricted physmap. This needs to mapped at minimum 2M
> + * size so that regions can be mapped and unmapped at pageblock
> + * granularity without requiring allocations.
> + */

This took me a minute to wrap my head around.

Here, I think you're trying to convey that:

 1. There's a higher-level design decision that all sensitivity will be
    done at a 2MB granularity. A 2MB physical region is either sensitive
    or not.
 2. Because of #1, 1GB mappings are not cool because splitting a 1GB
    mapping into 2MB needs to allocate a page table page.
 3. 4k mappings are OK because they can also have their permissions
    changed at a 2MB granularity. It's just more laborious.

The "minimum 2M size" comment really threw me off because that, to me,
also includes 1G which is a no-no here.
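
To make the three points above concrete, here is a rough userspace model
(not kernel code) of what it costs to flip the present bit on one 2MB
region, depending on the size of the mapping that currently covers it.
All of the names below (map_level, toggle_cost, cost_of_toggle,
pageblock_covers_2m) are made up for illustration, and the shifts assume
x86-64 with 4k base pages:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed x86-64 values with 4k base pages. */
#define MODEL_PAGE_SHIFT     12   /* 4k */
#define MODEL_PMD_SHIFT      21   /* 2M */
#define MODEL_PTRS_PER_TABLE 512  /* entries per page table page */

/* Hypothetical: size of the leaf mapping covering the 2M region. */
enum map_level { MAP_4K, MAP_2M, MAP_1G };

struct toggle_cost {
	unsigned int tables_allocated; /* new page table pages needed */
	unsigned int entries_written;  /* page table entries rewritten */
};

/* Model the cost of changing one 2M-aligned, 2M-sized region. */
static struct toggle_cost cost_of_toggle(enum map_level level)
{
	struct toggle_cost cost = { 0, 0 };

	assert(level == MAP_4K || level == MAP_2M || level == MAP_1G);

	switch (level) {
	case MAP_4K:
		/* PTE table already exists: rewrite all 512 PTEs. */
		cost.entries_written =
			1u << (MODEL_PMD_SHIFT - MODEL_PAGE_SHIFT);
		break;
	case MAP_2M:
		/* A single PMD leaf entry covers exactly the region. */
		cost.entries_written = 1;
		break;
	case MAP_1G:
		/*
		 * The 1G leaf must be split first: allocate one PMD
		 * table, fill its 512 entries, then repoint the PUD.
		 */
		cost.tables_allocated = 1;
		cost.entries_written = MODEL_PTRS_PER_TABLE + 1;
		break;
	}
	return cost;
}

/* Same condition the patch's VM_BUG_ON() guards, inverted. */
static bool pageblock_covers_2m(unsigned int pageblock_order)
{
	return MODEL_PAGE_SHIFT + pageblock_order >= MODEL_PMD_SHIFT;
}
```

In this model only MAP_1G reports a nonzero tables_allocated, which is
why "at most 2M" seems to be the property the comment is after, not "at
minimum 2M".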

I also can't help but wonder if it would have been easier and more
straightforward to just start this whole exercise at 4k: force all the
ASI tables to be 4k. Then, later, add the 2MB support and tie it to
pageblocks.


> + if (asi_nonsensitive_pgd) {
> + /*
> + * Since most memory is expected to end up sensitive, start with
> + * everything unmapped in this pagetable.
> + */
> + pgprot_t prot_np = __pgprot(pgprot_val(prot) & ~_PAGE_PRESENT);
> +
> + VM_BUG_ON((PAGE_SHIFT + pageblock_order) < page_level_shift(PG_LEVEL_2M));
> + phys_pgd_init(asi_nonsensitive_pgd, paddr_start, paddr_end, 1 << PG_LEVEL_2M,
> + prot_np, init, NULL);
> + }

I'm also kinda wondering what the purpose is of having a whole page
table full of !_PAGE_PRESENT entries. It would be nice to know how this
eventually gets turned into something useful.