Re: [PATCH V2] arm64/mm: Create level specific section mappings in map_range()
From: Mark Rutland
Date: Mon Mar 10 2025 - 10:42:40 EST
On Mon, Mar 10, 2025 at 01:48:53PM +0000, Ryan Roberts wrote:
> On 10/03/2025 10:55, Mark Rutland wrote:
> > On Mon, Mar 10, 2025 at 11:58:12AM +0530, Anshuman Khandual wrote:
> >> Currently PMD section mapping mask i.e PMD_TYPE_SECT is used while creating
> >> section mapping at all page table levels except the last level. This works
> >> fine as the section mapping masks are exactly the same (0x1UL) for all page
> >> table levels.
> >>
> >> This will change in the future with D128 page tables that have unique skip
> >> level values (SKL) required for creating section mapping at different page
> >> table levels. Hence use page table level specific section mapping macros
> >> instead of the common PMD_TYPE_SECT. While here also ensure that a section
> >> mapping is only created on page table levels which could support that on a
> >> given page size configuration otherwise fall back to create table entries.
> >>
> >> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> >> Cc: Will Deacon <will@xxxxxxxxxx>
> >> Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
> >> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
> >> Cc: linux-kernel@xxxxxxxxxxxxxxx
> >> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> >> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
> >> ---
> >> This patch applies on v6.14-rc6
> >>
> >> Changes in V2:
> >>
> >> - Dropped PGD_TYPE_SECT macro and its instance from map_range()
> >> - Create table entries on levels where section mapping is not possible
> >
> > On the last version, there was talk of an extant bug:
> >
> > https://lore.kernel.org/all/5f1b36e2-6455-44d9-97b0-253aefd5024f@xxxxxxx/
> >
> > ... did that turn out to not be the case?
>
> In the absence of Ard (or anyone else) telling me it's not a bug, then I'll
> continue to assert that it is a bug, and should have a Fixes tag, and Cc stable.
Ok -- can we please have a description of the bug, and a fix
specifically for that, before we try to reshape this for D128?
IIUC the problem is that for a sufficiently large (and aligned) extent
of physical memory, we might try to create a block mapping larger than
the HW supports?
With that in mind, my earlier comment about LPA2 and blocks at level-0
is particularly relevant...
> >> Changes in V1:
> >>
> >> https://lore.kernel.org/all/20250303041834.2796751-1-anshuman.khandual@xxxxxxx/
> >>
> >> arch/arm64/kernel/pi/map_range.c | 38 +++++++++++++++++++++++++++++---
> >> 1 file changed, 35 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/arm64/kernel/pi/map_range.c b/arch/arm64/kernel/pi/map_range.c
> >> index 2b69e3beeef8..25a70c675c4d 100644
> >> --- a/arch/arm64/kernel/pi/map_range.c
> >> +++ b/arch/arm64/kernel/pi/map_range.c
> >> @@ -11,6 +11,22 @@
> >>
> >> #include "pi.h"
> >>
> >> +static bool sect_supported(int level)
> >
> > What exactly is this function supposed to indicate?
> >
> > Due to the 'default' case, it'll return true for level-3 tables where
> > sections/blocks aren't possible, but pages are. So maybe this is just
> > misnamed, and wants to be something like "leaf_supported()" ?
>
> I always thought a "section" was (outdated?) linux terminology for a leaf
> mapping at any level. For my understanding, I think you're saying it
> specifically means a leaf mapping at any level other than the last level, so it
> is actually equivalent to a "block" mapping in the Arm ARM?
Yep -- "section" is old style terminology that's (roughly) equivalent to
"block" at level 2. When the arm64 port got written, the
"section"/"sect" terminology was inherited to mean "block" at any level.
Ideally we'd clean that up.
Historically ARM had "small pages", "large pages", "sections", and
"supersections".
See the ARMv7-A ARM (ARM DDI 0406C.d), page B3-1321.
> Either way, the intended semantic for this function is "is it valid to create a
> leaf mapping at the specified level?" where leaf = block or page.
Cool. With that in mind, having "leaf" in the name would be better.
Though TBH, maybe it makes more sense to split this up into separate
levels (as with the code in arch/arm64/mm/mmu.c) and handle each level
explicitly with a separate function. That way each level's check can be
handled within that level's function, and we can use the right types,
etc. It feels like that might be better for D128 in future?
Mark.
>
> Thanks,
> Ryan
>
> >
> >> +{
> >> + switch (level) {
> >> + case -1:
> >> + return false;
> >> + case 0:
> >> + if (IS_ENABLED(CONFIG_ARM64_16K_PAGES) ||
> >> + IS_ENABLED(CONFIG_ARM64_64K_PAGES))
> >> + return false;
> >> + else
> >> + return true;
> >
> > Surely:
> >
> > case 0:
> > return IS_ENABLED(CONFIG_ARM64_4K_PAGES);
> >
> > ... though that's only the case when LPA2 is supported and TCR_ELx.DS is
> > set.
> >
> >> + default:
> >> + return true;
> >
> > As above, what is this supposed to return for level-3 tables where
> > sections/blocks aren't possible?
> >
> > Can we BUILD_BUG() for bogus levels?
> >
> >> + }
> >> +}
> >> +
> >> /**
> >> * map_range - Map a contiguous range of physical pages into virtual memory
> >> *
> >> @@ -44,13 +60,29 @@ void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
> >> * Set the right block/page bits for this level unless we are
> >> * clearing the mapping
> >> */
> >> - if (protval)
> >> - protval |= (level < 3) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
> >> + if (protval && sect_supported(level)) {
> >
> > The existing code is trying to find the leaf prot for a given level of
> > table, so as-is checking sect_supported() is at best misleading.
> >
> > If we're going to do something different below based on
> > sect_supported(), I reckon this should be moved into the path where
> > sect_supported() returned true, maybe factored out into a helper for
> > clarity.
> >
> > Mark.
> >
> >> + switch (level) {
> >> + case 3:
> >> + protval |= PTE_TYPE_PAGE;
> >> + break;
> >> + case 2:
> >> + protval |= PMD_TYPE_SECT;
> >> + break;
> >> + case 1:
> >> + protval |= PUD_TYPE_SECT;
> >> + break;
> >> + case 0:
> >> + protval |= P4D_TYPE_SECT;
> >> + break;
> >> + default:
> >> + break;
> >> + }
> >> + }
> >>
> >> while (start < end) {
> >> u64 next = min((start | lmask) + 1, PAGE_ALIGN(end));
> >>
> >> - if (level < 3 && (start | next | pa) & lmask) {
> >> + if ((level < 3 && (start | next | pa) & lmask) || !sect_supported(level)) {
> >> /*
> >> * This chunk needs a finer grained mapping. Create a
> >> * table mapping if necessary and recurse.
> >> --
> >> 2.25.1
> >>
> >>
>