Re: riscv32 EXT4 splat, 6.8 regression?

From: Mike Rapoport
Date: Wed Apr 17 2024 - 15:35:51 EST


On Wed, Apr 17, 2024 at 12:36:39AM +0200, Nam Cao wrote:
> On 2024-04-16 Mike Rapoport wrote:
> > On Tue, Apr 16, 2024 at 06:00:29PM +0100, Matthew Wilcox wrote:
> > > On Tue, Apr 16, 2024 at 07:31:54PM +0300, Mike Rapoport wrote:
> > > > > - if (!IS_ENABLED(CONFIG_64BIT)) {
> > > > > - max_mapped_addr = __pa(~(ulong)0);
> > > > > - if (max_mapped_addr == (phys_ram_end - 1))
> > > > > - memblock_set_current_limit(max_mapped_addr - 4096);
> > > > > - }
> > > > > + memblock_reserve(__pa(-PAGE_SIZE), PAGE_SIZE);
> > > >
> > > > Ack.
> > >
> > > Can this go to generic code instead of letting architecture maintainers
> > > fall over it?
> >
> > Yes, it's just have to happen before setup_arch() where most architectures
> > enable memblock allocations.
>
> This also works, the reported problem disappears.
>
> However, I am confused about one thing: doesn't this make one page of
> physical memory inaccessible?
>
> Is it better to solve this by setting max_low_pfn instead? Then at
> least the page is still accessible as high memory.

It could be if riscv had support for HIGHMEM.

> Best regards,
> Nam
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index fa34cf55037b..6e3130cae675 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -197,7 +197,6 @@ early_param("mem", early_mem);
> static void __init setup_bootmem(void)
> {
> phys_addr_t vmlinux_end = __pa_symbol(&_end);
> - phys_addr_t max_mapped_addr;
> phys_addr_t phys_ram_end, vmlinux_start;
>
> if (IS_ENABLED(CONFIG_XIP_KERNEL))
> @@ -235,23 +234,9 @@ static void __init setup_bootmem(void)
> if (IS_ENABLED(CONFIG_64BIT))
> kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
>
> - /*
> - * memblock allocator is not aware of the fact that last 4K bytes of
> - * the addressable memory can not be mapped because of IS_ERR_VALUE
> - * macro. Make sure that last 4k bytes are not usable by memblock
> - * if end of dram is equal to maximum addressable memory. For 64-bit
> - * kernel, this problem can't happen here as the end of the virtual
> - * address space is occupied by the kernel mapping then this check must
> - * be done as soon as the kernel mapping base address is determined.
> - */
> - if (!IS_ENABLED(CONFIG_64BIT)) {
> - max_mapped_addr = __pa(~(ulong)0);
> - if (max_mapped_addr == (phys_ram_end - 1))
> - memblock_set_current_limit(max_mapped_addr - 4096);

To be strict about the conflict between mapping a page at
0xfffff000 and IS_ERR_VALUE, memblock should not allocate that page, so
memblock_set_current_limit() should remain. It does not need all the
surrounding if, though; just setting the limit for -PAGE_SIZE should do.
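
To illustrate the conflict, here is a minimal userspace sketch, not
kernel code. It assumes MAX_ERRNO is 4095 as in include/linux/err.h and
models a 32-bit unsigned long with uint32_t; is_err_value32() and
page_overlaps_err_range() are hypothetical names made up for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same value as MAX_ERRNO in include/linux/err.h. */
#define MAX_ERRNO 4095u

/* 32-bit model of IS_ERR_VALUE(): true for the top MAX_ERRNO values. */
static bool is_err_value32(uint32_t x)
{
	return x >= (uint32_t)-MAX_ERRNO;
}

/*
 * Hypothetical helper: does the last byte address of the 4 KiB page
 * starting at 'page' fall into the error-value range?
 */
static bool page_overlaps_err_range(uint32_t page)
{
	return is_err_value32((uint32_t)(page + 4095u));
}

/* Self-checks for the model; call from main() to run them. */
static void check_last_page_model(void)
{
	/* The page base itself is not an error value ... */
	assert(!is_err_value32(0xfffff000u));
	/* ... but every other address inside that page is. */
	assert(is_err_value32(0xfffff001u));
	assert(is_err_value32(0xffffffffu));
	/* Only the very last page overlaps the range. */
	assert(page_overlaps_err_range(0xfffff000u));
	assert(!page_overlaps_err_range(0xffffe000u));
}
```

In this model, a pointer to an object that starts anywhere past the
first byte of the page at 0xfffff000 would be indistinguishable from an
error pointer.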

I suspect, though, that this call to memblock_set_current_limit() is what
caused the splat in ext4. Without that limit enforcement, the last page
would be the first one memblock allocates, and it most likely would have
ended up in the kernel page tables and would never be checked for IS_ERR.
With the limit set, that page made it to the buddy allocator and got
allocated by code that actually does IS_ERR checks.
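
The allocation-order argument can be sketched with a toy top-down
allocator. This is a deliberately simplified model and not how memblock
is implemented: it assumes a single RAM range, 4 KiB pages, and
addresses already expressed as 32-bit virtual addresses held in a
uint64_t so that an end of 0x100000000 is representable.
alloc_page_top_down() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumption: 4 KiB pages */

/*
 * Hypothetical model: hand out the highest page in [ram_start, ram_end)
 * that lies entirely below 'limit'. Returns the page base, or 0 if no
 * page fits.
 */
static uint64_t alloc_page_top_down(uint64_t ram_start, uint64_t ram_end,
				    uint64_t limit)
{
	uint64_t top = ram_end < limit ? ram_end : limit;

	if (top < ram_start + PAGE_SZ)
		return 0;
	return top - PAGE_SZ;
}

/* Self-checks for the model; call from main() to run them. */
static void check_alloc_model(void)
{
	/* No limit: the first allocation is the last page. */
	assert(alloc_page_top_down(0xc0000000ull, 0x100000000ull,
				   UINT64_MAX) == 0xfffff000ull);
	/* Limit at the last page: the allocator skips it. */
	assert(alloc_page_top_down(0xc0000000ull, 0x100000000ull,
				   0xfffff000ull) == 0xffffe000ull);
}
```

In the second case the model never hands out the page at 0xfffff000,
so that page is still free when the remaining memory is released to the
buddy allocator.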

> - }
> -
> min_low_pfn = PFN_UP(phys_ram_base);
> - max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
> + max_pfn = PFN_DOWN(phys_ram_end);
> + max_low_pfn = min(max_pfn, PFN_DOWN(__pa(-PAGE_SIZE)));
> high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
>
> dma32_phys_limit = min(4UL * SZ_1G, (unsigned long)PFN_PHYS(max_low_pfn));

--
Sincerely yours,
Mike.