Re: [PATCH] init: fix allocated page overlapping with PTR_ERR

From: Nam Cao
Date: Thu Apr 18 2024 - 09:03:58 EST


On 2024-04-18 Björn Töpel wrote:
> Nam Cao <namcao@xxxxxxxxxxxxx> writes:
>
> > On 2024-04-18 Nam Cao wrote:
> >> There is nothing preventing kernel memory allocators from allocating a
> >> page that overlaps with PTR_ERR(), except for architecture-specific
> >> code that setup memblock.
> >>
> >> It was discovered that RISCV architecture doesn't setup memblock
> >> corectly, leading to a page overlapping with PTR_ERR() being allocated,
> >> and subsequently crashing the kernel (link in Close: )
> >>
> >> The reported crash has nothing to do with PTR_ERR(): the last page
> >> (at address 0xfffff000) being allocated leads to an unexpected
> >> arithmetic overflow in ext4; but still, this page shouldn't be
> >> allocated in the first place.
> >>
> >> Because PTR_ERR() is an architecture-independent thing, we shouldn't
> >> ask every single architecture to set this up. There may be other
> >> architectures beside RISCV that have the same problem.
> >>
> >> Fix this one and for all by reserving the physical memory page that
> >> may be mapped to the last virtual memory page as part of low memory.
> >>
> >> Unfortunately, this means if there is actual memory at this reserved
> >> location, that memory will become inaccessible. However, if this page
> >> is not reserved, it can only be accessed as high memory, so this
> >> doesn't matter if high memory is not supported. Even if high memory is
> >> supported, it is still only one page.
> >>
> >> Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >> Signed-off-by: Nam Cao <namcao@xxxxxxxxxxxxx>
> >> Cc: <stable@xxxxxxxxxxxxxxx> # all versions
> >
> > Sorry, forgot to add:
> > Reported-by: Björn Töpel <bjorn@xxxxxxxxxx>
>
> Hmm, can't we get rid of the whole check in arch/riscv/mm/init.c for
> 32b?

We can, but that depends on this patch. So my intention is to wait for
this patch to be applied first, because I don't want to bother the
maintainers with dependencies.

> --8<--
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index fe8e159394d8..1e91d5728887 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -196,7 +196,6 @@ early_param("mem", early_mem);
> static void __init setup_bootmem(void)
> {
> phys_addr_t vmlinux_end = __pa_symbol(&_end);
> - phys_addr_t max_mapped_addr;
> phys_addr_t phys_ram_end, vmlinux_start;
>
> if (IS_ENABLED(CONFIG_XIP_KERNEL))
> @@ -234,21 +233,6 @@ static void __init setup_bootmem(void)
> if (IS_ENABLED(CONFIG_64BIT))
> kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
>
> - /*
> - * memblock allocator is not aware of the fact that last 4K bytes of
> - * the addressable memory can not be mapped because of IS_ERR_VALUE
> - * macro. Make sure that last 4k bytes are not usable by memblock
> - * if end of dram is equal to maximum addressable memory. For 64-bit
> - * kernel, this problem can't happen here as the end of the virtual
> - * address space is occupied by the kernel mapping then this check must
> - * be done as soon as the kernel mapping base address is determined.
> - */
> - if (!IS_ENABLED(CONFIG_64BIT)) {
> - max_mapped_addr = __pa(~(ulong)0);
> - if (max_mapped_addr == (phys_ram_end - 1))
> - memblock_set_current_limit(max_mapped_addr - 4096);
> - }
> -

If you are going to send this, you can add:
Reviewed-by: Nam Cao <namcao@xxxxxxxxxxxxx>

> min_low_pfn = PFN_UP(phys_ram_base);
> max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
> high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
> --8<--
>
> Mike hints that's *not* the case
> (https://lore.kernel.org/linux-riscv/ZiAkRMUfiPDUGPdL@xxxxxxxxxx/).
> memblock_reserve() should disallow allocation as well, no?

He said it can't be removed if we set max_low_pfn instead of using
memblock_reserve()

If max_low_pfn() is used, then it can be removed:
https://lore.kernel.org/linux-riscv/Zh6n-nvnQbL-0xss@xxxxxxxxxx

Best regards,
Nam