Re: [PATCH] init: fix allocated page overlapping with PTR_ERR

From: Björn Töpel
Date: Thu Apr 18 2024 - 08:41:50 EST


Nam Cao <namcao@xxxxxxxxxxxxx> writes:

> On 2024-04-18 Nam Cao wrote:
>> There is nothing preventing kernel memory allocators from allocating a
>> page that overlaps with PTR_ERR(), except for architecture-specific
>> code that setup memblock.
>>
>> It was discovered that RISCV architecture doesn't setup memblock
>> corectly, leading to a page overlapping with PTR_ERR() being allocated,
>> and subsequently crashing the kernel (link in Close: )
>>
>> The reported crash has nothing to do with PTR_ERR(): the last page
>> (at address 0xfffff000) being allocated leads to an unexpected
>> arithmetic overflow in ext4; but still, this page shouldn't be
>> allocated in the first place.
>>
>> Because PTR_ERR() is an architecture-independent thing, we shouldn't
>> ask every single architecture to set this up. There may be other
>> architectures beside RISCV that have the same problem.
>>
>> Fix this one and for all by reserving the physical memory page that
>> may be mapped to the last virtual memory page as part of low memory.
>>
>> Unfortunately, this means if there is actual memory at this reserved
>> location, that memory will become inaccessible. However, if this page
>> is not reserved, it can only be accessed as high memory, so this
>> doesn't matter if high memory is not supported. Even if high memory is
>> supported, it is still only one page.
>>
>> Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> Signed-off-by: Nam Cao <namcao@xxxxxxxxxxxxx>
>> Cc: <stable@xxxxxxxxxxxxxxx> # all versions
>
> Sorry, forgot to add:
> Reported-by: Björn Töpel <bjorn@xxxxxxxxxx>

Hmm, can't we get rid of the whole check in arch/riscv/mm/init.c for
32b?

--8<--
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index fe8e159394d8..1e91d5728887 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -196,7 +196,6 @@ early_param("mem", early_mem);
static void __init setup_bootmem(void)
{
phys_addr_t vmlinux_end = __pa_symbol(&_end);
- phys_addr_t max_mapped_addr;
phys_addr_t phys_ram_end, vmlinux_start;

if (IS_ENABLED(CONFIG_XIP_KERNEL))
@@ -234,21 +233,6 @@ static void __init setup_bootmem(void)
if (IS_ENABLED(CONFIG_64BIT))
kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;

- /*
- * memblock allocator is not aware of the fact that last 4K bytes of
- * the addressable memory can not be mapped because of IS_ERR_VALUE
- * macro. Make sure that last 4k bytes are not usable by memblock
- * if end of dram is equal to maximum addressable memory. For 64-bit
- * kernel, this problem can't happen here as the end of the virtual
- * address space is occupied by the kernel mapping then this check must
- * be done as soon as the kernel mapping base address is determined.
- */
- if (!IS_ENABLED(CONFIG_64BIT)) {
- max_mapped_addr = __pa(~(ulong)0);
- if (max_mapped_addr == (phys_ram_end - 1))
- memblock_set_current_limit(max_mapped_addr - 4096);
- }
-
min_low_pfn = PFN_UP(phys_ram_base);
max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
--8<--

Mike hints that's *not* the case
(https://lore.kernel.org/linux-riscv/ZiAkRMUfiPDUGPdL@xxxxxxxxxx/).
memblock_reserve() should disallow allocation as well, no?

Thanks, and FWIW:

Tested-by: Björn Töpel <bjorn@xxxxxxxxxxxx>