Re: [PATCH v2] riscv: keep 32-bit kernel to 32-bit phys_addr_t

From: Palmer Dabbelt
Date: Fri Jan 10 2020 - 20:48:44 EST


On Fri, 10 Jan 2020 17:14:46 PST (-0800), Paul Walmsley wrote:
On Fri, 10 Jan 2020, Palmer Dabbelt wrote:

On Mon, 06 Jan 2020 15:20:24 PST (-0800), Olof Johansson wrote:
> While rv32 technically has 34-bit physical addresses, no current platforms
> use it and it's likely to shake out driver bugs.
>
> Let's keep 64-bit phys_addr_t off on 32-bit builds until one shows up,
> since other work will be needed to make such a system useful anyway.
>
> PHYS_ADDR_T_64BIT is def_bool 64BIT, so just remove the select.
>
> Signed-off-by: Olof Johansson <olof@xxxxxxxxx>
> ---
>
> v2: Just remove the select, since it's set by default if CONFIG_64BIT
>
> arch/riscv/Kconfig | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index a31169b02ec06..569fc6deb94d6 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -12,8 +12,6 @@ config 32BIT
>
> config RISCV
> def_bool y
> - # even on 32-bit, physical (and DMA) addresses are > 32-bits
> - select PHYS_ADDR_T_64BIT
> select OF
> select OF_EARLY_FLATTREE
> select OF_IRQ

I gave 5.5-rc5 a quick test on a 32-bit QEMU with 8GiB of RAM and the system
wouldn't boot, so we've got at least some bugs floating around somewhere.
Given that this doesn't work I don't see any reason to keep it around as an
option, as if someone wants to make it work there's a lot more to do than make
things compile.

I've put this on for-next. If anyone cares about 34-bit physical addresses on
rv32 then now is the right time to speak up... ideally by fixing it :)

You know, if, according to

https://freenode.logbot.info/riscv/20200106

the main reason for doing this is to avoid autobuilder warnings, I'd be
tempted to suggest we leave it in there so people have some incentive to
go fix the real bugs ;-)

(that said, the patch is basically okay by me until at least QEMU is
fixed or hardware appears)

I think it's unlikely to be a QEMU bug, as it starts printing messages from the
payload and then hangs. By the point QEMU is executing instructions it doesn't
really care about pointers any more and is just doing whatever the instructions
say to do. It's possible QEMU decided to alias memory and blow everything up,
but I feel like that would manifest after Linux prints some messages. Writing
this I feel like I've debugged that before...

If I had to bet I'd put the bug in OpenSBI, but only because I don't see any
Linux messages. Specifically, I'd bet that someone did something like

for (size_t i = 0; i < get_memory_size_from_fdt(); i += PAGE_SIZE)
memory_base[i + K], for small values of K

which would blow up for some configurations of 34-bit memory sizes and 32-bit
size_t, given how the compiler likes to optimize things. That definately
happens early in the kernel boot, but IIRC we print a message after mapping a
fixed number of pages (though that code was changed recently so I don't
remember how it works now).

I'd actually tested this before Olof brought up the issue and just put it in
the "I guess just don't do that" pile, but when he mentioned it was breaking
builds I remember chasing around some early kernel boot issue last time I
accidentally turned on a lot of memory and figured it's probably best to just
bail on the whole thing anyway. IIRC part of the feedback on the original
and while it'd be nice to have (which is probably why I ignored the feedback) I
feel like there's more important ways to spend our time -- like getting a
32-bit userspace.

That said, I'd be happy to make this selectable by menuconfig. Presumably it's
not that much work to chase around the drivers that blow up and add something
like "depends 64BIT || PHYS_ADDR_T_32BIT" until randconfig builds stop failing,
but I feel like that's as far as this is likely to go in the forseeable future.

Either way, I guess this is a good reminder it's better to remain causal. I
suppose I'll try to do so in the future :)