Re: [RFC PATCH 00/22] riscv: s64ilp32: Running 32-bit Linux kernel on 64-bit supervisor mode

From: Palmer Dabbelt
Date: Fri May 19 2023 - 13:18:57 EST


On Fri, 19 May 2023 09:53:35 PDT (-0700), Arnd Bergmann wrote:
On Fri, May 19, 2023, at 17:31, Guo Ren wrote:
On Fri, May 19, 2023 at 2:29 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
On Thu, May 18, 2023, at 17:38, Palmer Dabbelt wrote:
> On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@xxxxxxxxxx wrote:

If for some crazy reason you'd still want the 64ilp32 ABI in user
space, running the kernel this way is probably still a bad idea,
but that one is less clear. There is clearly a small memory
penalty of running a 64-bit kernel for larger data structures
(page, inode, task_struct, ...) and vmlinux, and there is no
I don't think it's a small memory penalty: our measurement is about
16% with defconfig; see the "Why 32-bit Linux?" section.
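
As a rough illustration of where this kind of overhead comes from (this is not the measurement from the cover letter), the sketch below lays out one pointer-heavy record twice: once with 4-byte pointers and longs, as an ILP32 kernel would see it, and once with 8-byte ones, as under LP64. The struct names, the fields, and the growth_percent() helper are hypothetical and do not correspond to real kernel structures; fixed-width integers stand in for pointers so that both layouts can be compared on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record as an ILP32 compiler would lay it out:
 * pointers and longs are 4 bytes each. */
struct node_ilp32 {
    uint32_t next;      /* stands in for a pointer */
    uint32_t prev;      /* stands in for a pointer */
    uint32_t owner;     /* stands in for a pointer */
    uint32_t flags;     /* stands in for an unsigned long */
    uint32_t refcount;  /* plain int */
};

/* The same record under LP64: pointers and longs are 8 bytes each. */
struct node_lp64 {
    uint64_t next;
    uint64_t prev;
    uint64_t owner;
    uint64_t flags;
    uint32_t refcount;  /* int stays 4 bytes; tail padding may follow */
};

/* Integer percentage by which `large` exceeds `small`. */
static inline unsigned growth_percent(size_t small, size_t large)
{
    return (unsigned)((large - small) * 100 / small);
}
```

Calling growth_percent(sizeof(struct node_ilp32), sizeof(struct node_lp64)) compares the two layouts. Real kernel structures mix pointers with fixed-size fields, so the overall growth is far smaller than for this pointer-dominated toy record.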

This patch series doesn't add a 64ilp32 userspace ABI, but it seems you
also don't like running a 32-bit Linux kernel on 64-bit hardware, right?

Ok, I'm sorry for missing the important bit here. So if this can
still use the normal 32-bit user space, the cost of this patch set
is not huge, and it's something that can be beneficial in a few
cases, though I suspect most users are still better off running
64-bit kernels.

Running a normal 32-bit userspace would require HW support for the 32-bit mode switch for userspace, though (rv32 isn't a subset of rv64, so there's nothing we can do to make those binaries function correctly with uABI). The userspace-only mode switch is a bit simpler than the user+supervisor switch, but it seems like vendors who really want the memory savings would just implement both mode switches.

The motivation for s64ilp32 (running a 32-bit Linux kernel in 64-bit S-mode):
- The target hardware (Canaan Kendryte k230) only supports MXL=64,
SXL=64, UXL=64/32.
- A 64-bit Linux kernel with compat 32-bit apps can't satisfy the 64/128MB scenarios.

huge additional maintenance cost on top of the ABI itself
that you'd need either way, but using a 64-bit address space
in the kernel has some important advantages even when running
32-bit userland: processes can use the entire 4GB virtual
space, while the kernel can address more than 768MB of lowmem,
and KASLR has more bits to work with for randomization. On
RISC-V, some additional features (VMAP_STACK, KASAN, KFENCE,
...) depend on 64-bit kernels even though they don't
strictly need that.

I agree that the 64-bit Linux kernel has more functionality, but:
- What do you think about Linux on a 64/128MB SoC? Could it
afford VMAP_STACK, KASAN, KFENCE?

I would definitely recommend VMAP_STACK, but that can be implemented
and is used on other 32-bit architectures (ppc32, arm32) without a
huge cost. The larger virtual user address space can help even on
machines with 128MB, though most applications probably don't care at
that point.

At least having them as an option seems reasonable. Historically we haven't gated new base systems on having every feature the others do, though (!MMU, rv32, etc).

- I think 32-bit Linux & RTOS have monopolized this market (64/128MB
scenarios), right?

The minimum amount of RAM that makes a system usable for Linux is
constantly going up, so I think with 64MB, most new projects are
already better off running some RTOS kernel instead of Linux.
The ones that are still usable today probably won't last a lot
of distro upgrades before the bloat catches up with them, but I
can see how your patch set can give them a few extra years of
updates.

We also have 32-bit kernel support. Systems that have tens of MB of RAM tend to end up with some memory technology that doesn't scale to gigabytes these days, and since that's fixed when the chip is built it seems like those folks would be better off just having HW support for 32-bit kernels (and maybe not even bothering with HW support for 64-bit kernels).

For the 256MB+ systems, I would expect the sensitive kernel
allocations to be small enough that the series makes little
difference. The 128MB systems are the most interesting ones
here, and I'm curious to see where you spot most of the
memory usage differences. I'll also reply to your initial
mail for that.

Thanks. I agree we need to see some real systems that benefit from this, as it's a pretty big support cost. Defconfig sizes alone don't mean a whole lot, as users on these very constrained systems aren't likely to run defconfig anyway.

If someone's going to use it then I'm fine taking the code; it just seems like a very thin set of possible use cases. We've already got almost no users in RISC-V land, and I've got a feeling this is esoteric enough to actually have zero.


Arnd