Re: [PATCH 0/3] Highmem support for 32-bit RISC-V

From: Arnd Bergmann
Date: Thu Apr 02 2020 - 05:31:58 EST


On Tue, Mar 31, 2020 at 11:34 AM Eric Lin <tesheng@xxxxxxxxxxxxx> wrote:
>
> With Highmem support, the kernel can map more than 1GB of physical memory.
>
> This patchset implements Highmem for RV32, referencing mostly nds32
> and other architectures like arm and mips; it has been tested on the Andes A25MP platform.

I would much prefer not to see highmem added to new architectures at all
if possible; see https://lwn.net/Articles/813201/ for some background.

For the arm32 architecture, we are thinking about implementing a
VMSPLIT_4G_4G option to replace highmem in the long run. At the
moment, the most likely shape of that design looks like this:

- have a 256MB region for vmalloc space at the top of the 4GB address
space, containing vmlinux, modules, MMIO mappings and vmalloc
allocations (a hypothetical layout sketch follows after this list)

- have 3.75GB starting at address zero for either user space or the
linear map.

- reserve one address space ID for kernel mappings to avoid TLB flushes
during normal context switches

- on any kernel entry, switch the page table to the one with the linear
mapping, and back to the user page table before returning to user space

- add a generic copy_from_user/copy_to_user implementation based
on get_user_pages() in asm-generic/uaccess.h, using memcpy()
to copy from/to the page in the linear map (see the sketch after
this list)

- possibly have architectures override get_user/put_user to use a
cheaper access based on a page table switch to read individual
words, if that is cheaper than get_user_pages()

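For the copy_from_user/copy_to_user item, here is a minimal sketch of
the idea, built on get_user_pages_fast() and a temporary kmap() of
each user page. The helper name and the simplified error handling are
my own assumptions; this is not code from this patchset or from
asm-generic:

	#include <linux/kernel.h>
	#include <linux/mm.h>
	#include <linux/highmem.h>
	#include <linux/uaccess.h>
	#include <linux/string.h>

	/*
	 * Sketch: copy n bytes from user space while running on the
	 * kernel page table, by pinning each user page and copying
	 * through a temporary kernel mapping.  Returns the number of
	 * bytes that could not be copied, like the real interface.
	 */
	static unsigned long sketch_copy_from_user(void *to,
				const void __user *from, unsigned long n)
	{
		while (n) {
			unsigned long offset = (unsigned long)from & ~PAGE_MASK;
			unsigned long len = min(n, PAGE_SIZE - offset);
			struct page *page;
			void *kaddr;

			/* pin the user page so it cannot go away under us */
			if (get_user_pages_fast((unsigned long)from, 1, 0, &page) != 1)
				break;

			kaddr = kmap(page);
			memcpy(to, kaddr + offset, len);
			kunmap(page);
			put_page(page);

			to += len;
			from += len;
			n -= len;
		}
		return n;
	}
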
There was an implementation of this for x86 a long time ago, but
it never got merged, mainly because there were no ASIDs on x86
at the time and the TLB flushing during context switches was really
expensive. As far as I can tell, all of the modern embedded cores
do have ASIDs, and unlike x86, most do not support more than 4GB
of physical RAM, so this scheme can work to replace highmem
in most of the remaining cases, and provide additional benefits
(a larger user address range, stronger separation of kernel and user
addresses) at a relatively small performance cost.

Arnd