Re: [LSF/MM/BPF TOPIC] Per-process page size

From: Arnd Bergmann

Date: Fri Feb 20 2026 - 04:51:38 EST


On Tue, Feb 17, 2026, at 16:22, Matthew Wilcox wrote:
> On Tue, Feb 17, 2026 at 08:20:26PM +0530, Dev Jain wrote:
>>
>> - Are there other arches which could benefit from this?
>
> Some architectures walk the page tables entirely in software, but on the
> other hand, those tend to be, er, "legacy" architectures these days and
> it's doubtful that anybody would invest in adding support.
>
> Sounds like a good question for Arnd ;-)

I think LoongArch and RISC-V are the most likely candidates for doing
whatever Arm does here. MIPS and PowerPC64 could do it in theory, but
it's less clear that anyone will spend the effort there.

>> - Rough edges of compatibility layer - pfnmaps, ksm, procfs, etc. For
>> example, what happens when a 64K process opens a procfs file of
>> a 4K process?

This would also be my main concern. There are hundreds of device drivers
that implement a custom .mmap() file operation, and a few dozen file
systems, all of which need to be audited and likely changed to allow
mapping larger granules.

>> - native pgtable implementation - perhaps inspiration can be taken
>> from other arches with an involved pgtable logic (ppc, s390)?
>
> I question who decides what page size a particular process will use.
> The programmer? The sysadmin?

I would expect this to be done by a combination of the two; it seems
simple enough to have a wrapper like numactl or setarch that starts
an application one way or the other.


Another concern I have is about the actual performance trade-offs here.
As I understand it, the idea is to get most of the memory size
advantages of a 4KB page kernel, and most of the performance
advantages of a 64KB page kernel, for the special applications that
care about this. However, the same is true of a 16KB page kernel,
which aims for the same trade-off with a much simpler model and a
different set of compatibility problems.

Do we expect per-process page size kernels to actually be better
than fixed 16KB page kernels, and better enough to be worth the
added complexity? In particular, this approach would likely get
only the TLB advantages, not the file system advantages of larger
pages, while also paying the extra overhead of compacting smaller
pages in order to map them.

Arnd