Re: [RFC PATCH] binfmt_elf: Align eligible read-only PT_LOAD segments to PMD_SIZE for THP
From: hev
Date: Tue Mar 03 2026 - 02:01:01 EST
On Tue, Mar 3, 2026 at 1:32 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Tue, Mar 03, 2026 at 12:31:59PM +0800, hev wrote:
> > On Tue, Mar 3, 2026 at 12:46 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Mar 02, 2026 at 11:50:46PM +0800, WANG Rui wrote:
> > > > +config ELF_RO_LOAD_THP_ALIGNMENT
> > > > + bool "Align read-only ELF load segments for THP (EXPERIMENTAL)"
> > > > + depends on READ_ONLY_THP_FOR_FS
> > >
> > > This doesn't deserve a config option.
> >
> > This optimization is not entirely free. Increasing PT_LOAD alignment
> > can waste virtual address space, which is especially significant on
> > 32-bit systems, and it also reduces ASLR entropy by limiting the
> > number of possible load addresses.
> >
> > In addition, coarser alignment may have secondary microarchitectural
> > effects (eg. on indirect branch prediction), depending on the
> > workload. Because this change affects address space layout and
> > security-related properties, providing users with a way to opt out is
> > reasonable, rather than making it completely unconditional. This
> > behavior fits naturally under READ_ONLY_THP_FOR_FS.
>
> This isn't reasonable at all. You're asking distro maintainers to make
> a decision they have insufficient information to make. Almost none of
> our users compile their own kernels, and frankly those that do don't have
> enough information to make an informed decision about which way to choose.
>
> So if we're going to have a way to opt in/out, it needs to be something
> different. Maybe a heuristic based on size of text segment? Maybe an
> ELF flag? But then, if we're going to modify the binary, why not just
> set p_align and then we don't need this patch at all?

I agree that a compile-time config is not a good fit here, and I’m
fine with dropping it in v2.

Relying on ELF-side changes is problematic. Increasing p_align in the
linker inflates file size due to extra padding, and more importantly
it cannot help existing binaries. The loader is therefore the only
place where this can be done without ABI changes or file size
regressions.

The logic here is deliberately strict rather than heuristic: the
segment must be read-only and at least PMD_SIZE in length, and
PMD_SIZE is capped at 32MB to avoid pathological address space waste.
If these conditions are not met, the layout is unchanged.
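
To make the conditions above concrete, here is a minimal userspace
sketch of the check. It is not the code from the patch: the helper
name, the SKETCH_* macros, and the use of p_filesz as the segment
"length" are assumptions for illustration, and the 32MB cap is read
here as "leave the layout alone when PMD_SIZE is larger than 32MB".

```c
#include <stdbool.h>
#include <stdint.h>

/* ELF segment permission bits, same values as PF_X/PF_W/PF_R in <elf.h>. */
#define SKETCH_PF_X 0x1u
#define SKETCH_PF_W 0x2u
#define SKETCH_PF_R 0x4u

/* Cap mentioned in the mail (32MB); the macro name is hypothetical. */
#define SKETCH_MAX_THP_ALIGN (32ull << 20)

/*
 * Hypothetical helper: return the alignment to use for one PT_LOAD
 * segment. Falls back to the segment's own p_align whenever any of
 * the conditions is not met, so the layout is unchanged in that case.
 */
static uint64_t sketch_load_align(uint32_t p_flags, uint64_t p_filesz,
                                  uint64_t p_align, uint64_t pmd_size)
{
	bool read_only = (p_flags & SKETCH_PF_R) && !(p_flags & SKETCH_PF_W);

	if (!read_only)
		return p_align;		/* writable or unreadable segment */
	if (p_filesz < pmd_size)
		return p_align;		/* too small to hold one PMD-sized page */
	if (pmd_size > SKETCH_MAX_THP_ALIGN)
		return p_align;		/* would waste too much address space */
	if (p_align >= pmd_size)
		return p_align;		/* already aligned at least this coarsely */

	return pmd_size;
}
```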

I don’t see a reliable way to make a smarter decision at load time
without workload knowledge. With READ_ONLY_THP_FOR_FS already limiting
the scope and the THP policy applied at runtime, this keeps the
behavior constrained.

Thanks,
Rui