Re: 8aeb879baf12 - significant system call latency regression, bisected
From: H. Peter Anvin
Date: Thu Jun 18 2026 - 19:03:41 EST
On 2026-06-16 06:53, David Laight wrote:
>
> Basically you can't win.
> I was looking at why a patch didn't give the expected performance gain
> on a different base kernel build.
> It seems to depend on whether the function (actually strlen) was aligned
> to an odd or even 16 byte boundary.
> If aligned to an even boundary the loop inside the function crossed a
> 'significant' boundary and the code ran measurably slower.
> If you start aligning loop tops and labels in general you probably lose
> due to code bloat.
> (Here the loop didn't need aligning, it just needed not to contain
> the relevant boundary.)
>
> In this case the extra padding will change the alignment of everything that
> follows - and some of those might make a difference as well.
>
> You'd need to add extra code further down the function to keep the size
> the same (and hope the compiler keeps the functions in the same order).
>
This is true, but it is exactly why we want to at least be selective about it.
Padding every single function generates code bloat, *and* it is a compile-time
option, which means that only people running a kernel built for that target will
benefit.
Hence this is better confined to specific ultra-hot entry points.
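
To illustrate, here is a rough sketch, not actual kernel code. It assumes a
GCC or Clang toolchain. The names crosses_boundary(), hot_entry() and
ASSUMED_BOUNDARY are made up for the example, and the 32-byte boundary is only
a guess, since nothing above says which boundary is the 'significant' one.
The first helper checks whether a block of code straddles such a boundary;
the second shows padding applied to one chosen function rather than to all
of them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed boundary size; the thread does not say which one matters. */
#define ASSUMED_BOUNDARY 32u

/* True if the bytes [start, start + len) straddle a multiple of boundary. */
static bool crosses_boundary(uintptr_t start, size_t len, uintptr_t boundary)
{
	if (len == 0)
		return false;
	return start / boundary != (start + len - 1) / boundary;
}

/*
 * Hypothetical hot entry point: only this function is padded.
 * aligned(64) requests a minimum alignment for its first instruction;
 * noinline keeps it as a separate function so the alignment applies.
 */
__attribute__((aligned(64), noinline))
int hot_entry(int x)
{
	return x + 1;
}
```

With a 16-byte loop at offset 0x14 into a function, a function start of
0x1000 (an even multiple of 16) puts the loop at 0x1014..0x1023, which
straddles 0x1020. A start of 0x1010 (an odd multiple) puts it at
0x1024..0x1033, which does not. So crosses_boundary(0x1014, 16, 32) is true
and crosses_boundary(0x1024, 16, 32) is false.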
-hpa