Re: [PATCH v5 13/36] vmlinux.lds.h: add PGO and AutoFDO input sections
From: Arvind Sankar
Date: Tue Aug 04 2020 - 12:08:56 EST
On Mon, Aug 03, 2020 at 09:45:32PM -0700, Andi Kleen wrote:
> > Why is that? Both .text and .text.hot have alignment of 2^4 (default
> > function alignment on x86) by default, so it doesn't seem like it should
> > matter for packing density. Avoiding interspersing cold text among
>
> You may lose part of a cache line on each unit boundary. Linux has
> a lot of units, some of them small. All these bytes add up.
Separating out .text.unlikely, which isn't aligned, slightly _reduces_
this loss, but not by much -- just over 1K on a defconfig. More
importantly, it moves cold code out of line (~320k on a defconfig),
giving better code density for the hot code.
For .text and .text.hot, you lose the alignment padding on every
function boundary, not unit boundary, because of the 16-byte alignment.
Whether .text.hot and .text are arranged by translation unit or not
makes no difference.
With *(.text.hot) *(.text) you get HHTT, with *(.text.hot .text) you get
HTHT, but in both cases the individual chunks are already aligned to 16
bytes. If .text.hot _had_ different alignment requirements to .text, the
HHTT should actually give better packing in general, I think.
>
> It's bad for TLB locality too. Sadly with all the fine grained protection
> changes the 2MB coverage is eroding anyways, but this makes it even worse.
>
Yes, that could be true for .text.hot, depending on whether the hot
functions are called from all over the kernel (in which case putting
them together ought to be better) or mostly from regular text within the
unit in which they appeared (in which case it would be better together
with that code).