RE: [RFC PATCH 06/11] x86: make sure _etext includes function sections

From: David Laight
Date: Thu Feb 06 2020 - 11:27:53 EST


From: Jann Horn
> Sent: 06 February 2020 13:16
...
> > I cannot find evidence for
> > what function start alignment should be.
>
> There is no architecturally required alignment for functions, but
> Intel's Optimization Manual
> (<https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf>)
> recommends in section 3.4.1.5, "Code Alignment":
>
> | Assembly/Compiler Coding Rule 12. (M impact, H generality)
> | All branch targets should be 16-byte aligned.
>
> AFAIK this is recommended because, as documented in section 2.3.2.1,
> "Legacy Decode Pipeline" (describing the frontend of Sandy Bridge, and
> used as the base for newer microarchitectures):
>
> | An instruction fetch is a 16-byte aligned lookup through the ITLB
> | and into the instruction cache.
> | The instruction cache can deliver every cycle 16 bytes to the
> | instruction pre-decoder.
>
> AFAIK this means that if a branch target lands close to the end of a
> 16-byte block, the frontend is less efficient because it may have to run
> two instruction fetches before the first instruction can even be decoded.
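
The fetch-block argument quoted above can be sketched in C. This is only
an illustration under stated assumptions: the helper names are made up
(they are not from the manual or the kernel), the only figures used are
the 16-byte fetch width quoted above and the 15-byte x86 instruction
length limit, and it says nothing about the decoded-uop cache or other
frontend effects.

```c
#include <assert.h>
#include <stdint.h>

#define FETCH_BLOCK 16u   /* fetch width quoted from the manual */
#define MAX_INSN_LEN 15u  /* architectural x86 instruction length limit */

/* Bytes from the branch target to the end of its 16-byte fetch block. */
static unsigned bytes_left_in_block(uintptr_t target)
{
	return FETCH_BLOCK - (unsigned)(target % FETCH_BLOCK);
}

/*
 * Hypothetical helper: number of 16-byte fetches needed before the
 * first instruction at 'target' (insn_len bytes long) is fully
 * available to the pre-decoder.
 */
static unsigned fetches_for_first_insn(uintptr_t target, unsigned insn_len)
{
	assert(insn_len >= 1 && insn_len <= MAX_INSN_LEN);
	return insn_len <= bytes_left_in_block(target) ? 1u : 2u;
}
```

For example, a 5-byte instruction at 0x1000 needs one fetch, while the
same instruction at 0x100e straddles two blocks and needs two.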

See also "The microarchitecture of Intel, AMD and VIA CPUs" by Agner Fog,
from www.agner.org/optimize

My suspicion is that reducing the code size (so more code fits in the cache)
will almost always be a win over aligning branch targets and entry points.
If the alignment of a function matters then there are probably other
changes to that bit of code that will give a larger benefit.
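
The size cost of alignment can be estimated with a small sketch in C.
The helper names and the layout model (functions placed back to back,
each start padded up to the alignment) are assumptions for illustration
only; a real linker layout and real function sizes will differ.

```c
#include <assert.h>
#include <stddef.h>

/* Padding needed so that offset 'off' reaches an 'align' boundary. */
static size_t pad_to(size_t off, size_t align)
{
	/* 'align' must be a non-zero power of two. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (align - (off & (align - 1))) & (align - 1);
}

/*
 * Hypothetical model: total text size when 'n' functions of the given
 * sizes are laid out consecutively with every start aligned to 'align'.
 */
static size_t text_size(const size_t *sizes, size_t n, size_t align)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		off += pad_to(off, align);	/* padding before the function */
		off += sizes[i];		/* the function body itself */
	}
	return off;
}
```

With three functions of 5, 20 and 33 bytes, this model gives 58 bytes
unaligned (align = 1) and 81 bytes with 16-byte alignment, so the
padding is a noticeable fraction of the text when functions are small.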

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)