Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()

From: Josh Poimboeuf
Date: Wed Oct 11 2023 - 21:35:43 EST


On Wed, Oct 11, 2023 at 04:15:15PM -0700, H. Peter Anvin wrote:
> On 10/11/23 15:37, Ingo Molnar wrote:
> >
> > * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > > The only drawback is a larger binary size:
> > > >
> > > >     text    data    bss      dec     hex filename
> > > > 25546594 4387686 808452 30742732 1d518cc vmlinux-new.o
> > > > 25515256 4387814 808452 30711522 1d49ee2 vmlinux-old.o
> > > >
> > > > that increases by 31k (0.123%), probably due to 1578 rdgsbase alternatives.
> > >
> > > I'm actually surprised that it increases the text size. The 'rdgsbase'
> > > instruction should be smaller than a 'mov %gs', so I would have
> > > expected the *data* size to increase due to the alternatives tables,
> > > but not the text size.
> > >
> > > [ Looks around ]
> > >
> > > Oh. It's because we put the altinstructions into the text section.
> > > That's kind of silly, but whatever.
> >
> > Yeah, we should probably move .altinstructions from init-text to .init.data
> > or so? Contains a bunch of other sections too that don't get executed
> > directly ... and in fact has some non-code data structures too, such as ...
> > ".apicdrivers". :-/
> >
> > I suspect people put all that into .text because it was the easiest place
> > to modify in the x86 linker script, and linker scripts are arguably scary.
> >
>
> Well, it's more than that; "size" considers all non-writable sections to be
> "text".

Indeed. I added a printf to "size", and it shows that all of the
following sections are counted as "text":

.text
.pci_fixup
.tracedata
__ksymtab
__ksymtab_gpl
__ksymtab_strings
__init_rodata
__param
__ex_table
.notes
.orc_header
.orc_unwind_ip
.orc_unwind
.init.text
.altinstr_aux
.x86_cpu_dev.init
.parainstructions
.retpoline_sites
.return_sites
.call_sites
.altinstructions
.altinstr_replacement
.exit.text
.smp_locks

I can't fathom why it doesn't just filter based on the EXECINSTR section
flag.

"size" is probably worse than useless, as many of these sections can
change size rather arbitrarily, especially .orc_* and .*_sites.

I can't help but wonder how many hasty optimizations have been made over
the years based on the sketchy output of this tool.

It should be trivial to replace the use of "size" with our own
"text_size" script which does what we want, e.g., filter on EXECINSTR.

Here are the current EXECINSTR sections:

~/git/binutils-gdb/binutils $ readelf -WS /tmp/vmlinux |grep X
  [ 1] .text                 PROGBITS ffffffff81000000 200000  1200000 00 AX 0 0 4096
  [21] .init.text            PROGBITS ffffffff833b7000 27b7000 091b50  00 AX 0 0 16
  [22] .altinstr_aux         PROGBITS ffffffff83448b50 2848b50 00176a  00 AX 0 0 1
  [30] .altinstr_replacement PROGBITS ffffffff8372661a 2b2661a 0028b9  00 AX 0 0 1
  [32] .exit.text            PROGBITS ffffffff83728f10 2b28f10 0030c7  00 AX 0 0 16
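
For illustration, a rough sketch of what such a "text_size" script might
look like. It assumes the "readelf -WS" column layout shown above and
simply sums the sizes of sections whose flags contain X (EXECINSTR). The
names exec_text_size() and text_size() are made up for this example, and
nothing like this exists in the tree today.

```python
import re
import subprocess

# One "readelf -WS" section line:
#   [Nr] Name Type Address Off Size ES Flg Lk Inf Al
# Assumption: the wide (-W) format, where each section is on a single line.
SECTION_RE = re.compile(
    r"^\s*\[\s*(\d+)\]\s+(\S+)\s+(\S+)"          # index, name, type
    r"\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)"        # address, file offset
    r"\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)"        # size, entry size
    r"\s+([A-Za-z]*)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"  # flags, link, info, align
)


def exec_text_size(readelf_output):
    """Return (total_bytes, {section: bytes}) for sections flagged X."""
    total = 0
    sections = {}
    for line in readelf_output.splitlines():
        m = SECTION_RE.match(line)
        if not m:
            continue  # header, legend, or anything that is not a section line
        name, size, flags = m.group(2), int(m.group(6), 16), m.group(8)
        if "X" in flags:  # X == SHF_EXECINSTR
            sections[name] = size
            total += size
    return total, sections


def text_size(path):
    """Run readelf on an object file and sum its executable sections."""
    out = subprocess.run(
        ["readelf", "-WS", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return exec_text_size(out)
```

Something like text_size("/tmp/vmlinux") would then return the total and a
per-section breakdown. Note that this still counts the .text padding, so
it doesn't help with that part of the problem.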

As Ingo mentioned, we could make .altinstr_replacement non-executable.
That confuses objtool, but I think we could remedy that pretty easily.

Though, another problem is that .text has a crazy amount of padding
which makes it always the same size, due to the SRSO alias mitigation
alignment linker magic. We should fix that somehow.

--
Josh