Re: [PATCH v2 00/16] Address some perf memory/data size issues

From: Ian Rogers
Date: Tue May 30 2023 - 10:45:28 EST


On Tue, May 30, 2023 at 12:59 AM Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:
>
> > BSS won't count toward file size, which the patches were primarily
> > going after - but checking the size numbers I have miscalculated from
> > reading size's output that I'm not familiar with. The numbers are
> > still improved, but I just see a 37kb saving, with 5kb more in
> > .rodata. Something but not much. .data.rel.ro is larger, which imo is
> > good, but those pages will still be dirtied so a moot point wrt file
> > size and memory overhead.
>
> The way perf is written (lots of separate code depending on a single high level
> switch) most pages probably won't be dirtied.

For data, everything is relocated when perf is loaded. Setting a
breakpoint on main and then dumping smaps (edited for brevity), I see:
```
555555554000-5555555f8000 r--p 00000000 fe:01 32936368
/tmp/perf/perf
Size: 656 kB
Pss: 656 kB
Pss_Dirty: 0 kB
5555555f8000-555555828000 r-xp 000a4000 fe:01 32936368
/tmp/perf/perf
Size: 2240 kB
Pss: 32 kB
Pss_Dirty: 8 kB
555555828000-555555f23000 r--p 002d4000 fe:01 32936368
/tmp/perf/perf
Size: 7148 kB
Pss: 64 kB
Pss_Dirty: 0 kB
555555f23000-555555f6d000 r--p 009cf000 fe:01 32936368
/tmp/perf/perf
Size: 296 kB
Pss: 288 kB
Pss_Dirty: 288 kB
555555f6d000-555555f87000 rw-p 00a19000 fe:01 32936368
/tmp/perf/perf
Size: 104 kB
Pss: 104 kB
Pss_Dirty: 104 kB
```
These are roughly header, text, .rodata, .data.rel.ro and .data. So at
the point we enter main we already have 392kB of dirty pages (288kB +
104kB) in .data.rel.ro and .data.
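To reproduce the total without adding the numbers by hand, a small helper can sum the Pss_Dirty fields from smaps text. This is only a sketch: sum_pss_dirty_kb is a hypothetical name, not something in perf, and it assumes the text has already been read into a NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): sum every "Pss_Dirty: N kB"
 * field found in smaps-formatted text. Plain "Pss:" lines are ignored
 * because they do not contain the full "Pss_Dirty:" key.
 */
static long sum_pss_dirty_kb(const char *smaps)
{
	const char *key = "Pss_Dirty:";
	long total = 0;

	for (const char *p = strstr(smaps, key); p; p = strstr(p + 1, key)) {
		long kb;

		/* %ld skips the whitespace between the key and the number. */
		if (sscanf(p + strlen(key), "%ld", &kb) == 1)
			total += kb;
	}
	return total;
}
```

Fed only the .data.rel.ro and .data entries above, it should return 392. Fed all five mappings, it would also pick up the 8 kB of dirty text.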

On x86, a large contributor to the relocations is the insn-x86.c test:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/tests/insn-x86.c?h=perf-tools-next#n21
The test_data_32 and test_data_64 arrays are 75,024 bytes and 93,600
bytes respectively and are in .data.rel.ro; they account for nearly
40% of it.

In gdb at main entry:
```
(gdb) p test_data_32[0]
$1 = {data = "\017\061", '\000' <repeats 12 times>, expected_length =
2, expected_rel = 0,
expected_op_str = 0x555555866adc "", expected_branch_str = 0x555555866adc "",
asm_rep = 0x55555586fa2a "0f 31", ' ' <repeats 16 times>, "\trdtsc "}
```
You can see that all the strings in test_data_32 have been relocated
(even though we haven't run any part of perf yet) and point to data in
.rodata. To avoid these relocations in the output of jevents.py
(pmu-events.c), all the strings are merged into one big string and only
the offsets within that string are stored. No relocations means
everything goes in the nice non-dirty .rodata. As the data in the
insn-x86.c test is also generated, a similar trick could be performed
there. There is also the possibility of separating all the perf
builtins into libraries...
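As a rough illustration of the offset trick, here is a minimal sketch. The struct layout, field names, offsets and the insn_str() helper are all made up for the example; they are not the actual insn-x86.c or pmu-events.c definitions.

```c
#include <string.h>

/*
 * All strings merged into one NUL-separated blob. It contains no
 * pointers, so it needs no relocations. Each literal ends in an
 * explicit \0 so the next string starts at a known offset.
 */
static const char insn_strings[] =
	"\0"		/* offset 0: empty string */
	"0f 31\0"	/* offset 1 */
	"90\0";		/* offset 7 */

/* Hypothetical test entry: offsets into insn_strings, not char pointers. */
struct insn_test {
	unsigned char data[15];
	unsigned char expected_length;
	unsigned short op_str_off;
	unsigned short asm_rep_off;
};

/* No pointers in the table either, so it can also sit in .rodata. */
static const struct insn_test insn_tests[] = {
	{ { 0x0f, 0x31 }, 2, 0, 1 },	/* rdtsc */
	{ { 0x90 },       1, 0, 7 },	/* nop */
};

/* Turn an offset back into a string at the point of use. */
static const char *insn_str(unsigned short off)
{
	return &insn_strings[off];
}
```

The cost is that every string access goes through insn_str(), and the offsets have to be produced by a generator rather than written by hand.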

Thanks,
Ian

> >
> > For huge pages I thought it was correct that things are aligned by max
> > page size which I thought on x86-64 was 2MB, so I tried:
> > EXTRA_LDFLAGS="-z max-page-size=4096"
> > but it made no difference to anything, and with:
> > EXTRA_CFLAGS="-Wl,-z,max-page-size=4096"
> > EXTRA_CXXFLAGS="-Wl,-z,max-page-size=4096"
> > file size just got worse.
>
> The default alignment to 2MB was dropped in the GNU toolchain in 2018 or
> so.
>
> -Andi