Re: [PATCH] perf tools: Fixup module symbol end address properly

From: Namhyung Kim
Date: Fri Feb 16 2024 - 00:20:15 EST


On Wed, Feb 14, 2024 at 2:14 AM Leo Yan <leo.yan@xxxxxxxxx> wrote:
>
> On Tue, Feb 13, 2024 at 10:48:53AM -0800, Namhyung Kim wrote:
> > Hi Leo,
> >
> > Thanks for your review!
> >
> > On Mon, Feb 12, 2024 at 7:40 PM Leo Yan <leo.yan@xxxxxxxxx> wrote:
> > >
> > > On Mon, Feb 12, 2024 at 03:33:22PM -0800, Namhyung Kim wrote:
> > > > I got a strange error on ARM: perf failed while processing a
> > > > FINISHED_ROUND record. It turned out that it was failing in
> > > > symbol__alloc_hist() because the symbol size is too big.
> > > >
> > > > It failed when a sample was captured on a specific BPF program. I
> > > > added some debug code and found that the end address of the symbol
> > > > comes from the next module, which is placed far away.
> > > >
> > > > ffff800008795778-ffff80000879d6d8: bpf_prog_1bac53b8aac4bc58_netcg_sock [bpf]
> > > > ffff80000879d6d8-ffff80000ad656b4: bpf_prog_76867454b5944e15_netcg_getsockopt [bpf]
> > > > ffff80000ad656b4-ffffd69b7af74048: bpf_prog_1d50286d2eb1be85_hn_egress [bpf] <---------- here
> > > > ffffd69b7af74048-ffffd69b7af74048: $x.5 [sha3_generic]
> > > > ffffd69b7af74048-ffffd69b7af740b8: crypto_sha3_init [sha3_generic]
> > > > ffffd69b7af740b8-ffffd69b7af741e0: crypto_sha3_update [sha3_generic]
> > > >
> > > > The logic in symbols__fixup_end() just uses curr->start to update
> > > > prev->end. But that won't work in this case as the addresses are too far apart.
> > > >
> > > > I think ARM has a different kernel memory layout for modules and BPF
> > > > than on x86. Actually there's a logic to handle kernel and module
> > > > boundary. Let's do the same for symbols between different modules.
> > >
> > > Even the Arm32 and Arm64 kernels have different memory layouts for
> > > modules and the kernel image.
> > >
> > > eBPF program (JITed) should be allocated from the vmalloc region, for
> > > Arm64, see bpf_jit_alloc_exec() in arch/arm64/net/bpf_jit_comp.c.
> >
> > Ok, so chances are they can end up far apart, right?
>
> Yes, this is my understanding.
>
> > > > Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> > > > ---
> > > > tools/perf/util/symbol.c | 21 +++++++++++++++++++--
> > > > 1 file changed, 19 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> > > > index 35975189999b..9ebdb8e13c0b 100644
> > > > --- a/tools/perf/util/symbol.c
> > > > +++ b/tools/perf/util/symbol.c
> > > > @@ -248,14 +248,31 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
> > > > * segment is very big. Therefore do not fill this gap and do
> > > > * not assign it to the kernel dso map (kallsyms).
> > > > *
> > > > + * Also BPF code can be allocated separately from text segments
> > > > + * and modules. So the last entry in a module should not fill
> > > > + * the gap too.
> > > > + *
> > > > * In kallsyms, it determines module symbols using '[' character
> > > > * like in:
> > > > * ffffffffc1937000 T hdmi_driver_init [snd_hda_codec_hdmi]
> > > > */
> > > > if (prev->end == prev->start) {
> > > > + const char *prev_mod;
> > > > + const char *curr_mod;
> > > > +
> > > > + if (!is_kallsyms) {
> > > > + prev->end = curr->start;
> > > > + continue;
> > > > + }
> > > > +
> > > > + prev_mod = strchr(prev->name, '[');
> > > > + curr_mod = strchr(curr->name, '[');
> > > > +
> > > > /* Last kernel/module symbol mapped to end of page */
> > > > - if (is_kallsyms && (!strchr(prev->name, '[') !=
> > > > - !strchr(curr->name, '[')))
> > > > + if (!prev_mod != !curr_mod)
> > > > + prev->end = roundup(prev->end + 4096, 4096);
> > > > + /* Last symbol in the previous module */
> > > > + else if (prev_mod && strcmp(prev_mod, curr_mod))
> > >
> > > Should two consecutive modules fall into this case? I think we need to assign
> > > 'prev->end = curr->start' for two consecutive modules.
> >
> > Yeah I thought about that case but I believe they would be on
> > separate pages (hopefully there's a page gap between them).
> > So I think it should not overlap. But if you really care we can
> > check it explicitly like this:
> >
> > prev->end = min(roundup(...), curr->start);
>
> I am not concerned about assigning a bigger end value to the 'prev'
> symbol. An exaggerated end region will not cause any
> difficulty for parsing symbols.

Right, but my problem was not in parsing. It failed to allocate
memory for the symbol because its address range is too big.
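
To put a number on it, here is a minimal sketch that computes the span of
the hn_egress entry from the dump quoted above. The helper name
symbol_span() is made up for illustration; it is not a perf function, and
how symbol__alloc_hist() turns this span into an allocation size is not
shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of perf: size in bytes of [start, end). */
static uint64_t symbol_span(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start;
}
```

For that entry, symbol_span(0xffff80000ad656b4, 0xffffd69b7af74048) is
0x569b7020e994 bytes, which is more than 86 TiB for a single symbol.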

> On the other hand, I am a bit concerned that
> for a big function (e.g. code size > 4KiB), we might fail to find
> symbols with the change above.

Yes, that's another problem. But perf cannot know the exact size here,
so it just assumes the symbol fits in a page.
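
For reference, the decision the patch makes for a zero-sized symbol can be
sketched as a standalone function. This is only an illustration under some
assumptions: struct sym, round_up_pow2() and fixup_prev_end() are
hypothetical names rather than perf code, and the page size is passed in
as a parameter instead of the hard-coded 4096 in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the few struct symbol fields used here. */
struct sym {
	uint64_t start;
	uint64_t end;
	const char *name;	/* kallsyms style, e.g. "foo [module]" */
};

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Pick an end address for prev when its size is unknown (end == start). */
static void fixup_prev_end(struct sym *prev, const struct sym *curr,
			   bool is_kallsyms, uint64_t page_size)
{
	const char *prev_mod, *curr_mod;

	if (prev->end != prev->start)
		return;		/* size already known, nothing to fix up */

	if (!is_kallsyms) {
		prev->end = curr->start;
		return;
	}

	/* The "[module]" suffix, or NULL for a core kernel symbol. */
	prev_mod = strchr(prev->name, '[');
	curr_mod = strchr(curr->name, '[');

	if (!prev_mod != !curr_mod)
		/* kernel/module boundary: assume prev ends within a page */
		prev->end = round_up_pow2(prev->end + page_size, page_size);
	else if (prev_mod && strcmp(prev_mod, curr_mod))
		/* last symbol of one module, next symbol is in another */
		prev->end = round_up_pow2(prev->end + page_size, page_size);
	else
		/* same module, or both in the kernel: fill up to curr */
		prev->end = curr->start;
}
```

With a 4096-byte page, the hn_egress entry followed by the first
sha3_generic symbol ends at 0xffff80000ad67000 instead of
0xffffd69b7af74048, while two symbols in the same module still get
prev->end = curr->start.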

>
> > > If so, we should use a specific checking for eBPF program, e.g.:
> > >
> > > else if (prev_mod && strcmp(prev_mod, curr_mod) &&
> > > (!strcmp(prev->name, "bpf") ||
> > > !strcmp(curr->name, "bpf")))
> >
> > I suspect it can happen on any module boundary so better
> > to handle it in a more general way.
>
> I don't want to introduce extra complexity here. We can apply
> the current patch as it is.

Good, can I get your Reviewed-by then? :)

>
> A side topic: the code hard-codes 4096 as the page
> size, which is not always true on Arm64 (the page size can be 4KiB,
> 16KiB or 64KiB). We should consider extending the environment to
> record the system's page size.

Sounds good. But until then, 4K would be a reasonable choice.

Thanks,
Namhyung

> >
> > >
> > > > prev->end = roundup(prev->end + 4096, 4096);
> > > > else
> > > > prev->end = curr->start;
> > > > --
> > > > 2.43.0.687.g38aa6559b0-goog
> > > >