Re: [PATCH v3 1/2] perf symbol: Correct address for bss symbols

From: Ian Rogers
Date: Sat Jul 30 2022 - 11:21:33 EST


On Sat, Jul 30, 2022 at 2:38 AM Leo Yan <leo.yan@xxxxxxxxxx> wrote:
>
> Hi Ian,
>
> On Fri, Jul 29, 2022 at 10:13:04PM -0700, Ian Rogers wrote:
>
> [...]
>
> > I am seeing a problem with this patch with jvmti. To repro:
> >
> > 1) download a Java workload dacapo-9.12-MR1-bach.jar from
> > https://sourceforge.net/projects/dacapobench/
> > 2) build perf such as "make -C tools/perf O=/tmp/perf NO_LIBBFD=1" it
> > should detect Java and create /tmp/perf/libperf-jvmti.so
> > 3) run perf with the jvmti agent:
> > /tmp/perf/perf record -k 1 java -agentpath:/tmp/perf/libperf-jvmti.so
> > -jar dacapo-9.12-MR1-bach.jar -n 10 fop
> > 4) run perf inject:
> > /tmp/perf/perf inject -i perf.data -o perf-injected.data -j
> > 5) run perf report
> > /tmp/perf/perf report -i perf-injected.data | grep org.apache.fop
> >
> > With this patch reverted I see lots of symbols like:
> > 0.00% java jitted-388040-4656.so [.]
> > org.apache.fop.fo.FObj.bind(org.apache.fop.fo.PropertyList)
> >
> > With the patch I see lots of:
> > dso__load_sym_internal: failed to find program header for symbol:
> > Lorg/apache/fop/fo/FObj;bind(Lorg/apache/fop/fo/PropertyList;)V
> > st_value: 0x40
>
> Thanks for sharing the steps, I can reproduce the issue.
>
> I tried to add more logs to dump and hope can find specific pattern for
> these symbols, one thing I observed that if a symbol fails to find
> program header, it has the same values for st_value, shdr.sh_addr and
> shdr.sh_offset: all of them are 0x40. So that means if with you
> proposed change in below, then we will get the file address is:
>
> file_addr = st_value - shdr.sh_addr + shdr.sh_offset = 0x40
>
> Seems to me this is not reasonable: perf tries to add many symbols
> with the same file address 0x40.
>
> > Combining the old and new behaviors fixes the issue for me, wdyt?
>
> So far we don't answer a question is what's the purpose for these JAVA
> symbols. I checked these symbols and concluded as:
>
> - They are not label, this is because sym.st_info is 0x2, so its
> symbol type is STT_FUNC;
> - They are from ".text" section;
> - Symbol visibility is STV_DEFAULT;
> - Symbol's section index number is 0x1, which is different from some
> special sections (STV_DEFAULT/SHN_COMMON/SHN_UNDEF/SHN_XINDEX).
>
> This is a rough summary, these symbols are likewise the normal function
> symbols, but they have special st_value (0x40) and has no matched the
> program header for them.
>
> If we rollback to use old offsets to calculate the symbol file address,
> it still is incorrect.
>
> I list all relevant symbols in: https://termbin.com/s0fb, for a reliable
> fixing, could anyone with java experience shed some lights for handling
> the symbols?
>
> On the other hand, I can accept to simply change pr_warning() to
> pr_debug4() to avoid warning flood, the log still can help us to find
> potential symbol parsing issue, so far they are not false-positive
> reporting.

Thanks, I suspect the ELF that the Java agent has created isn't good.
The Java agent is part of perf as and so is the ELF file generation
code:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/genelf.c?h=perf/core#n367
I took a quick look but most of the logic is in libelf - it seems less
obvious the bug would be there rather than perf. Could you take a
look? Ideally there'd be a quick fix that keeps all the benefits of
your change and the jvmti code working.

Thanks,
Ian


> Thanks,
> Leo
>
> > ```
> > --- a/tools/perf/util/symbol-elf.c
> > +++ b/tools/perf/util/symbol-elf.c
> > @@ -1305,16 +1305,21 @@ dso__load_sym_internal(struct dso *dso, struct
> > map *map, struct symsrc *syms
> > _ss,
> >
> > if (elf_read_program_header(syms_ss->elf,
> > (u64)sym.st_value, &phdr)) {
> > - pr_warning("%s: failed to find program
> > header for "
> > + pr_debug4("%s: failed to find program
> > header for "
> > "symbol: %s st_value: %#" PRIx64 "\n",
> > __func__, elf_name,
> > (u64)sym.st_value);
> > - continue;
> > + pr_debug4("%s: adjusting symbol:
> > st_value: %#" PRIx64 " "
> > + "sh_addr: %#" PRIx64 "
> > sh_offset: %#" PRIx64 "\n",
> > + __func__, (u64)sym.st_value,
> > (u64)shdr.sh_addr,
> > + (u64)shdr.sh_offset);
> > + sym.st_value -= shdr.sh_addr - shdr.sh_offset;
> > + } else {
> > + pr_debug4("%s: adjusting symbol:
> > st_value: %#" PRIx64 " "
> > + "p_vaddr: %#" PRIx64 "
> > p_offset: %#" PRIx64 "\n",
> > + __func__, (u64)sym.st_value,
> > (u64)phdr.p_vaddr,
> > + (u64)phdr.p_offset);
> > + sym.st_value -= phdr.p_vaddr - phdr.p_offset;
> > }
> > - pr_debug4("%s: adjusting symbol: st_value: %#"
> > PRIx64 " "
> > - "p_vaddr: %#" PRIx64 " p_offset: %#"
> > PRIx64 "\n",
> > - __func__, (u64)sym.st_value,
> > (u64)phdr.p_vaddr,
> > - (u64)phdr.p_offset);
> > - sym.st_value -= phdr.p_vaddr - phdr.p_offset;
> > }
> >
> > demangled = demangle_sym(dso, kmodule, elf_name);
> > ```
> >
> > Thanks,
> > Ian
> >
> > >
> > > demangled = demangle_sym(dso, kmodule, elf_name);
> > > --
> > > 2.25.1
> > >