perf annotate fails with "Invalid -1 error code"
From: Russell King - ARM Linux admin
Date: Mon Sep 30 2019 - 08:15:52 EST
Hi,
While using perf report on aarch64, I try to annotate
__arch_copy_to_user, and it fails with:
Error: Couldn't annotate __arch_copy_to_user: Internal error: Invalid -1 error code
which is not very helpful. Looking at the code, the error message
appended to the "Couldn't annotate ...:" comes from
symbol__strerror_disassemble(), which expects either an errno or
one of the special SYMBOL_ANNOTATE_ERRNO_* constants in its 3rd
argument.
symbol__tui_annotate() passes the 3rd argument as the return value
from symbol__annotate2(). symbol__annotate2() returns either zero or
-1. This calls symbol__annotate(), which returns -1 (which would
generally conflict with -EPERM), -errno, the return value of
arch->init, or the return value of symbol__disassemble().
This seems to be something of a mess - different places seem to use
different approaches to handling errors, and some don't bother
propagating the error code up.
The upshot is, the error message reported when trying to annotate
gives the user no clue why perf is unable to annotate, and you have
to resort to stracing perf in an attempt to find out - which also
isn't useful:
3431 pselect6(1, [0], NULL, NULL, NULL, NULL) = 1 (in [0])
3431 pselect6(5, [4], NULL, NULL, {tv_sec=10, tv_nsec=0}, NULL) = 1 (in [4], left {tv_sec=9, tv_nsec=999995480})
3431 read(4, "\r", 1) = 1
3431 uname({sysname="Linux", nodename="cex7", ...}) = 0
3431 openat(AT_FDCWD, "/usr/lib/aarch64-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = 26
3431 fstat(26, {st_mode=S_IFREG|0644, st_size=26404, ...}) = 0
3431 mmap(NULL, 26404, PROT_READ, MAP_SHARED, 26, 0) = 0x7fa1fd9000
3431 close(26) = 0
3431 futex(0x7fa172b830, FUTEX_WAKE_PRIVATE, 2147483647) = 0
3431 write(1, "\33[10;21H\33[37m\33[40m\342\224\214\342\224\200Error:\342\224"..., 522) = 522
3431 pselect6(1, [0], NULL, NULL, NULL, NULL <detached ...>
Which makes it rather difficult to know what is actually failing...
so the only way is to resort to gdb.
It seems that dso__disassemble_filename() is returning -10000, which
seems to be SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX and as described above,
this is lost due to the lack of error code propagation.
Specifically, the failing statement is:
if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS &&
!dso__is_kcore(dso))
return SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX;
Looking at "dso" shows:
kernel = DSO_TYPE_KERNEL,
symtab_type = DSO_BINARY_TYPE__KALLSYMS,
binary_type = DSO_BINARY_TYPE__KALLSYMS,
load_errno = DSO_LOAD_ERRNO__MISMATCHING_BUILDID,
name = 0x555588781c "/boot/vmlinux",
and we finally get to the reason - it's using the wrong vmlinux.
So, obvious solution (once the failure reason is known), give it
the correct vmlinux.
Should it really be necessary to resort to gdb to discover why perf
is failing?
It looks like this was introduced by ecda45bd6cfe ("perf annotate:
Introduce symbol__annotate2 method") which did this:
- err = symbol__annotate(sym, map, evsel, 0, &browser.arch);
+ err = symbol__annotate2(sym, map, evsel, &annotate_browser__opts, &browser.arch);
+int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *evsel,
+ struct annotation_options *options, struct arch **parch)
+{
...
+ err = symbol__annotate(sym, map, evsel, 0, parch);
+ if (err)
+ goto out_free_offsets;
...
+out_free_offsets:
+ zfree(¬es->offsets);
+ return -1;
+}
introducing this problem by the "return -1" disease.
So, given that this function's return value is used as an error code
in the way I've described above, should this function also be fixed
to return ENOMEM when the zalloc fails, as well as propagating the
return value from symbol__annotate() ?
I haven't yet checked to see if there's other places that call this
function but now rely on it returning -1... but I'd like to lodge a
plea that perf gets some consistency wrt how errors are passed and
propagated from one function to another.
Thanks.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up