Re: [PATCH v5 0/4] tracing: improve symbolic printing

From: Johannes Berg
Date: Mon Aug 26 2024 - 14:24:43 EST


Hi Steven,

> I finally got around to testing your patches.
>
> I did the following:
>
> # cat /sys/kernel/tracing/events/*/*/format
>
> and hit this:
>
> BUG: unable to handle page fault for address: ffffffff8e6333d0

Ugh. That's ... unfortunate.

I couldn't reproduce this so far, do you happen to have the .config and
qemu command line perhaps?


> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 183c40067 P4D 183c40067 PUD 183c41063 PMD 1003ef063 PTE 800ffffe7b9cc062
> Oops: Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 7 UID: 0 PID: 893 Comm: cat Not tainted 6.11.0-rc4-test-00004-g4ce2836f008b #56 68afcee1248519f8b3b088836c40746e4a6b69d3

I guess that's just my four patches; reproducing that (rc4 + the
patches, seems there was a small conflict in skb tracing) ...

> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> RIP: 0010:f_show (kernel/trace/trace_events.c:1601 kernel/trace/trace_events.c:1651 kernel/trace/trace_events.c:1689)

means this was


for (i = 0; i < n_sym_defs; i++) {
unsigned int sym_len;

if (!sym_defs[i]) // <--- HERE
continue;

which seems really strange to me.

Sadly the

> Code: [...]

was compiled so we don't see 'i' here, and the stop value is on the
stack...

> All code
> ========
> 0: 33 63 8e xor -0x72(%rbx),%esp
> 3: 48 2d d0 33 63 8e sub $0xffffffff8e6333d0,%rax

the big constant here must be __start_ftrace_sym_defs, being subtracted
from an already loaded __stop_ftrace_sym_defs in %rax ...

> 9: 48 c1 f8 03 sar $0x3,%rax

also part of the size calculation (n_sym_defs) - we have 8 byte pointers
in each entry, so >>= 3 makes sense.

Given RBX is 0xffffffff8e6333d0 as well, that means
__start_ftrace_sym_defs isn't a valid pointer? But then what was
__stop_ftrace_sym_defs? And why would the section be non-empty but
invalid to dereference?

> d: 85 c0 test %eax,%eax
> f: 74 67 je 0x78

Anyway this aborts if there are 0 entries so we get non-zero in
%eax/%rax ...

I've actually tried now with an empty section, and that also works OK.

Is there something special about your build here?

johannes