Re: [PATCH v5] tracing: Support to dump instance traces by ftrace_dump_on_oops

From: Huang Yiwei
Date: Thu Feb 29 2024 - 04:12:27 EST




On 2/27/2024 9:47 AM, Steven Rostedt wrote:
On Thu, 8 Feb 2024 21:18:14 +0800
Huang Yiwei <quic_hyiwei@xxxxxxxxxxx> wrote:

Currently ftrace only dumps the global trace buffer on an oops. For
debugging a production use case, an instance trace is helpful for
checking specific problems, since the global trace buffer may be used
for other purposes.

This patch extends the ftrace_dump_on_oops parameter to dump one or
more specific trace instances:

- ftrace_dump_on_oops=0: as before -- don't dump
- ftrace_dump_on_oops[=1]: as before -- dump the global trace buffer
on all CPUs
- ftrace_dump_on_oops=2 or =orig_cpu: as before -- dump the global
trace buffer on CPU that triggered the oops
- ftrace_dump_on_oops=<instance_name>: new behavior -- dump the
tracing instance matching <instance_name>
- ftrace_dump_on_oops[=2/orig_cpu],<instance1_name>[=2/orig_cpu],
<instance2_name>[=2/orig_cpu]: new behavior -- dump the global trace
buffer and multiple instance buffers on all CPUs, or only on the CPU
that triggered the oops if =2 or =orig_cpu is given

So we need to add that the syntax is:

ftrace_dump_on_oops[=[<0|1|2|orig_cpu>,][<instance_name>[=<1|2|orig_cpu>][,...]]

Yeah, this is much clearer. I will update the commit message and kernel docs in the new patchset.

Also, the sysctl node can handle the input accordingly.

Cc: Ross Zwisler <zwisler@xxxxxxxxxx>
Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Signed-off-by: Huang Yiwei <quic_hyiwei@xxxxxxxxxxx>
---
.../admin-guide/kernel-parameters.txt | 26 ++-
Documentation/admin-guide/sysctl/kernel.rst | 30 +++-
include/linux/ftrace.h | 4 +-
include/linux/kernel.h | 1 +
kernel/sysctl.c | 4 +-
kernel/trace/trace.c | 156 +++++++++++++-----
kernel/trace/trace_selftest.c | 2 +-
7 files changed, 168 insertions(+), 55 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 31b3a25680d0..3d6ea8e80c2f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1561,12 +1561,28 @@
The above will cause the "foo" tracing instance to trigger
a snapshot at the end of boot up.
- ftrace_dump_on_oops[=orig_cpu]
+ ftrace_dump_on_oops[=2(orig_cpu) | =<instance>][,<instance> |
+ ,<instance>=2(orig_cpu)]
[FTRACE] will dump the trace buffers on oops.
- If no parameter is passed, ftrace will dump
- buffers of all CPUs, but if you pass orig_cpu, it will
- dump only the buffer of the CPU that triggered the
- oops.
+ If no parameter is passed, ftrace will dump global
+ buffers of all CPUs, if you pass 2 or orig_cpu, it
+ will dump only the buffer of the CPU that triggered
+ the oops, or the specific instance will be dumped if
+ its name is passed. Multiple instance dump is also
+ supported, and instances are separated by commas. Each
+ instance supports only dump on CPU that triggered the
+ oops by passing 2 or orig_cpu to it.
+
+ ftrace_dump_on_oops=foo=orig_cpu
+
+ The above will dump only the buffer of "foo" instance
+ on CPU that triggered the oops.
+
+ ftrace_dump_on_oops,foo,bar=orig_cpu

I believe the above is incorrect. It should be:

ftrace_dump_on_oops=foo,bar=orig_cpu

And you can add here as well:

ftrace_dump_on_oops[=[<0|1|2|orig_cpu>,][<instance_name>[=<1|2|orig_cpu>][,...]]


Thanks,

--Steve

My reasoning for the original example is below; I think it is correct:
- "ftrace_dump_on_oops," means global buffer on all CPUs
- "foo," means foo instance on all CPUs
- "bar=orig_cpu" means bar instance on CPU that triggered the oops.

I'm trying to make the example cover more possibilities.
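
To make the two readings concrete, here is a minimal userspace sketch of how a string in this form could be split into per-buffer dump requests. It is not the code from the patch: the names dump_target, parse_mode and parse_dump_on_oops are hypothetical, and it assumes the parser is handed whatever text follows "ftrace_dump_on_oops" on the command line (empty, starting with ',' or starting with '='). Treating an instance with "=0" as an error is also an assumption.

```c
#include <string.h>

#define MAX_TARGETS 8
#define MAX_NAME    64

enum dump_mode { DUMP_NONE = 0, DUMP_ALL = 1, DUMP_ORIG = 2 };

/* Hypothetical record: one buffer to dump and how to dump it. */
struct dump_target {
	char name[MAX_NAME];	/* "" stands for the global buffer */
	enum dump_mode mode;
};

/* Map "0", "1", "2" or "orig_cpu" to a mode; -1 means "not a mode". */
static int parse_mode(const char *s)
{
	if (!strcmp(s, "0"))
		return DUMP_NONE;
	if (!strcmp(s, "1"))
		return DUMP_ALL;
	if (!strcmp(s, "2") || !strcmp(s, "orig_cpu"))
		return DUMP_ORIG;
	return -1;
}

/*
 * Parse the text following "ftrace_dump_on_oops".
 * Returns the number of targets written to out[], or -1 on error.
 */
static int parse_dump_on_oops(const char *arg, struct dump_target *out, int max)
{
	char buf[256];
	char *tok, *next;
	int n = 0;
	int first = 1;	/* only the first token may select the global buffer */

	if (*arg == '=') {
		arg++;
	} else if (*arg == '\0' || *arg == ',') {
		/* Bare form, optionally followed by ",<instance>...":
		 * the global buffer is dumped on all CPUs. */
		if (n >= max)
			return -1;
		out[n].name[0] = '\0';
		out[n].mode = DUMP_ALL;
		n++;
		first = 0;
		if (*arg == ',')
			arg++;
	} else {
		return -1;
	}

	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);

	for (tok = buf; tok && *tok; tok = next) {
		char *eq;
		int mode = DUMP_ALL;

		/* Cut the current comma-separated token out of buf. */
		next = strchr(tok, ',');
		if (next)
			*next++ = '\0';

		eq = strchr(tok, '=');
		if (first && !eq && parse_mode(tok) >= 0) {
			/* Leading 0/1/2/orig_cpu applies to the global buffer. */
			mode = parse_mode(tok);
			first = 0;
			if (mode == DUMP_NONE)
				continue;
			if (n >= max)
				return -1;
			out[n].name[0] = '\0';
			out[n].mode = mode;
			n++;
			continue;
		}
		first = 0;

		if (eq) {
			/* <instance>=<1|2|orig_cpu> */
			*eq++ = '\0';
			mode = parse_mode(eq);
			if (mode != DUMP_ALL && mode != DUMP_ORIG)
				return -1;
		}
		if (*tok == '\0' || strlen(tok) >= MAX_NAME || n >= max)
			return -1;
		strcpy(out[n].name, tok);
		out[n].mode = mode;
		n++;
	}
	return n;
}
```

Under these assumptions, ",foo,bar=orig_cpu" yields three targets (global on all CPUs, foo on all CPUs, bar on the oops CPU), while "=foo,bar=orig_cpu" yields only the two instances and leaves the global buffer out.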

Regards,
Huang Yiwei
+
+ The above will dump global buffer on all CPUs, the
+ buffer of "foo" instance on all CPUs and the buffer
+ of "bar" instance on CPU that triggered the oops.
ftrace_filter=[function-list]