Re: [PATCH] tracing: Fix trace_pipe_raw read panic

From: Steven Rostedt
Date: Tue Aug 01 2017 - 12:23:08 EST



Hmm, I applied this without taking too much of a look at it. I'm going
to modified it a little (sorry, I'm about to take another trip and need
to get this tested before I leave).

Comments below...


On Mon, 24 Jul 2017 15:21:06 +0800
Chunyu Hu <chuhu@xxxxxxxxxx> wrote:

> per_cpu trace directories and files are created for all possible cpus,
> but only the cpus which have ever been on-lined have their own per cpu
> ring buffer (allocated by cpuhp threads). While trace_buffers_open, the
> open handler for trace file 'trace_pipe_raw' is always trying to access
> field of ring_buffer_per_cpu, and would panic with the NULL pointer.
>
> Align the behavior of trace_pipe_raw with trace_pipe, that returns -NODEV
> when openning it if that cpu does not have trace ring buffer.
>
> Reproduce:
> cat /sys/kernel/debug/tracing/per_cpu/cpu31/trace_pipe_raw
> (cpu31 is never on-lined, this is a 16 cores x86_64 box)
>
> Tested with:
> 1) boot with maxcpus=14, read trace_pipe_raw of cpu15.
> Got -NODEV.
> 2) oneline cpu15, read trace_pipe_raw of cpu15.
> Get the raw trace data.
>
> Call trace:
> [ 5760.950995] RIP: 0010:ring_buffer_alloc_read_page+0x32/0xe0
> [ 5760.961678] tracing_buffers_read+0x1f6/0x230
> [ 5760.962695] __vfs_read+0x37/0x160
> [ 5760.963498] ? __vfs_read+0x5/0x160
> [ 5760.964339] ? security_file_permission+0x9d/0xc0
> [ 5760.965451] ? __vfs_read+0x5/0x160
> [ 5760.966280] vfs_read+0x8c/0x130
> [ 5760.967070] SyS_read+0x55/0xc0
> [ 5760.967779] do_syscall_64+0x67/0x150
> [ 5760.968687] entry_SYSCALL64_slow_path+0x25/0x25
>
> Signed-off-by: Chunyu Hu <chuhu@xxxxxxxxxx>
> ---
> kernel/trace/ring_buffer.c | 5 +++++
> kernel/trace/trace.c | 14 +++++++++++++-
> 2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 4ae268e..66c109e 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3002,6 +3002,11 @@ int ring_buffer_write(struct ring_buffer *buffer,
> }
> EXPORT_SYMBOL_GPL(ring_buffer_write);
>
> +bool rb_per_cpu_allocated(struct ring_buffer *buffer, int cpu)
> +{
> + return !!cpumask_test_cpu(cpu, buffer->cpumask);

The "!!" is not needed. The compiler will convert it for you.

> +}
> +
> static bool rb_per_cpu_empty(struct ring_buffer_per_cpu *cpu_buffer)
> {
> struct buffer_page *reader = cpu_buffer->reader_page;
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 42b9355..508a1ca 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -6542,11 +6542,16 @@ static int snapshot_raw_open(struct inode *inode, struct file *filp)
>
> #endif /* CONFIG_TRACER_SNAPSHOT */
>
> +extern bool rb_per_cpu_allocated(struct ring_buffer *buffer, int cpu);
> +

This must be in a header file.


> static int tracing_buffers_open(struct inode *inode, struct file *filp)
> {
> struct trace_array *tr = inode->i_private;
> + struct trace_buffer *tb = &tr->trace_buffer;

Other places use the variable "trace_buf" for this. Need to stay
consistent.

> + struct ring_buffer *buffer = tb->buffer;

Actually, why even have that variable? It's not used anywhere else.

> struct ftrace_buffer_info *info;
> int ret;
> + int cpu_file;
>
> if (tracing_disabled)
> return -ENODEV;
> @@ -6554,6 +6559,13 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
> if (trace_array_get(tr) < 0)
> return -ENODEV;

Not to mention, the tr can not be dereferenced until after the above
get.

I'll make the changes, and I'll keep you as author.

Thanks!

-- Steve

>
> + cpu_file = tracing_get_cpu(inode);
> +
> + /* Make sure, ring buffer for this cpu is allocated. */
> + if (cpu_file != RING_BUFFER_ALL_CPUS &&
> + !rb_per_cpu_allocated(buffer, cpu_file))
> + return -ENODEV;
> +
> info = kzalloc(sizeof(*info), GFP_KERNEL);
> if (!info) {
> trace_array_put(tr);
> @@ -6563,7 +6575,7 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
> mutex_lock(&trace_types_lock);
>
> info->iter.tr = tr;
> - info->iter.cpu_file = tracing_get_cpu(inode);
> + info->iter.cpu_file = cpu_file;
> info->iter.trace = tr->current_trace;
> info->iter.trace_buffer = &tr->trace_buffer;
> info->spare = NULL;