Re: [PATCH] tracing: Fix trace_pipe_raw read panic
From: Chunyu Hu
Date: Wed Aug 02 2017 - 07:03:58 EST
Thanks for the comments and review!
On 2 August 2017 at 00:22, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> Hmm, I applied this without taking too much of a look at it. I'm going
> to modified it a little (sorry, I'm about to take another trip and need
> to get this tested before I leave).
>
> Comments below...
>
>
> On Mon, 24 Jul 2017 15:21:06 +0800
> Chunyu Hu <chuhu@xxxxxxxxxx> wrote:
>
>> per_cpu trace directories and files are created for all possible cpus,
>> but only the cpus which have ever been on-lined have their own per cpu
>> ring buffer (allocated by cpuhp threads). While trace_buffers_open, the
>> open handler for trace file 'trace_pipe_raw' is always trying to access
>> field of ring_buffer_per_cpu, and would panic with the NULL pointer.
>>
>> Align the behavior of trace_pipe_raw with trace_pipe, that returns -NODEV
>> when openning it if that cpu does not have trace ring buffer.
>>
>> Reproduce:
>> cat /sys/kernel/debug/tracing/per_cpu/cpu31/trace_pipe_raw
>> (cpu31 is never on-lined, this is a 16 cores x86_64 box)
>>
>> Tested with:
>> 1) boot with maxcpus=14, read trace_pipe_raw of cpu15.
>> Got -NODEV.
>> 2) oneline cpu15, read trace_pipe_raw of cpu15.
>> Get the raw trace data.
>>
>> Call trace:
>> [ 5760.950995] RIP: 0010:ring_buffer_alloc_read_page+0x32/0xe0
>> [ 5760.961678] tracing_buffers_read+0x1f6/0x230
>> [ 5760.962695] __vfs_read+0x37/0x160
>> [ 5760.963498] ? __vfs_read+0x5/0x160
>> [ 5760.964339] ? security_file_permission+0x9d/0xc0
>> [ 5760.965451] ? __vfs_read+0x5/0x160
>> [ 5760.966280] vfs_read+0x8c/0x130
>> [ 5760.967070] SyS_read+0x55/0xc0
>> [ 5760.967779] do_syscall_64+0x67/0x150
>> [ 5760.968687] entry_SYSCALL64_slow_path+0x25/0x25
>>
>> Signed-off-by: Chunyu Hu <chuhu@xxxxxxxxxx>
>> ---
>> kernel/trace/ring_buffer.c | 5 +++++
>> kernel/trace/trace.c | 14 +++++++++++++-
>> 2 files changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>> index 4ae268e..66c109e 100644
>> --- a/kernel/trace/ring_buffer.c
>> +++ b/kernel/trace/ring_buffer.c
>> @@ -3002,6 +3002,11 @@ int ring_buffer_write(struct ring_buffer *buffer,
>> }
>> EXPORT_SYMBOL_GPL(ring_buffer_write);
>>
>> +bool rb_per_cpu_allocated(struct ring_buffer *buffer, int cpu)
>> +{
>> + return !!cpumask_test_cpu(cpu, buffer->cpumask);
>
> The "!!" is not needed. The compiler will convert it for you.
OK. After researching, agreed.
>
>> +}
>> +
>> static bool rb_per_cpu_empty(struct ring_buffer_per_cpu *cpu_buffer)
>> {
>> struct buffer_page *reader = cpu_buffer->reader_page;
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index 42b9355..508a1ca 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -6542,11 +6542,16 @@ static int snapshot_raw_open(struct inode *inode, struct file *filp)
>>
>> #endif /* CONFIG_TRACER_SNAPSHOT */
>>
>> +extern bool rb_per_cpu_allocated(struct ring_buffer *buffer, int cpu);
>> +
>
> This must be in a header file.
OK. I hope that won't change symver.
>
>
>> static int tracing_buffers_open(struct inode *inode, struct file *filp)
>> {
>> struct trace_array *tr = inode->i_private;
>> + struct trace_buffer *tb = &tr->trace_buffer;
>
> Other places use the variable "trace_buf" for this. Need to stay
> consistent.
Agreed.
>
>> + struct ring_buffer *buffer = tb->buffer;
>
> Actually, why even have that variable? It's not used anywhere else.
Agreed.
>
>> struct ftrace_buffer_info *info;
>> int ret;
>> + int cpu_file;
>>
>> if (tracing_disabled)
>> return -ENODEV;
>> @@ -6554,6 +6559,13 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
>> if (trace_array_get(tr) < 0)
>> return -ENODEV;
>
> Not to mention, the tr can not be dereferenced until after the above
> get.
Ah, my bad. I missed this.
>
> I'll make the changes, and I'll keep you as author.
Thanks.
>
> Thanks!
>
> -- Steve
>
>>
>> + cpu_file = tracing_get_cpu(inode);
>> +
>> + /* Make sure, ring buffer for this cpu is allocated. */
>> + if (cpu_file != RING_BUFFER_ALL_CPUS &&
>> + !rb_per_cpu_allocated(buffer, cpu_file))
>> + return -ENODEV;
>> +
>> info = kzalloc(sizeof(*info), GFP_KERNEL);
>> if (!info) {
>> trace_array_put(tr);
>> @@ -6563,7 +6575,7 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
>> mutex_lock(&trace_types_lock);
>>
>> info->iter.tr = tr;
>> - info->iter.cpu_file = tracing_get_cpu(inode);
>> + info->iter.cpu_file = cpu_file;
>> info->iter.trace = tr->current_trace;
>> info->iter.trace_buffer = &tr->trace_buffer;
>> info->spare = NULL;
>