Re: [PATCH] trace: Fix race in trace_open and buffer resize call

From: Steven Rostedt
Date: Tue Sep 15 2020 - 14:34:45 EST


On Tue, 15 Sep 2020 22:53:32 +0530
Gaurav Kohli <gkohli@xxxxxxxxxxxxxx> wrote:

> On 9/15/2020 6:53 PM, Steven Rostedt wrote:
> > On Tue, 15 Sep 2020 10:38:03 +0530
> > Gaurav Kohli <gkohli@xxxxxxxxxxxxxx> wrote:
> >
> >>
> >> >>> +void ring_buffer_mutex_release(struct trace_buffer *buffer)
> >> >>> +{
> >> >>> + mutex_unlock(&buffer->mutex);
> >> >>> +}
> >> >>> +EXPORT_SYMBOL_GPL(ring_buffer_mutex_release);
> >> >
> >> > I really do not like to export these.
> >> >
> >>
> >> Actually available reader lock is not helping
> >> here(&cpu_buffer->reader_lock), So i took ring buffer mutex lock to
> >> resolve this(this came on 4.19/5.4), in latest tip it is trace buffer
> >> lock. Due to this i have exported api.
> >
> > I'm saying, why don't you take the buffer->mutex in the
> > ring_buffer_reset_online_cpus() function? And remove all the protection in
> > tracing_reset_online_cpus()?
>
> Yes, got your point. then we can avoid export. Actually we are seeing
> issue in older kernel like 4.19/4.14/5.4 and there below patch was not
> present in stable branches:
>
> ommit b23d7a5f4a07 ("ring-buffer: speed up buffer resets by
> > avoiding synchronize_rcu for each CPU")

If you mark this patch for stable, you can add:

Depends-on: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")

>
> Actually i have also thought to take mutex lock in ring_buffer_reset_cpu
> while doing individual cpu reset, but this could cause another problem:

Hmm, I think we should also take the buffer lock in the reset_cpu() call
too, and modify tracing_reset_cpu() the same way.

>
> Different cpu buffer may have different state, so i have taken lock in
> tracing_reset_online_cpus.

Why would different states be an issue in synchronizing?

-- Steve

> >
> > void tracing_reset_online_cpus(struct array_buffer *buf)
> > {
> > struct trace_buffer *buffer = buf->buffer;
> >
> > if (!buffer)
> > return;
> >
> > buf->time_start = buffer_ftrace_now(buf, buf->cpu);
> >
> > ring_buffer_reset_online_cpus(buffer);
> > }
> >
> > The reset_online_cpus() is already doing the synchronization, we don't need
> > to do it twice.
> >
> > I believe commit b23d7a5f4a07 ("ring-buffer: speed up buffer resets by
> > avoiding synchronize_rcu for each CPU") made the synchronization in
> > tracing_reset_online_cpus() obsolete.
> >
> > -- Steve
> >
>
> Yes, with above patch no need to take lock in tracing_reset_online_cpus.