Re: [PATCH -tip] Re: [PATCH -tip 0/2] Temporary RCU fixes fornotrace and hotplug CPU

From: Paul E. McKenney
Date: Tue Aug 25 2009 - 21:31:44 EST


On Wed, Aug 26, 2009 at 08:53:20AM +0800, Lai Jiangshan wrote:
> Paul E. McKenney wrote:
> > On Tue, Aug 25, 2009 at 02:02:33PM -0400, Mathieu Desnoyers wrote:
> >> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> >>> On Tue, Aug 25, 2009 at 12:25:49PM -0400, Mathieu Desnoyers wrote:
> >>>> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> >>>>> On Tue, Aug 25, 2009 at 10:00:47AM +0200, Ingo Molnar wrote:
> >>>>>> btw., i'm still seeing crashes with the latest RCU bits:
> >>>>>>
> >>>>>> [ 20.621740] Testing event sys_enter_futex: OK
> >>>>>> [ 20.629738] Testing event sys_exit_futex: OK
> >>>>>> [ 20.637737] Testing event lock_acquire: [reboot]
> >>>>>>
> >>>>>> Possibly due to infinite recursion as well. Config attached.
> >>>>> Color me confused...
> >>>>>
> >>>>> Unless someone has a better idea, I will send in a patch that adds
> >>>>> "notrace" to every RCU API member used by any file in the kernel
> >>>>> that has "trace" in its name (excluding ptrace.c and rcutree_trace.c,
> >>>>> of course). This list is as follows:
> >>>>>
> >>>>> call_rcu()
> >>>>> call_rcu_sched()
> >>>>> rcu_read_lock()
> >>>>> rcu_read_unlock()
> >>>>>
> >>>>> So, any better ideas?
> >>>> Tracers using RCU should use the _notrace() version of read_lock/unlock.
> >>>> I think the callers should be fixed rather than RCU.
> >>>>
> >>>> Tracepoints have been designed to use the _notrace variant on the
> >>>> instrumentation site. The core of tracepoint management use
> >>>> call_rcu_sched(), which can be traced without any problem.
> >>>>
> >>>> I have not followed the late tracing development as closely though, so
> >>>> errors might have crept in.
> >>> Or I might have inadvertently broken something in a non-obvious (to me,
> >>> anyway) manner.
> >>>
> >>> So, would you be willing to look at commit bc33f24bd in the -tip
> >>> tree and see if there is anything I broke other than the now-fixed
> >>> rcu_read_lock_sched_notrace() and rcu_read_unlock_sched_notrace()?
> >>> And for that matter, whether my alleged fix for these two API members
> >>> really does fix the problem (-tip commit 7c614d6461)?
> >>>
> >>> The -tip tree is at:
> >>>
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git
> >>>
> >>> And these patches are on the tip/core/rcu branch.
> >>>
> >> sure, here we go:
> >>
> >> static inline void rcu_read_lock_sched_notrace(void)
> >> {
> >> preempt_disable_notrace();
> >> + __acquire(RCU_SCHED);
> >> + rcu_read_acquire();
> >> }
> >>
> >> and
> >>
> >>
> >> static inline void rcu_read_unlock_sched_notrace(void)
> >> {
> >> + rcu_read_release();
> >> + __release(RCU_SCHED);
> >> preempt_enable_notrace();
> >> }
> >>
> >> will make those _notrace primitives call into lockdep. I don't think
> >> this is correct, and this might be causing your problem.
> >>
> >> rcu_read_acquire/release are calling lock_acquire/release, those should
> >> be removed.
> >>
> >> __acquire() simply seems to be defined to a gcc "context" attribute,
> >> probably for the sparse checker. I think it should be safe to leave them
> >> there.
> >
> > Thank you for looking this over, Mathieu!!! Please see below for patch.
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > Remove lockdep annotations from RCU's _notrace() API members.
> >
> > The lockdep annotations rcu_read_acquire() and rcu_read_release()
> > might lead to infinite looping if called from lockdep. So this patch
> > removes them.
> >
> > Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> >
> > rcupdate.h | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 8b4422c..95e0615 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -195,7 +195,6 @@ static inline notrace void rcu_read_lock_sched_notrace(void)
> > {
> > preempt_disable_notrace();
> > __acquire(RCU_SCHED);
> > - rcu_read_acquire();
> > }
> >
> > /*
> > @@ -211,7 +210,6 @@ static inline void rcu_read_unlock_sched(void)
> > }
> > static inline notrace void rcu_read_unlock_sched_notrace(void)
> > {
> > - rcu_read_release();
> > __release(RCU_SCHED);
> > preempt_enable_notrace();
> > }
>
> I remembered I have suggested this ...

Ah!!! I understood the "add notrace" suggestion, but was confused into
thinking that "notrace" was going to prevent it from being called the
second time.

> Reviewed-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>

Thank you!!!

Thanx, Paul

> Lai Jiangshan wrote:
> >
> >> {
> >> preempt_disable_notrace();
> >> + __acquire(RCU_SCHED);
> >> + rcu_read_acquire();
> >> }
> >>
> >
> > It may cause infinity recursion.
> > rcu_read_acquire() calls rcu_read_lock_sched_notrace()
> > before current->lockdep_recursion is set to 1 when tracing in on,
> > thus infinity recursion occurs.
> >
> > Lai
> >
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/