Re: [PATCH -tip 0/2] Temporary RCU fixes for notrace and hotplugCPU

From: Mathieu Desnoyers
Date: Tue Aug 25 2009 - 14:02:42 EST


* Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> On Tue, Aug 25, 2009 at 12:25:49PM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > > On Tue, Aug 25, 2009 at 10:00:47AM +0200, Ingo Molnar wrote:
> > > >
> > > > btw., i'm still seeing crashes with the latest RCU bits:
> > > >
> > > > [ 20.621740] Testing event sys_enter_futex: OK
> > > > [ 20.629738] Testing event sys_exit_futex: OK
> > > > [ 20.637737] Testing event lock_acquire: [reboot]
> > > >
> > > > Possibly due to infinite recursion as well. Config attached.
> > >
> > > Color me confused...
> > >
> > > Unless someone has a better idea, I will send in a patch that adds
> > > "notrace" to every RCU API member used by any file in the kernel
> > > that has "trace" in its name (excluding ptrace.c and rcutree_trace.c,
> > > of course). This list is as follows:
> > >
> > > call_rcu()
> > > call_rcu_sched()
> > > rcu_read_lock()
> > > rcu_read_unlock()
> > >
> > > So, any better ideas?
> >
> > Tracers using RCU should use the _notrace() version of read_lock/unlock.
> > I think the callers should be fixed rather than RCU.
> >
> > Tracepoints have been designed to use the _notrace variant on the
> > instrumentation site. The core of tracepoint management use
> > call_rcu_sched(), which can be traced without any problem.
> >
> > I have not followed the late tracing development as closely though, so
> > errors might have crept in.
>
> Or I might have inadvertently broken something in a non-obvious (to me,
> anyway) manner.
>
> So, would you be willing to look at commit bc33f24bd in the -tip
> tree and see if there is anything I broke other than the now-fixed
> rcu_read_lock_sched_notrace() and rcu_read_unlock_sched_notrace()?
> And for that matter, whether my alleged fix for these two API members
> really does fix the problem (-tip commit 7c614d6461)?
>
> The -tip tree is at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git
>
> And these patches are on the tip/core/rcu branch.
>

sure, here we go:

static inline void rcu_read_lock_sched_notrace(void)
{
preempt_disable_notrace();
+ __acquire(RCU_SCHED);
+ rcu_read_acquire();
}

and


static inline void rcu_read_unlock_sched_notrace(void)
{
+ rcu_read_release();
+ __release(RCU_SCHED);
preempt_enable_notrace();
}

will make those _notrace primitives call into lockdep. I don't think
this is correct, and this might be causing your problem.

rcu_read_acquire/release are calling lock_acquire/release, those should
be removed.

__acquire() simply seems to be defined to a gcc "context" attribute,
probably for the sparse checker. I think it should be safe to leave them
there.

Mathieu

> Thanx, Paul

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/