Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
From: Mathieu Desnoyers
Date: Mon Aug 16 2010 - 15:19:54 EST
* Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> On Mon, Aug 16, 2010 at 11:07:37AM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > [...]
> > > +
> > > +/*
> > > + * Tiny-preemptible RCU implementation for rcu_read_unlock().
> > > + * Decrement ->rcu_read_lock_nesting. If the result is zero (outermost
> > > + * rcu_read_unlock()) and ->rcu_read_unlock_special is non-zero, then
> > > + * invoke rcu_read_unlock_special() to clean up after a context switch
> > > + * in an RCU read-side critical section and other special cases.
> > > + */
> > > +void __rcu_read_unlock(void)
> > > +{
> > > +	struct task_struct *t = current;
> > > +
> > > +	barrier();  /* needed if we ever invoke rcu_read_unlock in rcutiny.c */
> > > +	if (--t->rcu_read_lock_nesting == 0 &&
> > > +	    unlikely(t->rcu_read_unlock_special))
>
> First, thank you for looking this over!!!
>
> > Hrm I think we discussed this in a past life, but would the following
> > sequence be possible and correct ?
> >
> > CPU 0
> >
> > read t->rcu_read_unlock_special
> > interrupt comes in, preempts. sets t->rcu_read_unlock_special
> > <preempted>
> > <scheduled back>
> > iret
> > decrement and read t->rcu_read_lock_nesting
> > test both the old "special" value (which we kept locally on the stack)
> > and the fact that rcu_read_lock_nesting is now 0.
> >
> > We actually missed a reschedule.
> >
> > I think we might need a barrier() between the t->rcu_read_lock_nesting
> > and t->rcu_read_unlock_special reads.
>
> You are correct -- I got too aggressive in eliminating synchronization.
>
> Good catch!!!
>
> I added an ACCESS_ONCE() to the second term of the "if" condition so
> that it now reads:
>
> 	if (--t->rcu_read_lock_nesting == 0 &&
> 	    unlikely(ACCESS_ONCE(t->rcu_read_unlock_special)))
>
> This prevents the compiler from reordering because the ACCESS_ONCE()
> prohibits accessing t->rcu_read_unlock_special unless the value of
> t->rcu_read_lock_nesting is known to be zero.
Hrm, --t->rcu_read_lock_nesting has no side effect that the compiler must
treat as globally visible, so the compiler is still free to reorder that
update across the rcu_read_unlock_special access. I think we need the
ACCESS_ONCE() around the t->rcu_read_lock_nesting access too.
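Something like the following as-if transformation (purely illustrative; I
am not claiming any particular compiler emits this today) is what I am
worried about when only the second access is volatile:

	int nesting = t->rcu_read_lock_nesting - 1;	/* plain load */
	int special = 0;

	if (nesting == 0)
		special = ACCESS_ONCE(t->rcu_read_unlock_special);
	t->rcu_read_lock_nesting = nesting;	/* plain store, sunk below
						 * the volatile load */
	if (nesting == 0 && unlikely(special))
		rcu_read_unlock_special(t);

If an interrupt preempts us between the volatile load and the deferred
store, the handler still sees the old nesting count, can set
t->rcu_read_unlock_special, and the stale "special" value we kept locally
makes us skip rcu_read_unlock_special() entirely.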
>
> > We might need to audit
> > TREE PREEMPT RCU for the same kind of behavior.
>
> The version of __rcu_read_unlock() in kernel/rcutree_plugin.h is as
> follows:
>
> void __rcu_read_unlock(void)
> {
> 	struct task_struct *t = current;
>
> 	barrier();  /* needed if we ever invoke rcu_read_unlock in rcutree.c */
> 	if (--ACCESS_ONCE(t->rcu_read_lock_nesting) == 0 &&
> 	    unlikely(ACCESS_ONCE(t->rcu_read_unlock_special)))
This seems to work because we have:

  volatile access (read/update t->rcu_read_lock_nesting)
  && (sequence point)
  volatile access (t->rcu_read_unlock_special)

The C standard seems to forbid reordering of volatile accesses across
sequence points, so this should be fine. But it would probably be good
to document this implied ordering explicitly.
> 		rcu_read_unlock_special(t);
> #ifdef CONFIG_PROVE_LOCKING
> 	WARN_ON_ONCE(ACCESS_ONCE(t->rcu_read_lock_nesting) < 0);
> #endif /* #ifdef CONFIG_PROVE_LOCKING */
> }
>
> The ACCESS_ONCE() calls should cover this. I believe that the first
> ACCESS_ONCE() is redundant, and checking this more closely is on my
> todo list.
I doubt it; see the explanation above.
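As a strawman for that documentation, something like this (comment wording
is mine, not from the patch) would make the implied ordering explicit and
also argue for keeping the first ACCESS_ONCE():

	barrier();  /* needed if we ever invoke rcu_read_unlock in rcutree.c */
	/*
	 * Both fields are accessed with volatile semantics and the "&&"
	 * is a sequence point, so the compiler may not reorder the
	 * ->rcu_read_unlock_special load before the update of
	 * ->rcu_read_lock_nesting.  Dropping the first ACCESS_ONCE() would
	 * turn the decrement back into a plain access that the compiler is
	 * free to move around the volatile load.
	 */
	if (--ACCESS_ONCE(t->rcu_read_lock_nesting) == 0 &&
	    unlikely(ACCESS_ONCE(t->rcu_read_unlock_special)))
		rcu_read_unlock_special(t);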
>
> > But I might be (again ?) missing something. I've got the feeling you
> > already convinced me that this was OK for some reason, but I trip on
> > this every time I read the code.
> >
> > [...]
> >
> > > +/*
> > > + * Check for a task exiting while in a preemptible RCU read-side
> > > + * critical section, clean up if so. No need to issue warnings,
> > > + * as debug_check_no_locks_held() already does this if lockdep
> > > + * is enabled.
> > > + */
> > > +void exit_rcu(void)
> > > +{
> > > +	struct task_struct *t = current;
> > > +
> > > +	if (t->rcu_read_lock_nesting == 0)
> > > +		return;
> > > +	t->rcu_read_lock_nesting = 1;
> > > +	rcu_read_unlock();
> > > +}
> > > +
> >
> > The interaction with preemption is unclear here. exit.c disables
> > preemption around the call to exit_rcu(), but if, for some reason,
> > rcu_read_unlock_special was set earlier by preemption, then the
> > rcu_read_unlock() code might block and cause problems.
>
> But rcu_read_unlock_special() does not block. In fact, it disables
> interrupts over almost all of its execution. Or am I missing some
> subtlety here?
I am probably the one who was missing a subtlety about how
rcu_read_unlock_special() works.
>
> > Maybe we should consider clearing rcu_read_unlock_special here ?
>
> If the task blocked in an RCU read-side critical section just before
> exit_rcu() was called, we need to remove the task from the ->blkd_tasks
> list. If we fail to do so, we might get a segfault later on. Also,
> we do need to handle any RCU_READ_UNLOCK_NEED_QS requests from the RCU
> core.
>
> So I really do like the current approach of calling rcu_read_unlock()
> to do this sort of cleanup.
It looks good then; I just wanted to make sure the side effects of
calling rcu_read_unlock() in this code path had been fully thought
through.
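For the record, the cleanup chain I had been overlooking is roughly the
following (an illustrative sketch of the flow, not the patch code itself):

	/*
	 * Task exits with rcu_read_lock_nesting > 0; it may also have
	 * blocked earlier in the critical section, in which case
	 * ->rcu_read_unlock_special is already set.
	 *
	 *	exit_rcu()
	 *	  sets t->rcu_read_lock_nesting = 1 (force outermost unlock)
	 *	  calls rcu_read_unlock()
	 *	    __rcu_read_unlock() sees the nesting count reach 0 and
	 *	    calls rcu_read_unlock_special(t), which runs with irqs
	 *	    disabled, handles any RCU_READ_UNLOCK_NEED_QS request,
	 *	    and removes the task from the ->blkd_tasks list so no
	 *	    later grace period references the exiting task.
	 */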
Thanks,
Mathieu
>
> Thanx, Paul
>
> > Thanks,
> >
> > Mathieu
> >
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com