Re: [PATCH tip/core/rcu 02/18] rcu: Move rcu_report_exp_rnp() to allow consolidation
From: Paul E. McKenney
Date: Wed Oct 07 2015 - 10:33:38 EST
On Wed, Oct 07, 2015 at 09:51:14AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 06, 2015 at 01:58:50PM -0700, Paul E. McKenney wrote:
> > On Tue, Oct 06, 2015 at 10:29:37PM +0200, Peter Zijlstra wrote:
> > > On Tue, Oct 06, 2015 at 09:29:21AM -0700, Paul E. McKenney wrote:
> > > > +static void __maybe_unused rcu_report_exp_rnp(struct rcu_state *rsp,
> > > > +					      struct rcu_node *rnp, bool wake)
> > > > +{
> > > > +	unsigned long flags;
> > > > +	unsigned long mask;
> > > > +
> > > > +	raw_spin_lock_irqsave(&rnp->lock, flags);
> > >
> > > Normally we require a comment with barriers, explaining the order and
> > > the pairing etc.. :-)
> > >
> > > > +	smp_mb__after_unlock_lock();
> >
> > Hmmmm... That is not good.
> >
> > Worse yet, I am missing comments on most of the pre-existing barriers
> > of this form.
>
> Yes I noticed.. :/
Will fix, though probably as a follow-up patch. Once I figure out what
comment makes sense...
> > The purpose is to enforce the heavy-weight grace-period memory-ordering
> > guarantees documented in the synchronize_sched() header comment and
> > elsewhere.
>
> > They pair with anything you might use to check for violation
> > of these guarantees, or, similarly, any ordering that you might use when
> > relying on these guarantees.
>
> I'm sure you know what that means, but I've no clue ;-) That is, I
> wouldn't know where to start looking in the RCU implementation to verify
> the barrier is either needed or sufficient. Unless you mean _everywhere_
> :-)
Pretty much everywhere.
Let's take the usual RCU removal pattern as an example:
void f1(struct foo *p)
{
	list_del_rcu(&p->next);
	synchronize_rcu_expedited();
	kfree(p);
}

void f2(void)
{
	struct foo *p;

	list_for_each_entry_rcu(p, &my_head, next)
		do_something_with(p);
}
So the synchronize_rcu_expedited() acts as an extremely heavyweight
memory barrier that pairs with the rcu_dereference() inside of
list_for_each_entry_rcu(). Easy enough, right?
But what exactly within synchronize_rcu_expedited() provides the
ordering? The answer is a web of lock-based critical sections and
explicit memory barriers, with the one you called out as needing
a comment being one of them.
> > I could add something like "/* Enforce GP memory ordering. */"
> >
> > Or perhaps "/* See synchronize_sched() header. */"
> >
> > I do not propose reproducing the synchronize_sched() header on each
> > of these. That would be verbose, even for me! ;-)
> >
> > Other thoughts?
>
> Well, this is an UNLOCK+LOCK on non-matching lock variables upgrade to
> full barrier thing, right?
Yep!
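To make that concrete with the function this patch moves:
rcu_report_exp_rnp() walks up the rcu_node tree, dropping one node's
lock and acquiring its parent's. A rough sketch (control flow
simplified; only the lock/barrier sequence matters here):

```c
	/* Walking up the rcu_node tree, as rcu_report_exp_rnp() does: */
	raw_spin_unlock_irqrestore(&rnp->lock, flags);	/* UNLOCK of the child... */
	rnp = rnp->parent;
	raw_spin_lock_irqsave(&rnp->lock, flags);	/* ...LOCK of a different lock */
	smp_mb__after_unlock_lock();	/*
					 * Upgrades the UNLOCK+LOCK sequence on
					 * these non-matching lock variables to a
					 * full memory barrier, so everything done
					 * while holding the child's lock is
					 * ordered before everything done after
					 * this point.  That full barrier is one
					 * link in the chain that lets the grace
					 * period pair with a reader's
					 * rcu_dereference().
					 */
```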
> To me it's not clear which UNLOCK we even match here. I've just read the
> sync_sched() header, but that doesn't help me, so referring to it isn't
> really helpful either.
Usually this pairs with an rcu_dereference() somewhere in the calling
code, typically one executed by some other task.
> In any case, I don't want to make too big a fuzz here, but I just
> stumbled over a lot of unannotated barriers and figured I ought to say
> something about it.
I do need to better document how this works, no two ways about it.
Thanx, Paul