Re: [PATCH tip/core/rcu 02/18] rcu: Move rcu_report_exp_rnp() to allow consolidation
From: Peter Zijlstra
Date: Thu Oct 08 2015 - 05:49:52 EST
On Wed, Oct 07, 2015 at 09:48:58AM -0700, Paul E. McKenney wrote:
> > Some implementation choice requires this barrier upgrade -- and in
> > another email I suggest it's the whole tree thing, we need to firmly
> > establish the state of one level before propagating the state up etc.
> >
> > Now I'm not entirely sure this is fully correct, but it's the best I
> > could come up with.
>
> It is pretty close. Ignoring dyntick idle for the moment, things
> go (very) roughly like this:
>
> o The RCU grace-period kthread notices that a new grace period
> is needed. It initializes the tree, which includes acquiring
> every rcu_node structure's ->lock.
>
> o CPU A notices that there is a new grace period. It acquires
> the ->lock of its leaf rcu_node structure, which forces full
> ordering against the grace-period kthread.
If the kthread took _all_ rcu_node locks, then this does not require the
barrier upgrade because they will share a lock variable.
> o Some time later, that CPU A realizes that it has passed
> through a quiescent state, and again acquires its leaf rcu_node
> structure's ->lock, again enforcing full ordering, but this
> time against all CPUs corresponding to this same leaf rcu_node
> structure that previously noticed quiescent states for this
> same grace period. Also against all prior readers on this
> same CPU.
This again reads like the same lock variable is involved, and therefore
the barrier upgrade is not required for this.
> o Some time later, CPU B (corresponding to that same leaf
> rcu_node structure) is the last of that leaf's group of CPUs
> to notice a quiescent state. It has also acquired that leaf's
> ->lock, again forcing ordering against its prior RCU read-side
> critical sections, but also against all the prior RCU
> read-side critical sections of all other CPUs corresponding
> to this same leaf.
Same lock variable again.
> o CPU B therefore moves up the tree, acquiring the parent
> rcu_node structures' ->lock. In so doing, it forces full
> ordering against all prior RCU read-side critical sections
> of all CPUs corresponding to all leaf rcu_node structures
> subordinate to the current (non-leaf) rcu_node structure.
And here we iterate the tree and get another lock variable involved; here
the barrier upgrade will actually do something.
> o And so on, up the tree.
Same here.
> o When CPU C reaches the root of the tree, and realizes that
> it is the last CPU to report a quiescent state for the
> current grace period, its acquisition of the root rcu_node
> structure's ->lock has forced full ordering against all
> RCU read-side critical sections that started before this
> grace period -- on all CPUs.
Right, which makes the full-barrier transitivity thing important.
> CPU C therefore awakens the grace-period kthread.
> o When the grace-period kthread wakes up, it does cleanup,
> which (you guessed it!) requires acquiring the ->lock of
> each rcu_node structure. This not only forces full ordering
> against each pre-existing RCU read-side critical section,
> it also sets up things so that...
Again, if it takes _all_ rcu_nodes, it also shares a lock variable and
hence the upgrade is not required.
> o When CPU D notices that the grace period ended, it does so
> while holding its leaf rcu_node structure's ->lock. This
> forces full ordering against all relevant RCU read-side
> critical sections. This ordering prevails when CPU D later
> starts invoking RCU callbacks.
This also does not seem to require the upgrade.
> Hey, you asked!!! ;-)
No, I asked what the barrier upgrade was for; most of the above does
not seem to rely on it at all.
The only place this upgrade matters is the UNLOCK x + LOCK y scenario,
as also per the comment above smp_mb__after_unlock_lock().
Any other ordering comes from the other primitives, not from this one,
and is irrelevant to the barrier upgrade.
> Again, this is a cartoon-like view of the ordering that leaves out a
> lot of details, but it should get across the gist of the ordering.
So the ordering I'm interested in is the bit that is provided by the
barrier upgrade, and that seems very limited and directly pertains to
the tree iteration, ensuring it's fully separated and transitive.
So I'll stick to the explanation that the barrier upgrade is purely for
the tree iteration, to separate and make transitive the tree level state.
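
To make the UNLOCK x + LOCK y case concrete, here is a rough userspace
sketch of the tree walk, not the actual kernel code. Everything in it is
an assumption made for illustration: struct node, node_init(),
report_qs(), and the toy acquire/release spinlock are hypothetical
stand-ins for rcu_node and its ->lock, and a C11 seq_cst fence stands in
for smp_mb__after_unlock_lock() (which in the kernel is a no-op on
architectures where UNLOCK + LOCK already implies a full barrier).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for struct rcu_node. */
struct node {
	atomic_flag lock;	/* toy spinlock playing the role of ->lock */
	unsigned long qsmask;	/* children that still owe a quiescent state */
	unsigned long grpmask;	/* this node's bit in its parent's qsmask */
	struct node *parent;	/* NULL at the root */
};

/* Acquire-only lock and release-only unlock, i.e. no full barrier. */
static void node_lock(struct node *np)
{
	while (atomic_flag_test_and_set_explicit(&np->lock,
						 memory_order_acquire))
		;	/* spin */
}

static void node_unlock(struct node *np)
{
	atomic_flag_clear_explicit(&np->lock, memory_order_release);
}

static void node_init(struct node *np, struct node *parent,
		      unsigned long grpmask, unsigned long qsmask)
{
	atomic_flag_clear(&np->lock);
	np->qsmask = qsmask;
	np->grpmask = grpmask;
	np->parent = parent;
}

/*
 * Clear @mask at @np and, each time a level becomes empty, move up one
 * level.  Returns 1 if this report emptied the root, 0 otherwise.
 */
static int report_qs(struct node *np, unsigned long mask)
{
	for (;;) {
		node_lock(np);			/* LOCK y (after the first pass) */
		/*
		 * Assumed stand-in for smp_mb__after_unlock_lock(): upgrade
		 * the preceding UNLOCK x + this LOCK y to a full barrier.
		 */
		atomic_thread_fence(memory_order_seq_cst);
		np->qsmask &= ~mask;
		if (np->qsmask != 0 || np->parent == NULL) {
			int done = (np->qsmask == 0);

			node_unlock(np);
			return done;
		}
		mask = np->grpmask;
		node_unlock(np);		/* UNLOCK x */
		np = np->parent;		/* different lock variable next */
	}
}
```

Reports that stay within one leaf only ever touch a single lock
variable, so the fence buys nothing there. It only matters on the
iterations where the loop drops one node's lock and takes its parent's.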