On Tue, Jun 17, 2014 at 10:54 AM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Jun 16, 2014 at 10:55:29PM -0400, Pranith Kumar wrote:
>> This might sound really naive, but please bear with me.
>>
>> force_quiescent_state() used to do a lot of things in the past in
>> addition to forcing a quiescent state. (In my reading of the mailing
>> list I found state transitions, for one.)
>>
>> Now, according to the code, multiple callers try to go up the
>> hierarchy of rcu_node structures to see who reaches the root node.
>> The caller that reaches the root node wins: it acquires the root
>> node's lock and gets to set rsp->gp_flags! At each level of the
>> hierarchy we try to acquire ->fqslock; this is the only place that
>> actually uses ->fqslock.
>>
>> I guess this was being done to avoid contention on ->fqslock, but all
>> we are doing here is setting one flag. This way of acquiring locks
>> might reduce contention if every updater were doing some independent
>> work, but here all of them are setting the same flag to the same
>> value.
>
> The trick is that the "losers" at each level of ->fqslock acquisition
> go away. The "winner" ends up doing the real work of setting
> RCU_GP_FLAG_FQS.
>> We can also remove ->fqslock completely if we do not need this. Also,
>> using cmpxchg() to set the value of the flag looks like a good idea
>> to avoid taking the root node lock. Thoughts?
>
> The ->fqslock funnel was needed to avoid lockups on large systems
> (many hundreds or even thousands of CPUs). Moving grace-period
> responsibilities from softirq to the grace-period kthreads might have
> reduced contention sufficiently to make the ->fqslock funnel
> unnecessary. However, given that I don't usually have access to such
> a large system, I will leave it, at least for the time being.
>
> Sounds like a good case study for using the newly introduced MCS-based
> locks (qspinlock.h).
>
> Waiman, Peter?
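
For concreteness, the cmpxchg() idea I mentioned above would be
something like this sketch (again userspace C with stand-in names; the
real code would also have to wake the grace-period kthread, which I am
ignoring here):

	#include <stdatomic.h>
	#include <stdbool.h>

	#define GP_FLAG_FQS 0x2			/* stand-in for RCU_GP_FLAG_FQS */

	static atomic_uint gp_flags;		/* stands in for rsp->gp_flags */

	/* Set the FQS bit exactly once, without taking the root node lock. */
	static bool set_fqs_flag(void)
	{
		unsigned int old = atomic_load(&gp_flags);

		do {
			if (old & GP_FLAG_FQS)
				return false;	/* someone already set it */
		} while (!atomic_compare_exchange_weak(&gp_flags, &old,
						       old | GP_FLAG_FQS));
		return true;			/* we set it */
	}

One caveat I can already see: even a lock-free cmpxchg makes every
contending CPU pull the same cache line exclusive, so this does nothing
for the memory contention that the funnel comment in the code talks
about.
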
Actually, I see now that it was to reduce contention on rnp_root->lock,
not on ->fqslock.

Btw, is doing the following a bad idea? It reduces contention on
rnp_root->lock by using ->fqslock, which seems to be the lock that
needs to be taken while forcing a quiescent state:
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f1ba773..f5a0e7e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2401,34 +2401,24 @@ static void force_quiescent_state(struct rcu_state *rsp)
 	unsigned long flags;
 	bool ret;
 	struct rcu_node *rnp;
-	struct rcu_node *rnp_old = NULL;
-
-	/* Funnel through hierarchy to reduce memory contention. */
-	rnp = per_cpu_ptr(rsp->rda, raw_smp_processor_id())->mynode;
-	for (; rnp != NULL; rnp = rnp->parent) {
-		ret = (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) ||
-		      !raw_spin_trylock(&rnp->fqslock);
-		if (rnp_old != NULL)
-			raw_spin_unlock(&rnp_old->fqslock);
-		if (ret) {
-			ACCESS_ONCE(rsp->n_force_qs_lh)++;
-			return;
-		}
-		rnp_old = rnp;
+	struct rcu_node *rnp_root = rcu_get_root(rsp);
+
+	if (!raw_spin_trylock(&rnp_root->fqslock)) {
+		ACCESS_ONCE(rsp->n_force_qs_lh)++;
+		return; /* Someone is already trying to force */
 	}