[PATCH 19/24] jrcu: bugfix: init cpu wait state on every scan

From: Joe Korty
Date: Thu Mar 24 2011 - 13:49:04 EST


jrcu: re-init cpu wait state on every scan, not just at
scans that mark the beginning of a batch.

This fixes a hard-to-hit bug. To have a chance of hitting
it, all of these conditions must hold: a cpu is running a
user application 100% of the time, the application makes
no system calls, and no interrupts of any type are
delivered to that cpu.

jrcu is designed to allow transitioning values of every
description to be fuzzy for a while before settling down.
Therefore, if a batch ends (and a new one starts) at about
the time a cpu is transitioning from a normal state to the
above-mentioned user-dedicated state, the value that cpu's
->wait state is set to will be somewhat random. That is,
most of the time it will be correct, but on occasion it
will take the opposite value. This is OK and expected,
but for things to work we must periodically re-sample and
re-init the ->wait state, so that a later scan picks up
the sampled value after it has become stable.

Without periodic re-sampling we could set ->wait = 1 when
it should be 0, and once it is 1 it will stay 1 (because a
user-dedicated cpu crosses no quiescent point taps, which
by definition would set ->wait = 0). JRCU thus stops
advancing batches until the watchdog fires and tickles
the offending cpu.

Signed-off-by: Joe Korty <joe.korty@xxxxxxxx>

Index: b/kernel/jrcu.c
===================================================================
--- a/kernel/jrcu.c
+++ b/kernel/jrcu.c
@@ -319,8 +319,11 @@ static void __rcu_delimit_batches(struct
 	for_each_online_cpu(cpu) {
 		rd = &rcu_data[cpu];
 		if (rd->wait) {
-			eob = 0;
-			break;
+			rd->wait = preempt_count_cpu(cpu) > idle_cpu(cpu);
+			if (rd->wait) {
+				eob = 0;
+				break;
+			}
 		}
 	}
