[PATCH RFC nohz_full 4/8] nohz_full: Add per-CPU idle-state tracking for NMIs

From: Paul E. McKenney
Date: Tue Jun 25 2013 - 17:38:29 EST


From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>

It turns out that we can reuse RCU's ->dynticks counter to identify
CPUs that are non-idle due to NMIs from idle, in combination with the
new full-system idle ->dynticks_idle counter. The reason this works
can be seen from the following table:

                            ->dynticks   ->dynticks_idle   union

NMI from idle:              non-idle     idle              non-idle
NMI from user:              non-idle     non-idle          non-idle
NMI from non-idle kernel:   non-idle     non-idle          non-idle
idle:                       idle         idle              idle
user:                       idle         non-idle          non-idle
non-idle kernel:            non-idle     non-idle          non-idle

Note that the final "union" column gets us what we need: A non-idle
indication in all cases except when the CPU really is in the idle loop.
(But what about interrupt handlers? They are treated the same as
non-idle kernel.)

Therefore, if both ->dynticks and ->dynticks_idle say that the corresponding
CPU is idle (in other words, both have even values), then the CPU really
is idle.
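
As a rough illustration of the "union" column, the check might be sketched
as follows. This is only a sketch, not code from this series: the struct
and function names are hypothetical stand-ins, and it assumes (going by the
"& 0x1" tests in the diff below) that an odd counter value means non-idle.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two per-CPU counters discussed above;
 * this is not the kernel's struct rcu_dynticks.
 */
struct cpu_counters {
	int dynticks;		/* even: idle from RCU's viewpoint, odd: non-idle */
	int dynticks_idle;	/* even: in the idle loop, odd: non-idle */
};

/*
 * "Union" column of the table: the CPU counts as non-idle if either
 * counter is odd, so it is really idle only when both are even.
 */
static bool cpu_really_idle(const struct cpu_counters *c)
{
	return !(c->dynticks & 0x1) && !(c->dynticks_idle & 0x1);
}
```

For example, an NMI from idle leaves ->dynticks_idle even but makes
->dynticks odd, so the sketch reports non-idle for that CPU, matching
the first row of the table.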

The only additional thing that this commit needs to supply is the time
that the last NMI either started or ended for the corresponding CPU.
This is used to determine whether or not this CPU has been idle long
enough to justify updating the global full-system idle state.
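
One way the recorded timestamp could later be consumed is sketched below.
This is an assumption-laden illustration, not part of this patch: the
helper name and the threshold are hypothetical, and the comparison simply
follows the wraparound-tolerant style of the kernel's time_after() family.

```c
#include <stdbool.h>

/* Hypothetical threshold in jiffies; this patch does not define one. */
#define SYSIDLE_DELAY_JIFFIES 10UL

/*
 * Return true if at least SYSIDLE_DELAY_JIFFIES have passed since the
 * last NMI entry or exit recorded in nmi_jiffies.  The unsigned
 * subtraction followed by a signed comparison tolerates the jiffies
 * counter wrapping around between the two samples.
 */
static bool idle_long_enough(unsigned long now, unsigned long nmi_jiffies)
{
	return (long)(now - nmi_jiffies) >= (long)SYSIDLE_DELAY_JIFFIES;
}
```

Here "now" would be a fresh reading of jiffies taken by whichever CPU is
evaluating the full-system idle state, and "nmi_jiffies" the value stored
in ->dynticks_nmi_jiffies for the CPU being examined.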

Final caveat: This approach assumes that NMI handlers do not access
system time, an assumption that the existing dyntick-idle code also
makes. To see this, suppose that the system has been idle for an
extended period of time, so that the clock values are obsolete, and
that an NMI arrives. The NMI handler has no safe way to update the
clock values, and thus must do without.

Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
---
kernel/rcutree.c | 7 +++++--
kernel/rcutree.h | 1 +
kernel/rcutree_plugin.h | 9 +++++++++
3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index c814ce1..02b879a 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -607,6 +607,7 @@ void rcu_nmi_enter(void)
 	    (atomic_read(&rdtp->dynticks) & 0x1))
 		return;
 	rdtp->dynticks_nmi_nesting++;
+	rcu_sysidle_nmi_jiffies(rdtp);
 	smp_mb__before_atomic_inc(); /* Force delay from prior write. */
 	atomic_inc(&rdtp->dynticks);
 	/* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
@@ -625,8 +626,10 @@ void rcu_nmi_exit(void)
 {
 	struct rcu_dynticks *rdtp = &__get_cpu_var(rcu_dynticks);

-	if (rdtp->dynticks_nmi_nesting == 0 ||
-	    --rdtp->dynticks_nmi_nesting != 0)
+	if (rdtp->dynticks_nmi_nesting == 0)
+		return;
+	rcu_sysidle_nmi_jiffies(rdtp);
+	if (--rdtp->dynticks_nmi_nesting != 0)
 		return;
 	/* CPUs seeing atomic_inc() must see prior RCU read-side crit sects */
 	smp_mb__before_atomic_inc(); /* See above. */
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index a56d1f1..11d7144 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -557,6 +557,7 @@ static void rcu_kick_nohz_cpu(int cpu);
static bool init_nocb_callback_list(struct rcu_data *rdp);
static void rcu_sysidle_enter(struct rcu_dynticks *rdtp, int irq);
static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq);
+static void rcu_sysidle_nmi_jiffies(struct rcu_dynticks *rdtp);
static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp);

#endif /* #ifndef RCU_TREE_NONCORE */
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index b704979..a00d5c9 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -2450,6 +2450,11 @@ static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq)
 	WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks_idle) & 0x1));
 }

+static inline void rcu_sysidle_nmi_jiffies(struct rcu_dynticks *rdtp)
+{
+	rdtp->dynticks_nmi_jiffies = jiffies;
+}
+
 /*
  * Initialize dynticks sysidle state for CPUs coming online.
  */
@@ -2468,6 +2473,10 @@ static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq)
{
}

+static inline void rcu_sysidle_nmi_jiffies(struct rcu_dynticks *rdtp)
+{
+}
+
static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
{
}
--
1.8.1.5
