tick/sched: Make jiffies update quick check more robust

From: Thomas Gleixner
Date: Fri Dec 04 2020 - 05:56:15 EST


The quick check in tick_do_update_jiffies64() whether jiffies need to be
updated is not really correct under all circumstances and on all
architectures, especially not on 32bit systems.

The quick check does:

if (now < READ_ONCE(tick_next_period))
return;

and the counterpart in the update is:

WRITE_ONCE(tick_next_period, next_update_time);

This has two problems:

1) On weakly ordered architectures there is no guarantee that the stores
before the WRITE_ONCE() are visible which means that other CPUs can
operate on a stale jiffies value.

2) On 32bit the store of tick_next_period which is an u64 is split into
two 32bit stores. If the first 32bit store advances tick_next_period
far out and the second 32bit store is delayed (virt, NMI ...) then
jiffies will become stale until the second 32bit store happens.

Address this by seperating the handling for 32bit and 64bit.

On 64bit problem #1 is addressed by replacing READ_ONCE() / WRITE_ONCE()
with smp_load_acquire() / smp_store_release().

On 32bit problem #2 is addressed by protecting the quick check with the
jiffies sequence counter. The load and stores can be plain because the
sequence count mechanics provides the required barriers already.

Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
Applies on tip timers/core
---
kernel/time/tick-sched.c | 74 +++++++++++++++++++++++++++++------------------
1 file changed, 47 insertions(+), 27 deletions(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -57,36 +57,42 @@ static ktime_t last_jiffies_update;
static void tick_do_update_jiffies64(ktime_t now)
{
unsigned long ticks = 1;
- ktime_t delta;
+ ktime_t delta, nextp;

/*
- * Do a quick check without holding jiffies_lock. The READ_ONCE()
+ * 64bit can do a quick check without holding jiffies lock and
+ * without looking at the sequence count. The smp_load_acquire()
* pairs with the update done later in this function.
*
- * This is also an intentional data race which is even safe on
- * 32bit in theory. If there is a concurrent update then the check
- * might give a random answer. It does not matter because if it
- * returns then the concurrent update is already taking care, if it
- * falls through then it will pointlessly contend on jiffies_lock.
- *
- * Though there is one nasty case on 32bit due to store tearing of
- * the 64bit value. If the first 32bit store makes the quick check
- * return on all other CPUs and the writing CPU context gets
- * delayed to complete the second store (scheduled out on virt)
- * then jiffies can become stale for up to ~2^32 nanoseconds
- * without noticing. After that point all CPUs will wait for
- * jiffies lock.
- *
- * OTOH, this is not any different than the situation with NOHZ=off
- * where one CPU is responsible for updating jiffies and
- * timekeeping. If that CPU goes out for lunch then all other CPUs
- * will operate on stale jiffies until it decides to come back.
+ * 32bit cannot do that because the store of tick_next_period
+ * consists of two 32bit stores and the first store could move it
+ * to a random point in the future.
*/
- if (ktime_before(now, READ_ONCE(tick_next_period)))
- return;
+ if (IS_ENABLED(CONFIG_64BIT)) {
+ if (ktime_before(now, smp_load_acquire(&tick_next_period)))
+ return;
+ } else {
+ unsigned int seq;
+
+ /*
+ * Avoid contention on jiffies_lock and protect the quick
+ * check with the sequence count.
+ */
+ do {
+ seq = read_seqcount_begin(&jiffies_seq);
+ nextp = tick_next_period;
+ } while (read_seqcount_retry(&jiffies_seq, seq));
+
+ if (ktime_before(now, nextp))
+ return;
+ }

- /* Reevaluate with jiffies_lock held */
+ /* Quick check failed, i.e. update is required. */
raw_spin_lock(&jiffies_lock);
+ /*
+ * Reevaluate with the lock held. Another CPU might have done the
+ * update already.
+ */
if (ktime_before(now, tick_next_period)) {
raw_spin_unlock(&jiffies_lock);
return;
@@ -112,11 +118,25 @@ static void tick_do_update_jiffies64(kti
jiffies_64 += ticks;

/*
- * Keep the tick_next_period variable up to date. WRITE_ONCE()
- * pairs with the READ_ONCE() in the lockless quick check above.
+ * Keep the tick_next_period variable up to date.
*/
- WRITE_ONCE(tick_next_period,
- ktime_add_ns(last_jiffies_update, TICK_NSEC));
+ nextp = ktime_add_ns(last_jiffies_update, TICK_NSEC);
+
+ if (IS_ENABLED(CONFIG_64BIT)) {
+ /*
+ * Pairs with smp_load_acquire() in the lockless quick
+ * check above and ensures that the update to jiffies_64 is
+ * not reordered vs. the store to tick_next_period, neither
+ * by the compiler nor by the CPU.
+ */
+ smp_store_release(&tick_next_period, nextp);
+ } else {
+ /*
+ * A plain store is good enough on 32bit as the quick check
+ * above is protected by the sequence count.
+ */
+ tick_next_period = nextp;
+ }

/*
* Release the sequence count. calc_global_load() below is not