[PATCH] timer: read jiffies once when forwarding base clk

From: Li RongQing
Date: Thu Sep 19 2019 - 08:11:42 EST


The call trace below was reported. The cause is that a timer was
delayed by more than 3 seconds:

Hardware name: New H3C Technologies Co.,Ltd. UniServer R4950 G3/RS41R4950, BIOS 2.00.06 V700R003
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0xee/0x100
...
Call Trace:
process_one_work+0x18c/0x3a0
worker_thread+0x30/0x380
? process_one_work+0x3a0/0x3a0
kthread+0x113/0x130
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x22/0x40
---[ end trace 41bd884127493e39 ]---

A test program that measures timer latency reproduces the issue:

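static unsigned long set_time;	/* expected expiry time, in jiffies */

static void sched_l_tick(struct timer_list *t)
{
	/* How late the callback ran relative to the expected expiry. */
	unsigned long delta = jiffies - set_time;

	if (delta > 3*HZ)
		printk("abnormal %lu %d\n", delta, raw_smp_processor_id());

	/* Re-arm the timer one second from now. */
	set_time = jiffies+HZ;
	mod_timer(t, jiffies + HZ);
}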

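Further investigation shows that jiffies may change while
collect_expired_timers() is forwarding the base clk. Because jiffies is
read more than once, base->clk may end up later than the true next
event, and the timer that should fire is skipped. Read jiffies once so
that every comparison and the assignment use the same value.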
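
As a rough illustration of the race described above, the user-space
sketch below models the forwarding logic with a simulated counter. It is
not kernel code: sim_jiffies, sim_read_jiffies(), sim_time_after() and
both forward_*() helpers are hypothetical names, and the assumption that
a tick lands between every pair of reads is a deliberately pessimistic
model chosen to make the effect deterministic.

```c
/* Hypothetical stand-in for the kernel's jiffies counter. */
static unsigned long sim_jiffies;

/*
 * Every read returns the current value and then advances the counter by
 * one, modelling the worst case where a tick lands between two reads.
 */
static unsigned long sim_read_jiffies(void)
{
	return sim_jiffies++;
}

/* Wraparound-safe "a is after b", same idea as the kernel's time_after(). */
static int sim_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Pre-patch shape: the counter is re-read at each use. Returns new clk. */
static unsigned long forward_multi_read(unsigned long clk, unsigned long next)
{
	if ((long)(sim_read_jiffies() - clk) > 2) {
		if (sim_time_after(next, sim_read_jiffies()))
			return sim_read_jiffies();	/* may already be >= next */
		return next;
	}
	return clk;
}

/* Post-patch shape: one snapshot feeds every comparison. Returns new clk. */
static unsigned long forward_single_read(unsigned long clk, unsigned long next)
{
	unsigned long jnow = sim_read_jiffies();

	if ((long)(jnow - clk) > 2) {
		if (sim_time_after(next, jnow))
			return jnow;		/* always before next here */
		return next;
	}
	return clk;
}
```

With sim_jiffies starting at 110, a base clk of 100 and the next timer
at 112, forward_multi_read() returns 112: the check saw 111, but the
assignment saw 112, so the call site's increment would move the base
past the pending timer. forward_single_read() returns 110 for the same
inputs, which keeps the base behind the next expiry.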

Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
Signed-off-by: Liang ZhiCheng <liangzhicheng@xxxxxxxxx>
---
kernel/time/timer.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 343c7ba33b1c..e2dbd0223635 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1593,24 +1593,27 @@ void timer_clear_idle(void)
 static int collect_expired_timers(struct timer_base *base,
 				  struct hlist_head *heads)
 {
+	unsigned long jnow;
+
+	jnow = READ_ONCE(jiffies);
 	/*
 	 * NOHZ optimization. After a long idle sleep we need to forward the
 	 * base to current jiffies. Avoid a loop by searching the bitfield for
 	 * the next expiring timer.
 	 */
-	if ((long)(jiffies - base->clk) > 2) {
+	if ((long)(jnow - base->clk) > 2) {
 		unsigned long next = __next_timer_interrupt(base);

 		/*
 		 * If the next timer is ahead of time forward to current
 		 * jiffies, otherwise forward to the next expiry time:
 		 */
-		if (time_after(next, jiffies)) {
+		if (time_after(next, jnow)) {
 			/*
 			 * The call site will increment base->clk and then
 			 * terminate the expiry loop immediately.
 			 */
-			base->clk = jiffies;
+			base->clk = jnow;
 			return 0;
 		}
 		base->clk = next;
--
2.16.2