Re: [PATCH 4.12 178/196] sched/cputime: Accumulate vtime on top of nsec clocksource

From: Mel Gorman
Date: Wed Jul 26 2017 - 10:21:30 EST


I noticed relatively late and it hadn't occured on my own testing. I can
send it as a separate patch but I wanted to highlight that it would affect
a 4.12.4 release.

---8<---
sched/cputime: Don't use smp_processor_id() in preemptible context

commit 0e4097c3354e2f5a5ad8affd9dc7f7f7d00bb6b9 upstream

Recent kernels trigger this warning:

BUG: using smp_processor_id() in preemptible [00000000] code: 99-trinity/181
caller is debug_smp_processor_id+0x17/0x19
CPU: 0 PID: 181 Comm: 99-trinity Not tainted 4.12.0-01059-g2a42eb9 #1
Call Trace:
dump_stack+0x82/0xb8
check_preemption_disabled()
debug_smp_processor_id()
vtime_delta()
task_cputime()
thread_group_cputime()
thread_group_cputime_adjusted()
wait_consider_task()
do_wait()
SYSC_wait4()
do_syscall_64()
entry_SYSCALL64_slow_path()

As Frederic pointed out:

| Although those sched_clock_cpu() things seem to only matter when the
| sched_clock() is unstable. And that stability is a condition for nohz_full
| to work anyway. So probably sched_clock() alone would be enough.

This patch fixes it by replacing sched_clock_cpu() with sched_clock() to
avoid calling smp_processor_id() in a preemptible context.

Reported-by: Xiaolong Ye <xiaolong.ye@xxxxxxxxx>
Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Luiz Capitulino <lcapitulino@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1499586028-7402-1-git-send-email-wanpeng.li@xxxxxxxxxxx
[ Prettified the changelog. ]
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Signed-off-by: Mel Gorman <mgorman@xxxxxxx>

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 3dafea579263..ad275b25e3f9 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -683,7 +683,7 @@ static u64 vtime_delta(struct vtime *vtime)
{
unsigned long long clock;

- clock = sched_clock_cpu(smp_processor_id());
+ clock = sched_clock();
if (clock < vtime->starttime)
return 0;

@@ -814,7 +814,7 @@ void arch_vtime_task_switch(struct task_struct *prev)

write_seqcount_begin(&vtime->seqcount);
vtime->state = VTIME_SYS;
- vtime->starttime = sched_clock_cpu(smp_processor_id());
+ vtime->starttime = sched_clock();
write_seqcount_end(&vtime->seqcount);
}

@@ -826,7 +826,7 @@ void vtime_init_idle(struct task_struct *t, int cpu)
local_irq_save(flags);
write_seqcount_begin(&vtime->seqcount);
vtime->state = VTIME_SYS;
- vtime->starttime = sched_clock_cpu(cpu);
+ vtime->starttime = sched_clock();
write_seqcount_end(&vtime->seqcount);
local_irq_restore(flags);
}