[PATCH] sched/cputime: Fix fold steal time into idle time accounting

From: Wanpeng Li
Date: Thu Aug 11 2016 - 01:49:41 EST


From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>

Commit:

57430218317 ("sched/cputime: Count actually elapsed irq & softirq time")

... didn't take steal time into consideration with passing the noirqtime
kernel parameter.

As Paolo pointed out before:

| Why not? If idle=poll, for example, any time the guest is suspended (and
| thus cannot poll) does count as stolen time.

This patch fixes it by reducing steal time from idle time accounting when
the noirqtime is true. The average idle time drops from 56.8% to 54.75%
for nohz idle kvm guest(noirqtime, idle=poll, four vCPUs running on one
pCPU).

Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Radim <rkrcmar@xxxxxxxxxx>
Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
---
kernel/sched/cputime.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 1934f65..8b9bcc5 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -508,13 +508,20 @@ void account_process_tick(struct task_struct *p, int user_tick)
*/
void account_idle_ticks(unsigned long ticks)
{
-
+ cputime_t cputime, steal;
if (sched_clock_irqtime) {
irqtime_account_idle_ticks(ticks);
return;
}

- account_idle_time(jiffies_to_cputime(ticks));
+ cputime = cputime_one_jiffy;
+ steal = steal_account_process_time(cputime);
+
+ if (steal >= cputime)
+ return;
+
+ cputime -= steal;
+ account_idle_time(cputime);
}

/*
--
1.9.1