Re: [PATCH] sched/cputime: Fix steal time accounting vs. cpu hotplug

From: Rik van Riel
Date: Fri Mar 04 2016 - 10:15:12 EST


On Fri, 2016-03-04 at 15:59 +0100, Thomas Gleixner wrote:
> On cpu hotplug the steal time accounting can keep a stale
> rq->prev_steal_time value over cpu down and up. So after the cpu comes up
> again the delta calculation in steal_account_process_tick() wrecks itself
> due to the unsigned math:
>
>         u64 steal = paravirt_steal_clock(smp_processor_id());
>
>         steal -= this_rq()->prev_steal_time;
>
> So if steal is smaller than rq->prev_steal_time we end up with an insanely
> large value which then gets added to rq->prev_steal_time, resulting in
> permanent wreckage of the accounting. As a consequence the per cpu stats in
> /proc/stat become stale.
>
> Nice trick to tell the world how idle the system is (100%) while the cpu is
> 100% busy running tasks. Though we prefer realistic numbers.
>
> None of the accounting values which use a previous value to account for
> fractions is reset at cpu hotplug time. update_rq_clock_task() has a sanity
> check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
> deals with clock warps and limits the /proc/stat visible wreckage. The
> prev_time values are still wrong.
>
> Solution is simple: Reset rq->prev_*_time when the cpu is plugged in again.
>
> Fixes: commit e6e6685accfa "KVM guest: Steal time accounting"
> Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time"
> Fixes: commit aa483808516c "sched: Remove irq time from available CPU power"
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx

Acked-by: Rik van Riel <riel@xxxxxxxxxx>

--
All Rights Reversed.
