Re: [PATCH] time,virt: resync steal time when guest & host lose sync

From: Rik van Riel
Date: Sun Aug 14 2016 - 04:59:34 EST


On Sat, 2016-08-13 at 10:42 +0200, Ingo Molnar wrote:
> * Rik van Riel <riel@xxxxxxxxxx> wrote:
>
> > On Wed, 10 Aug 2016 07:39:08 +0800
> > Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
> >
> > > The regression is caused by your commit "sched,time: Count
> > > actually
> > > elapsed irq & softirq time".
> >
> > Wanpeng, does this patch fix your issue?
> >
> > Paolo, what is your opinion on this issue?
> >
> > I can think of all kinds of ways in which guest and host might lose
> > sync with steal time, from uninitialized values at boot, to guest
> > pause, followed by save to disk, and reload, to live migration,
> > to...
> >
> > ---8<---
> >
> > Subject: time,virt: resync steal time when guest & host lose sync
> >
> > When guest and host wildly disagree on steal time, a guest can
> > do several things:
> > 1) Quickly account all the steal time at once (the kernel did this
> > before
> >    57430218317e ("sched/cputime: Count actually elapsed irq &
> > softirq time"),
> >    when steal_account_process_ticks got ULONG_MAX as its maximum
> > value).
> > 2) Stay out of sync for an indeterminate amount of time. This is
> > what the
> >    system does today.
> > 3) Sync up the guest value to the host-provided value, without
> > accounting
> >    an absurdly large value in the cpu time statistics.
> >
> > This patch makes the kernel do (3), which seems like the right
> > thing
> > to do.
> >
> > The exact value of the threshold used probably does not matter too
> > much,
> > as long as it is long enough to cover all the timer ticks that
> > passed
> > during an idle period, because (irqtime_)account_idle_ticks can
> > process
> > a large amount of time all at once.
> >
> > Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
> > Reported-by: Wanpeng Li <kernellwp@xxxxxxxxx>
> > ---
> >  kernel/sched/cputime.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
>
> fails to build on x86 allnoconfig:
>
>   kernel/sched/cputime.c:524:10: error: too many arguments to
> function ‘steal_account_process_time’

Which patch did you apply? The subject and comment
of the email suggest you tried applying the one
Paolo and Frederic objected to.

The compile error suggests you applied the patch with the
subject "time,virt: do not limit steal_account_process_time"

In that case, did you apply Wanpeng's patch that adds an
additional call site for steal_account_process_time?

I do not have that patch in my tree yet, and one additional
line of change will be needed.
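
For anyone following the thread, option (3) from the quoted changelog can be sketched in plain user-space C. This is only an illustration of the idea, not the patch itself: struct steal_state, account_steal() and RESYNC_THRESHOLD_NS are hypothetical names, and the threshold value is an arbitrary assumption.

```c
#include <stdint.h>

/* Hypothetical threshold; NOT the value used in the patch. */
#define RESYNC_THRESHOLD_NS (32ULL * 1000000000ULL)

/* Hypothetical per-cpu bookkeeping for this sketch. */
struct steal_state {
	uint64_t prev_steal_ns;	/* host steal value the guest last synced to */
	uint64_t accounted_ns;	/* steal time charged to the cpu statistics */
};

/*
 * Account the steal time reported by the host since the last call.
 * Returns the amount of steal time charged by this call.
 */
static uint64_t account_steal(struct steal_state *s, uint64_t host_steal_ns)
{
	/* Unsigned subtraction: a host value that went backwards wraps
	 * to a huge delta and is caught by the threshold check below. */
	uint64_t delta = host_steal_ns - s->prev_steal_ns;

	if (delta > RESYNC_THRESHOLD_NS) {
		/* Option (3): adopt the host value, charge nothing. */
		s->prev_steal_ns = host_steal_ns;
		return 0;
	}

	/* Normal case: guest and host roughly agree, charge the delta. */
	s->prev_steal_ns = host_steal_ns;
	s->accounted_ns += delta;
	return delta;
}
```

With this shape, a jump caused by live migration or save/restore is absorbed in one call, and later calls account normally again.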

--

All Rights Reversed.
