Re: [BUG nohz]: wrong user and system time accounting

From: Luiz Capitulino
Date: Fri Mar 31 2017 - 16:09:23 EST


On Thu, 30 Mar 2017 17:25:46 -0400
Luiz Capitulino <lcapitulino@xxxxxxxxxx> wrote:

> On Thu, 30 Mar 2017 16:18:17 +0200
> Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>
> > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote:
> > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker <fweisbec@xxxxxxxxx>:
> > > > If it works, we may want to take that solution, likely less performance sensitive
> > > > than using sched_clock(). In fact sched_clock() is fast, especially as we require it to
> > > > be stable for nohz_full, but using it involves costly conversion back and forth to jiffies.
> > >
> > > So both Rik and you agree with the skew tick solution, I will try it
> > > tomorrow. Btw, should we add the random offset only to CPUs in
> > > nohz_full mode, or to all CPUs as in the code above?
> >
> > Let's just keep it for all CPUs, for simplicity.
> > Also please add a comment that explains why we need that skew_tick on nohz_full.
>
> I've tried all the test-cases we discussed in this thread with skew_tick=1
> and it worked as expected in bare-metal and KVM guests.
>
> However, I found a test-case that works on bare metal but shows problems
> in KVM guests. It could be something that's KVM-specific, or it could be
> something that's harder to reproduce on bare metal.

After discussing some findings on this issue with Rik, I realized that
we don't add the skew when restarting the tick in tick_nohz_restart().
Adding the offset there seems to solve this problem.
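
To illustrate the idea, here is a rough userspace model, not the actual
patch. It assumes the per-CPU skew is computed the way I understand
skew_tick=1 to do it at tick setup (half a tick period, divided by the
number of CPUs, times the CPU id), and it assumes HZ=1000. The names
skew_offset_ns() and restart_tick_ns() are hypothetical and do not exist
in the kernel.

```c
#include <stdint.h>

/* Hypothetical userspace model of tick skewing; not kernel code. */
#define TICK_PERIOD_NS 1000000ULL	/* assumption: HZ=1000 */

/*
 * Per-CPU skew: spread the CPUs evenly over the first half of a tick
 * period, so that no two CPUs tick at the same instant.
 */
static uint64_t skew_offset_ns(unsigned int cpu, unsigned int nr_cpus)
{
	uint64_t offset = TICK_PERIOD_NS >> 1;

	offset /= nr_cpus;
	return offset * cpu;
}

/*
 * Compute the next tick expiry strictly after 'now' while keeping the
 * CPU's skew. Restarting on the plain tick boundary instead would
 * silently drop the skew after every idle/nohz period.
 */
static uint64_t restart_tick_ns(uint64_t now, unsigned int cpu,
				unsigned int nr_cpus)
{
	/* Last unskewed tick boundary at or before 'now'. */
	uint64_t base = now - (now % TICK_PERIOD_NS);
	uint64_t expiry = base + skew_offset_ns(cpu, nr_cpus);

	/* Move forward by whole periods until the expiry is in the future. */
	while (expiry <= now)
		expiry += TICK_PERIOD_NS;

	return expiry;
}
```

With 4 CPUs in this model, CPU 1 gets a 125000 ns skew, so a restart at
now = 2500000 ns lands on 3125000 ns rather than on the unskewed
3000000 ns boundary.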