Re: [PATCH 5/5] sched: Accumulate vtime on top of nsec clocksource

From: Frederic Weisbecker
Date: Wed Jul 05 2017 - 09:20:42 EST


On Thu, Jun 29, 2017 at 07:27:27PM -0400, Rik van Riel wrote:
> On Thu, 2017-06-29 at 19:15 +0200, Frederic Weisbecker wrote:
> > From: Wanpeng Li <kernellwp@xxxxxxxxx>
> >
> > Currently the cputime source used by vtime is jiffies. When we cross
> > a context boundary and jiffies have changed since the last snapshot,
> > the pending cputime is accounted to the switching out context.
> >
> > This system works ok if the ticks are not aligned across CPUs. If
> > they instead are aligned (ie: all fire at the same time) and the
> > CPUs run in userspace, the jiffies change is only observed on tick
> > exit and therefore the user cputime is accounted as system cputime.
> > This is because the CPU that maintains timekeeping fires its tick at
> > the same time as the others. It updates jiffies in the middle of the
> > tick and the other CPUs see that update on IRQ exit:
> >
> >     CPU 0 (timekeeper)                  CPU 1
> >     -------------------              -------------
> >                       jiffies = N
> >     ...                              run in userspace for a jiffy
> >     tick entry                       tick entry (sees jiffies = N)
> >     set jiffies = N + 1
> >     tick exit                        tick exit (sees jiffies = N + 1)
> >                                      account 1 jiffy as stime
> >
> > Fix this by using a nanosecond clock source instead of jiffies. The
> > cputime is then accumulated and flushed every time the pending delta
> > reaches a jiffy in order to mitigate the accounting overhead.
>
> Glad to hear this could be done without dramatically
> increasing the accounting overhead!

Let's hope so. I actually haven't measured yet whether there is a
performance delta :-s

If there is any, I don't expect a big one.
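
For reference, the accumulate-and-flush scheme described in the changelog
can be sketched in userspace C roughly as below. This is only an
illustration and not the kernel code: the struct, the function name and
the 4 ms jiffy length (i.e. HZ=250) are assumptions made up for the
example. The point is that the delta is taken from a nanosecond clock at
each context boundary, and the more expensive flush only happens once the
pending time reaches a jiffy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value: nanoseconds per jiffy, assuming HZ=250. */
#define SKETCH_NSEC_PER_JIFFY 4000000ULL

/* Hypothetical per-task state, named for this sketch only. */
struct vtime_sketch {
	uint64_t last_ns;     /* clock value at the last context boundary */
	uint64_t pending_ns;  /* user time accumulated but not yet flushed */
	uint64_t flushed_ns;  /* user time already handed to the accounting */
	unsigned int flushes; /* number of times the flush path was taken */
};

/*
 * Called when leaving user context, with the current value of a
 * monotonic nanosecond clock. The elapsed time is always added to the
 * pending counter; it is only flushed once it reaches a jiffy.
 */
static void vtime_sketch_account_user(struct vtime_sketch *v, uint64_t now_ns)
{
	assert(now_ns >= v->last_ns); /* the clock is assumed monotonic */

	v->pending_ns += now_ns - v->last_ns;
	v->last_ns = now_ns;

	if (v->pending_ns >= SKETCH_NSEC_PER_JIFFY) {
		/* Flush everything that is pending, not just one jiffy. */
		v->flushed_ns += v->pending_ns;
		v->pending_ns = 0;
		v->flushes++;
	}
}
```

With this shape, the attribution no longer depends on when a CPU happens
to observe the jiffies update, since the delta comes from the clock read
at the boundary itself.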

Thanks for your reviews!