Re: [RFC GIT PULL] nohz: Basic cputime accounting for adaptive tickless

From: Frederic Weisbecker
Date: Fri Jun 15 2012 - 13:37:33 EST


On Thu, Jun 14, 2012 at 05:18:00PM +0200, Martin Schwidefsky wrote:
> On Thu, 14 Jun 2012 15:42:44 +0200
> Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>
> > On Thu, Jun 14, 2012 at 02:48:15PM +0200, Martin Schwidefsky wrote:
> > > On Thu, 14 Jun 2012 13:22:45 +0200
> > > Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> > >
> > > > On Thu, Jun 14, 2012 at 01:21:23PM +0200, Thomas Gleixner wrote:
> > > > > On Thu, 14 Jun 2012, Ingo Molnar wrote:
> > > > > > * Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> > > > > > > > You're right, I should have looked into CONFIG_VIRT_CPU_ACCOUNTING sooner
> > > > > > > > and seen whether I can reuse it.
> > > > > > >
> > > > > > > I'll try something with that.
> > > > > >
> > > > > > Maybe sanitize all the variants under a single set of
> > > > > > wrappers/callbacks?
> > > > >
> > > > > Yes, please!
> > > >
> > > > > Sure, I'm working on it.
> > >
> > > Please keep me in the loop, I want to avoid that things break on s390. Thanks.
> >
> > Do you have any idea why s390 counts idle time from asm deep in the idle code
> > rather than just hooking in account_system_vtime() like ppc or ia64?
>
> Well what is idle time? For s390 it is the difference in the TOD clock between
> the instruction that loaded the enabled-wait-PSW and the first instruction on
> the interrupt handler. To get the best precision you need to get the TOD time
> stamps as close to these two instructions as possible. For s390 it is the
> following sequence:
>
>     STCK  __IDLE_ENTER(%r2)      # idle enter time stamp
>     ltr   %r5,%r5
>     stpt  __VQ_IDLE_ENTER(%r3)
>     jz    psw_idle_lpsw
>     spt   0(%r1)
> psw_idle_lpsw:
>     lpswe __SF_EMPTY(%r15)
>
> <<< sleeping >>>
>
> int_handler:
>     STCK  __LC_INT_CLOCK         # idle exit time stamp
>
> There are at maximum 5 instructions between the STCK for the idle
> enter time stamp and the lpswe that puts the cpu to sleep.

I see. So s390 accounts only the time spent in low power mode, whereas
ppc/ia64 account everything that happens in the idle task.

I don't know which of the two has chosen the right semantics, but the
difference complicates any possible unification.
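
To make the difference concrete, here is a rough sketch in plain C. It is
not kernel code: the struct, its fields and the function names are all made
up for illustration, and the "whole idle task" variant only reflects the
ppc/ia64 behaviour as described above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical timeline of one pass through the idle task, in arbitrary
 * clock units. None of these names exist in the kernel.
 */
struct idle_pass {
	uint64_t task_enter;	/* idle task starts running */
	uint64_t wait_enter;	/* stamp taken right before the CPU stops */
	uint64_t wait_exit;	/* stamp taken first thing in the irq handler */
	uint64_t task_exit;	/* idle task is scheduled out */
};

/* s390-like semantics: only the window between the two time stamps. */
uint64_t idle_time_wait_only(const struct idle_pass *p)
{
	assert(p->wait_exit >= p->wait_enter);
	return p->wait_exit - p->wait_enter;
}

/* ppc/ia64-like semantics: everything that happens in the idle task. */
uint64_t idle_time_whole_task(const struct idle_pass *p)
{
	assert(p->task_exit >= p->task_enter);
	return p->task_exit - p->task_enter;
}

/* Time that one scheme calls idle and the other does not. */
uint64_t idle_time_disagreement(const struct idle_pass *p)
{
	return idle_time_whole_task(p) - idle_time_wait_only(p);
}
```

With task_enter = 100, wait_enter = 110, wait_exit = 900 and task_exit = 920,
the first variant reports 790 units of idle time and the second 820. The
remaining 30 units are what a common set of wrappers would have to assign
one way or the other.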

BTW, aren't you also accounting the idle time as system time with
account_system_vtime()?