Re: [PATCH v4 1/2] powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64

From: Nicholas Piggin
Date: Wed Jun 06 2018 - 21:43:33 EST


On Wed, 6 Jun 2018 14:21:08 +0000 (UTC)
Christophe Leroy <christophe.leroy@xxxxxx> wrote:

> Scaled cputime is only meaningful when the processor has
> SPURR and/or PURR, which means only on PPC64.
>
> Removing it on PPC32 significantly reduces the size of
> vtime_account_system() and vtime_account_idle() on an 8xx:
>
> Before:
> 00000000 l F .text 000000a8 vtime_delta
> 00000280 g F .text 0000010c vtime_account_system
> 0000038c g F .text 00000048 vtime_account_idle
>
> After:
> (vtime_delta gets inlined in the two functions)
> 000001d8 g F .text 000000a0 vtime_account_system
> 00000278 g F .text 00000038 vtime_account_idle
>
> In terms of performance, we also get approximately 5% improvement on task switch.
> The following small benchmark app is run with perf stat:
>
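> #define _GNU_SOURCE	/* pthread_yield() is a GNU extension */
> #include <pthread.h>
> #include <stdlib.h>
>
> void *thread(void *arg)
> {
> 	int i;
>
> 	/* yield the CPU argv[1] times to force task switches */
> 	for (i = 0; i < atoi((char *)arg); i++)
> 		pthread_yield();
>
> 	return NULL;
> }
>
> int main(int argc, char **argv)
> {
> 	pthread_t th1, th2;
>
> 	if (argc < 2)
> 		return 1;
>
> 	pthread_create(&th1, NULL, thread, argv[1]);
> 	pthread_create(&th2, NULL, thread, argv[1]);
> 	pthread_join(th1, NULL);
> 	pthread_join(th2, NULL);
>
> 	return 0;
> }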
>
> Before the patch:
>
> ~# perf stat chrt -f 98 ./sched 100000
>
> Performance counter stats for 'chrt -f 98 ./sched 100000':
>
> 8622.166272 task-clock (msec) # 0.955 CPUs utilized
> 200027 context-switches # 0.023 M/sec
>
> After the patch:
>
> ~# perf stat chrt -f 98 ./sched 100000
>
> Performance counter stats for 'chrt -f 98 ./sched 100000':
>
> 8207.090048 task-clock (msec) # 0.958 CPUs utilized
> 200025 context-switches # 0.024 M/sec
>
> Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxx>

This looks okay to me. Nice numbers.

> ---
> v4:
> - Using the correct symbol CONFIG_ARCH_HAS_SCALED_CPUTIME instead of ARCH_HAS_SCALED_CPUTIME
> - Grouped CONFIG_ARCH_HAS_SCALED_CPUTIME related code in dedicated functions to reduce the number of #ifdefs
> - Integrated read_spurr() directly into the related function.
> v3: Rebased following modifications in xmon.c
> v2: added ifdefs in xmon to fix compilation error
>
> arch/powerpc/Kconfig | 2 +-
> arch/powerpc/include/asm/accounting.h | 4 ++
> arch/powerpc/include/asm/cputime.h | 1 -
> arch/powerpc/kernel/time.c | 111 +++++++++++++++++++++-------------
> arch/powerpc/xmon/xmon.c | 4 ++
> 5 files changed, 77 insertions(+), 45 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index b62a16e2c7cc..735398fd390d 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -142,7 +142,7 @@ config PPC
> select ARCH_HAS_PHYS_TO_DMA
> select ARCH_HAS_PMEM_API if PPC64
> select ARCH_HAS_MEMBARRIER_CALLBACKS
> - select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE
> + select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC64

I wonder if we could make this depend on PPC_PSERIES or even
PPC_SPLPAR as well? (That would be for a later patch)

Thanks,
Nick