Re: [PATCH 1/1] cpufreq: pcc-cpufreq: Re-introduce deadband effect to reduce number of frequency changes

From: Andreas Herrmann
Date: Tue Oct 11 2016 - 02:30:32 EST


On Wed, Oct 05, 2016 at 10:47:53AM +0530, Viresh Kumar wrote:
> Sorry for being late Andreas :(
>
> On 14-09-16, 16:56, Andreas Herrmann wrote:
>
> First of all, thanks for your hardwork in getting these numbers out.
> Really appreciate it.
>
> > Below is some trace data. I hope it is of some help.
> >
> > (A) - sampling 10s period when system is idle
> > (B) - sampling 10s period when system partially loaded (kernel
> > compilation using 2 jobs)
> >
> > (1) 4.8-rc5
> > (2) 4.8-rc5 with my patch (reintro of deadband effect within
> > pcc-cpufreq)
> > (3) 4.8-rc5 with reversal of 6393d6a102 (cpufreq: ondemand: Eliminate
> > the deadband effect)
> >
> > Let me know whether you are looking for other trace data wrt this
> > issue.
> >
> >
> > Thanks,
> >
> > Andreas
> >
> > ---
> >
> > (A)-(1)
> >
> > # Total Lost Samples: 0
> > # Samples: 41 of event 'power:cpu_frequency'
> > # Event count (approx.): 41
> > # Overhead Command Shared Object Symbol
> > # ........ ............ ................ .............................
> > 39.02% kworker/14:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 29.27% kworker/0:0 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 19.51% kworker/10:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 7.32% kworker/5:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 2.44% kworker/23:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 2.44% kworker/40:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> >
> > (A)-(2)
> >
> > # Total Lost Samples: 0
> > # Samples: 6 of event 'power:cpu_frequency'
> > # Event count (approx.): 6
> > # Overhead Command Shared Object Symbol
> > # ........ ............ ................ .............................
> > 33.33% kworker/1:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 16.67% kworker/16:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 16.67% kworker/22:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 16.67% kworker/26:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 16.67% kworker/33:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> >
> > (A)-(3)
> >
> > # Total Lost Samples: 0
> > # Samples: 7 of event 'power:cpu_frequency'
> > # Event count (approx.): 7
> > # Overhead Command Shared Object Symbol
> > # ........ ............ ................ .............................
> > 28.57% kworker/58:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 14.29% kworker/19:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 14.29% kworker/20:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 14.29% kworker/22:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 14.29% kworker/23:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 14.29% kworker/35:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> >
> > ---
> >
> > (B)-(1)
> >
> > # Total Lost Samples: 0
> > # Samples: 2K of event 'power:cpu_frequency'
> > # Event count (approx.): 2382
> > # Overhead Command Shared Object Symbol
> > # ........ ............ ................ .............................
> > 5.75% kworker/0:0 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 4.16% kworker/12:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 3.11% kworker/17:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 2.94% kworker/2:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 2.73% kworker/19:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > ...
> >
> > (B)-(2)
> >
> > # Total Lost Samples: 0
> > # Samples: 320 of event 'power:cpu_frequency'
> > # Event count (approx.): 320
> > # Overhead Command Shared Object Symbol
> > # ........ ............ ................ .............................
> > 4.69% kworker/56:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 4.06% kworker/12:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 4.06% kworker/28:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 4.06% kworker/6:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 3.75% kworker/32:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > ...
> >
> > (B)-(3)
> >
> > # Total Lost Samples: 0
> > # Samples: 333 of event 'power:cpu_frequency'
> > # Event count (approx.): 333
> > # Overhead Command Shared Object Symbol
> > # ........ ............ ................ .............................
> > 4.80% kworker/51:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 4.50% kworker/39:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 4.20% kworker/47:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 3.90% kworker/59:1 [kernel.vmlinux] [k] cpufreq_notify_transition
> > 3.90% kworker/7:2 [kernel.vmlinux] [k] cpufreq_notify_transition
> > ...
>
> I am worried by all of these.. So, its not the DVFS transition which
> took time, but cpufreq_notify_transition(). And probably one of the
> drivers in your setup is screwing up here, which has registered with
> cpufreq_register_notifier().
>
> Can you please take a look at that ? Just check which all routines are
> getting called as part of srcu_notifier_call_chain().

I'll take a look into this.

> Also a simple (hacky) solution to fix the problem you have got is to
> divide the range of frequency in steps (which you have already done
> AFAICT in one of your patches) and not mention the frequencies that
> were part of the deadband earlier. That will keep the CPU at either
> low freq or some higher freqs.

Yes, seems worth to try this.

> But I am quite sure that we have an abuser here and we better find it
> now.


Thanks,

Andreas