Re: [PATCH V2] cpufreq: Call transition notifier only once for each policy

From: Peter Zijlstra
Date: Mon Mar 18 2019 - 06:54:15 EST


On Mon, Mar 18, 2019 at 08:05:14AM +0530, Viresh Kumar wrote:
> On 15-03-19, 13:29, Peter Zijlstra wrote:
> > On Fri, Mar 15, 2019 at 02:43:07PM +0530, Viresh Kumar wrote:
> > > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > > index 3fae23834069..cff8779fc0d2 100644
> > > --- a/arch/x86/kernel/tsc.c
> > > +++ b/arch/x86/kernel/tsc.c
> > > @@ -956,28 +956,38 @@ static int time_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
> > >  				void *data)
> > >  {
> > >  	struct cpufreq_freqs *freq = data;
> > > -	unsigned long *lpj;
> > > -
> > > -	lpj = &boot_cpu_data.loops_per_jiffy;
> > > -#ifdef CONFIG_SMP
> > > -	if (!(freq->flags & CPUFREQ_CONST_LOOPS))
> > > -		lpj = &cpu_data(freq->cpu).loops_per_jiffy;
> > > -#endif
> > > +	struct cpumask *cpus = freq->policy->cpus;
> > > +	bool boot_cpu = !IS_ENABLED(CONFIG_SMP) || freq->flags & CPUFREQ_CONST_LOOPS;
> > > +	unsigned long lpj;
> > > +	int cpu;
> > >
> > >  	if (!ref_freq) {
> > >  		ref_freq = freq->old;
> > > -		loops_per_jiffy_ref = *lpj;
> > >  		tsc_khz_ref = tsc_khz;
> > > +
> > > +		if (boot_cpu)
> > > +			loops_per_jiffy_ref = boot_cpu_data.loops_per_jiffy;
> > > +		else
> > > +			loops_per_jiffy_ref = cpu_data(cpumask_first(cpus)).loops_per_jiffy;
> > >  	}
> > > +
> > >  	if ((val == CPUFREQ_PRECHANGE && freq->old < freq->new) ||
> > >  			(val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) {
> > > -		*lpj = cpufreq_scale(loops_per_jiffy_ref, ref_freq, freq->new);
> > > -
> > > +		lpj = cpufreq_scale(loops_per_jiffy_ref, ref_freq, freq->new);
> > >  		tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new);
> > > +
> > >  		if (!(freq->flags & CPUFREQ_CONST_LOOPS))
> > >  			mark_tsc_unstable("cpufreq changes");
> > >
> > > -		set_cyc2ns_scale(tsc_khz, freq->cpu, rdtsc());
> > > +		if (boot_cpu) {
> > > +			boot_cpu_data.loops_per_jiffy = lpj;
> > > +		} else {
> > > +			for_each_cpu(cpu, cpus)
> > > +				cpu_data(cpu).loops_per_jiffy = lpj;
> > > +		}
> > > +
> > > +		for_each_cpu(cpu, cpus)
> > > +			set_cyc2ns_scale(tsc_khz, cpu, rdtsc());
> >
> > This code doesn't make sense, the rdtsc() _must_ be called on the CPU in
> > question.
>
> You mean rdtsc() must be called locally on that CPU? The cpufreq core never guaranteed
> that and it was left for the notifier to do. This patch doesn't change the
> behavior at all, just that it moves the for-loop to the notifier instead of the
> cpufreq core.

Yuck..

Rafael, how does this work in practice? Earlier you said that on x86 the
policies typically have a single CPU in them anyway. Is the freq change
also notified from _that_ CPU?

I don't think I have old enough hardware around anymore to test any of
this. This was truly ancient P6-era stuff IIRC.

If it is, I'd rather drop the changes Viresh is proposing to this
notifier and simply add something like:


WARN_ON_ONCE(cpumask_weight(cpus) != 1);
WARN_ON_ONCE(cpumask_first(cpus) != smp_processor_id());

And leave it at that.
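
A minimal userspace sketch of what those two checks assert, for readers
without a kernel tree at hand. Everything here is a stand-in and an
assumption: mask_t models struct cpumask as a plain bitmask, and
mask_weight(), mask_first() and notifier_preconditions_hold() are
hypothetical helpers, not kernel APIs. The real cpumask_first() returns a
value >= nr_cpu_ids for an empty mask rather than -1.

```c
#include <stdbool.h>

/* Userspace stand-in for struct cpumask: bit N set means CPU N is in the policy. */
typedef unsigned long mask_t;

/* Number of CPUs in the mask; models cpumask_weight(). */
static unsigned int mask_weight(mask_t m)
{
	unsigned int n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; m; m &= m - 1)
		n++;
	return n;
}

/* Lowest CPU number in the mask, or -1 if the mask is empty; models cpumask_first(). */
static int mask_first(mask_t m)
{
	int cpu = 0;

	if (!m)
		return -1;
	while (!(m & 1UL)) {
		m >>= 1;
		cpu++;
	}
	return cpu;
}

/*
 * True when the policy covers exactly one CPU and that CPU is the one
 * running the notifier (this_cpu models smp_processor_id()). Only then
 * is a local rdtsc() guaranteed to read the affected CPU's counter.
 */
static bool notifier_preconditions_hold(mask_t policy_cpus, int this_cpu)
{
	return mask_weight(policy_cpus) == 1 &&
	       mask_first(policy_cpus) == this_cpu;
}
```

With a policy of { CPU 2 } and the notifier running on CPU 2 the check
passes; with the same policy on CPU 0, or with a policy of { CPU 2, CPU 3 },
it fails, which is where the WARN_ON_ONCE() lines above would fire.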