Re: [PATCH v2 18/20] mips: csrc-r4k: Decrease r4k-clocksource rating if CPU_FREQ enabled

From: Serge Semin
Date: Mon May 11 2020 - 09:31:28 EST


On Fri, May 08, 2020 at 05:41:50PM +0200, Thomas Bogendoerfer wrote:
> On Wed, May 06, 2020 at 08:42:36PM +0300, Sergey.Semin@xxxxxxxxxxxxxxxxxxxx wrote:
> > From: Serge Semin <Sergey.Semin@xxxxxxxxxxxxxxxxxxxx>
> >
> > Commit 07d69579e7fe ("MIPS: Don't register r4k sched clock when
> > CPUFREQ enabled") disabled the r4k-clock usage for scheduler ticks
> > counting due to the scheduler being non-tolerant for unstable
> > clocks sources. For the same reason the clock should be used
> > in the system clocksource framework only as a last resort if CPU
> > frequency may change.
> >
> > Signed-off-by: Serge Semin <Sergey.Semin@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Alexey Malahov <Alexey.Malahov@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Thomas Bogendoerfer <tsbogend@xxxxxxxxxxxxxxxx>
> > Cc: Paul Burton <paulburton@xxxxxxxxxx>
> > Cc: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > Cc: Arnd Bergmann <arnd@xxxxxxxx>
> > Cc: Rob Herring <robh+dt@xxxxxxxxxx>
> > Cc: linux-pm@xxxxxxxxxxxxxxx
> > Cc: devicetree@xxxxxxxxxxxxxxx
> > ---
> > arch/mips/kernel/csrc-r4k.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
> > index 437dda64fd7a..d81fb374f477 100644
> > --- a/arch/mips/kernel/csrc-r4k.c
> > +++ b/arch/mips/kernel/csrc-r4k.c
> > @@ -71,7 +71,11 @@ int __init init_r4k_clocksource(void)
> > return -ENXIO;
> >
> > /* Calculate a somewhat reasonable rating value */
> > +#ifndef CONFIG_CPU_FREQ
> > clocksource_mips.rating = 200 + mips_hpt_frequency / 10000000;
> > +#else
> > + clocksource_mips.rating = 99;
> > +#endif
>
> I dislike this patch. Assuming you have an other clocksource, why not
> simply disable csrc-r4k, if CPU_FREQ is enabled ?

Me neither and the best way would be to update the clocksource frequency
dynamically the same way it's done for cevt-r4k and MIPS GIC timers. Alas the
clocksource doesn't support it. Due to this together with CPU-freq facility
enabled we have to use a very slow DW APB Timer instead of the fast embedded
into the CPU core r4k and MIPS GIC timers. Just note the difference: it takes
220 ns to read the counter from DW APB Timer in comparison to a few nanoseconds
reading from MIPS GIC and R4K. So IMO disabling the timer as you suggest isn't
the best option. By making the CPUFREQ and CSRC_R4K mutual exclusive we'd
assume a use-case that the system will always use the CPU-freq facility changing
the CPU reference frequency. This is obviously not true. Noone prevents the
system administrator to leave the default setting of the CPU-freq with fixed
frequency and select a faster, more accurate timer like in our case.

My idea was not to try to predict how the system would be used, but to let the
system administration to choose which timer is applicable in particular usecase
enabling a safest one by default. So if CPUFREQ is available, then we fallback
to the external timer as safest one. If the system user wouldn't need to have
the CPUFREQ facility utilized, then the system administrator would want to
leave the default CPU-freq governor with pre-defined CPU frequency and
select either R4K (MIPS) or MIPS GIC timer just by writing its name into
/sys/bus/clocksource/devices/clocksource0/current_clocksource .

I should note, that currently CPU_FREQ won't be available if there is no
MIPS_EXTERNAL_TIMER available for the platform. It's prohibited by means of the
conditional kbuild config inclusion declared in the arch/mips/Kconfig:
+ if CPU_SUPPORTS_CPUFREQ && MIPS_EXTERNAL_TIMER
+ source "drivers/cpufreq/Kconfig"
+ endif
So if there is no external timer working independently from the CPU core clock
source, the CPUFREQ won't be available to select for the kernel. Though currently
this limitation is supposed to be applicable for the R4K/MIPS GIC clocksource
timers only since clockevents must work fine in unstable reference clock conditions.

So what can we do to improve the patch? First one is a solution I suggested in
this patch but it could be a bit altered by using IS_ENABLED() macro to:
+ clocksource_mips.rating = !IS_ENABLED(CONFIG_CPU_FREQ) ?
+ 200 + mips_hpt_frequency / 10000000 : 99;

Another idea I discovered when have been searching through the x86 arch code.
x86's got the same problem with TSC timer, but it doesn't disable it if
CPU-frequency is switched on. Instead it just marks it as unstable by calling
the clocksource_mark_unstable() method if CPU frequency changes. I suggest to
implement the same approach in our case of MIPS GIC (another patchset
I've sent, see "clocksource: Fix MIPS GIC and DW APB Timer for Baikal-T1 SoC
support" in your email client) and R4K timers. We'll subscribe to the CPU
frequency change and if it changes we'll call clocksource_mark_unstable() with
MIPS GIC and R4K clocksource handlers passed. This shall reduce their rating and
cause selecting a clocksource with better one. BTW I suppose it won't be
necessary to initially lower the rating of the MIPS GIC and R4K clocksource
timers if this is implemented.

So, what do you think?

-Sergey

>
> Thomas.
>
> --
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea. [ RFC1925, 2.3 ]