Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz

From: Pali Rohár
Date: Sat Feb 13 2021 - 05:11:25 EST


On Thursday 11 February 2021 16:41:13 nnet wrote:
> On Thu, Feb 11, 2021, at 3:44 PM, Pali Rohár wrote:
> > On Thursday 11 February 2021 12:22:52 nnet wrote:
> > > On Thu, Feb 11, 2021, at 11:55 AM, Pali Rohár wrote:
> > > > On Wednesday 10 February 2021 11:08:59 nnet wrote:
> > > > > On Wed, Feb 10, 2021, at 10:03 AM, Pali Rohár wrote:
> > > > > > > > Hello! Could you please enable userspace governor during kernel
> > > > > > > > compilation?
> > > > > > > >
> > > > > > > > CONFIG_CPU_FREQ_GOV_USERSPACE=y
> > > > > > > >
> > > > > > > > It can be activated via command:
> > > > > > > >
> > > > > > > > echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > > > > > >
> > > > > > > > After that you can "force" CPU frequency to specific value, e.g.:
> > > > > > > >
> > > > > > > > echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed
> > > > > > > >
> > > > > > > > I need to know which switch (from --> to freq) causes this system hang.
> > > > > > > >
> > > > > > > > This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) fixes only the
> > > > > > > > switch from 500 MHz to 1000 MHz on the 1 GHz variant, as only this
> > > > > > > > switch causes the issue.
> > > > > > > >
> > > > > > > > I have used following simple bash script to check that switching between
> > > > > > > > 500 MHz and 1 GHz is stable:
> > > > > > > >
> > > > > > > > while true; do
> > > > > > > > echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > echo 500000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > echo 500000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > done
> > > > > > >
> > > > > > > echo userspace | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > > > > > while true; do
> > > > > > > echo 1200000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > echo 600000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > done
> > > > > > >
> > > > > > > >> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108
> > > > > > >
> > > > > > > With 1108 I get a freeze within a minute. The last output to stdout is 600000.
> > > > > > >
> > > > > > > With 1120 it takes a few minutes.
> > > > > > >
> > > > > > > With any of 1225, 1155, 1132 the device doesn't freeze over the full 5 minute load test.
> > > > > > >
> > > > > > > I'm using ondemand now with the above at 1132 without issue so far.
> > > > > >
> > > > > > Great, thank you for testing!
> > > > > >
> > > > > > Can you check if switching between any two lower frequencies 200000
> > > > > > 300000 600000 is stable?
> > > > >
> > > > > This is stable using 1132 mV for MIN_VOLT_MV_FOR_L0_L1_1GHZ:
> > > > >
> > > > > while true; do
> > > > > # down
> > > > > echo 1200000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > ...
> > > >
> > > > Hello!
> > > >
> > > > Could you please re-run the test without tee, in the form I have shown above?
> > > > UART is slow and printing something to the console adds a delay, which decreases
> > > > the probability that the real issue is triggered, as this is a timing issue.
> > >
> > > The test was done over SSH.
> >
> > Ok! But it is still better to not print any results as it adds unwanted
> > delay between frequency switching.
> >
> > > > Also, please run the tests between just two frequencies in a loop, as I observed
> > > > that switching between more of them decreased the probability of hitting the issue.
> > >
> > > > > > > echo userspace | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > > > > > while true; do
> > > > > > > echo 1200000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > echo 600000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > done
> > >
> > > The first test ^ switched between 600 MHz and 1.2 GHz.
> > >
> > > > The real issue for the 1 GHz variant of A3720 occurs only when switching from
> > > > 500 MHz to 1 GHz. So could you also try some tests without
> > > > changing MIN_VOLT_MV_FOR_L0_L1_1GHZ, switching just between non-1.2 GHz
> > > > frequencies (to verify that on the 1.2 GHz variant it is also only the
> > > > switch from 600 MHz to 1.2 GHz)?
> > >
> > > With 1108 mV and switching between 600 MHz and 1.2GHz, I always saw a freeze within a minute.
> >
> > I mean to try switching at 1.108 V between 200 MHz and 300 MHz or
> > between 300 MHz and 600 MHz, to check that the issue is really only with
> > the switch from 600 MHz to 1.2 GHz.
>
> With:
>
> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108
>
> with 5 min load:
>
> # no lock-up
>
> echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> while true; do
> echo 200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> echo 300000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> done
>
> # no lock-up
>
> echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> while true; do
> echo 300000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> echo 600000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> done
>
> # lock-up with 10 seconds of load applied
>
> echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> while true; do
> echo 600000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> done

Ok! So it really looks like the 1.2 GHz variant has the same issue. We need
to increase the voltage for the L1 load (600 MHz). But the question is what
the threshold is (is it 1132 mV or lower?), and the second question is what
increasing the minimal voltage may do to the board.

Basically there is absolutely no information about the 1.2 GHz variant and
this issue...
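
For repeating the test at other voltages, the loops quoted above could be
wrapped in a small helper. This is only a minimal sketch and not part of the
patch series: the function name toggle_freq and its arguments are my own
assumptions. It writes the two frequencies alternately and prints nothing, so
no extra delay is added between the switches.

```shell
# Hypothetical helper: switch between two frequencies a fixed number of times.
# Arguments: <scaling_setspeed file> <freq A in kHz> <freq B in kHz> <iterations>
toggle_freq() {
    setspeed=$1
    freq_a=$2
    freq_b=$3
    count=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        # Nothing is printed, only the two writes happen per iteration.
        echo "$freq_a" > "$setspeed" || return 1
        echo "$freq_b" > "$setspeed" || return 1
        i=$((i + 1))
    done
}
```

After activating the userspace governor as shown above, it could be called
with the scaling_setspeed file of policy0, the two frequencies 600000 and
1200000 and an iteration count. The iteration count is arbitrary; the quoted
tests simply looped forever under a 5 minute load.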

> > I need to know if the current settings are fine for the 200, 300 and 600 MHz
> > frequencies and only the 600 --> 1200 switch needs to be fixed.
> >