Re: [PATCH mvebu v2 00/10] Armada 37xx: Fix cpufreq changing base CPU speed to 800 MHz from 1000 MHz
From: Pali Rohár
Date: Sun Feb 14 2021 - 07:34:13 EST
On Saturday 13 February 2021 10:30:19 nnet wrote:
> On Sat, Feb 13, 2021, at 2:01 AM, Pali Rohár wrote:
> > On Thursday 11 February 2021 16:41:13 nnet wrote:
> > > On Thu, Feb 11, 2021, at 3:44 PM, Pali Rohár wrote:
> > > > On Thursday 11 February 2021 12:22:52 nnet wrote:
> > > > > On Thu, Feb 11, 2021, at 11:55 AM, Pali Rohár wrote:
> > > > > > On Wednesday 10 February 2021 11:08:59 nnet wrote:
> > > > > > > On Wed, Feb 10, 2021, at 10:03 AM, Pali Rohár wrote:
> > > > > > > > > > Hello! Could you please enable userspace governor during kernel
> > > > > > > > > > compilation?
> > > > > > > > > >
> > > > > > > > > > CONFIG_CPU_FREQ_GOV_USERSPACE=y
> > > > > > > > > >
> > > > > > > > > > It can be activated via command:
> > > > > > > > > >
> > > > > > > > > > echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > > > > > > > >
> > > > > > > > > > After that you can "force" CPU frequency to specific value, e.g.:
> > > > > > > > > >
> > > > > > > > > > echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed
> > > > > > > > > >
> > > > > > > > > > I need to know which switch (from --> to freq) cause this system hang.
> > > > > > > > > >
> > > > > > > > > > This patch series (via MIN_VOLT_MV_FOR_L0_L1_1GHZ) is fixing only
> > > > > > > > > > switching from 500 MHz to 1000 MHz on 1 GHz variant. As only this switch
> > > > > > > > > > is causing issue.
> > > > > > > > > >
> > > > > > > > > > I have used following simple bash script to check that switching between
> > > > > > > > > > 500 MHz and 1 GHz is stable:
> > > > > > > > > >
> > > > > > > > > > while true; do
> > > > > > > > > > echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > > echo 500000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > > echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > > echo 500000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > > done
> > > > > > > > >
> > > > > > > > > echo userspace | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > > > > > > > while true; do
> > > > > > > > > echo 1200000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > echo 600000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > done
> > > > > > > > >
> > > > > > > > > >> +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108
> > > > > > > > >
> > > > > > > > > With 1108 I get a freeze within a minute. The last output to stdout is 600000.
> > > > > > > > >
> > > > > > > > > With 1120 it takes a few minutes.
> > > > > > > > >
> > > > > > > > > With any of 1225, 1155, 1132 the device doesn't freeze over the full 5 minute load test.
> > > > > > > > >
> > > > > > > > > I'm using ondemand now with the above at 1132 without issue so far.
> > > > > > > >
> > > > > > > > Great, thank you for testing!
> > > > > > > >
> > > > > > > > Can you check if switching between any two lower frequencies 200000
> > > > > > > > 300000 600000 is stable?
> > > > > > >
> > > > > > > This is stable using 1132 mV for MIN_VOLT_MV_FOR_L0_L1_1GHZ:
> > > > > > >
> > > > > > > while true; do
> > > > > > > # down
> > > > > > > echo 1200000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > ...
> > > > > >
> > > > > > Hello!
> > > > > >
> > > > > > Could you please re-run test without tee, in form as I have shown above?
> > > > > > UART is slow and printing something to console adds delay which decrease
> > > > > > probability that real issue is triggered as this is timing issue.
> > > > >
> > > > > The test was done over SSH.
> > > >
> > > > Ok! But it is still better to not print any results as it adds unwanted
> > > > delay between frequency switching.
> > > >
> > > > > > Also please do tests just between two frequencies in loop as I observed
> > > > > > that switching between more decreased probability to hit issue.
> > > > >
> > > > > > > > > echo userspace | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > > > > > > > while true; do
> > > > > > > > > echo 1200000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > echo 600000 | tee /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > > > > > > > done
> > > > >
> > > > > The first test ^ switched between 600 MHz and 1.2 GHz.
> > > > >
> > > > > > The real issue for 1 GHz variant of A3720 is only when doing switch from
> > > > > > 500 MHz to 1 GHz. So could you try to do some tests also without
> > > > > > changing MIN_VOLT_MV_FOR_L0_L1_1GHZ and switching just between non-1.2
> > > > > > frequencies (to verify that on 1.2 GHz variant it is also from 600 MHz
> > > > > > to 1.2 GHz)?
> > > > >
> > > > > With 1108 mV and switching between 600 MHz and 1.2GHz, I always saw a freeze within a minute.
> > > >
> > > > I mean to try switching with 1.108 V between 200 MHz and 300 MHz or
> > > > between 300 MHz and 600 MHz. To check that issue is really only with
> > > > switch from 600 MHz to 1.2 GHz.
> > >
> > > With:
> > >
> > > +#define MIN_VOLT_MV_FOR_L0_L1_1GHZ 1108
> > >
> > > with 5 min load:
> > >
> > > # no lock-up
> > >
> > > echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > while true; do
> > > echo 200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > echo 300000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > done
> > >
> > > # no lock-up
> > >
> > > echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > while true; do
> > > echo 300000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > echo 600000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > done
> > >
> > > # lock-up with 10 seconds of load applied
> > >
> > > echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
> > > while true; do
> > > echo 600000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed;
> > > done
> >
> > Ok! So it really looks like that on 1.2 GHz is the same issue. We need
> > to increase voltage for L1 load (600 MHz). But question is what is the
> > threshold (it is 1132 mV or lower?)
>
> Lower then 1132 mV causes freezes. No freezes with 1132 mV.
>
> > and second question is what
> > increasing minimal voltage may cause with board.
>
> I don't have any further information on the issue and device then that experience.
>
> I've tested on two of these devices which suggests it's not an oddity with one instance of it.
Ok, thank you for information.
Could you test following change instead of PATCH 04/10? I added here also
logic for 1.2 GHz variant with 1.132 V value and another change is that
value for load L0 is not touched as it is stable.
diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c
index b8dc6c849..8a4afb048 100644
--- a/drivers/cpufreq/armada-37xx-cpufreq.c
+++ b/drivers/cpufreq/armada-37xx-cpufreq.c
@@ -73,6 +73,8 @@
#define LOAD_LEVEL_NR 4
#define MIN_VOLT_MV 1000
+#define MIN_VOLT_MV_FOR_L1_1000MHZ 1108
+#define MIN_VOLT_MV_FOR_L1_1200MHZ 1132
/* AVS value for the corresponding voltage (in mV) */
static int avs_map[] = {
@@ -208,6 +210,8 @@ static u32 armada_37xx_avs_val_match(int target_vm)
* - L2 & L3 voltage should be about 150mv smaller than L0 voltage.
* This function calculates L1 & L2 & L3 AVS values dynamically based
* on L0 voltage and fill all AVS values to the AVS value table.
+ * When base CPU frequency is 1000 or 1200 MHz then there is additional
+ * minimal avs value for load L1.
*/
static void __init armada37xx_cpufreq_avs_configure(struct regmap *base,
struct armada_37xx_dvfs *dvfs)
@@ -239,6 +243,18 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base,
for (load_level = 1; load_level < LOAD_LEVEL_NR; load_level++)
dvfs->avs[load_level] = avs_min;
+ /*
+ * Set the avs value for load L0 and L1 when base CPU frequency is 1000/1200 MHz,
+ * otherwise the CPU gets stuck when switching from load L1 to load L0.
+ */
+ if (dvfs->cpu_freq_max >= 1000*1000*1000) {
+ if (dvfs->cpu_freq_max >= 1200*1000*1000)
+ avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ);
+ else
+ avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ);
+ dvfs->avs[0] = dvfs->avs[1] = avs_min;
+ }
+
return;
}
@@ -258,6 +274,26 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base,
target_vm = avs_map[l0_vdd_min] - 150;
target_vm = target_vm > MIN_VOLT_MV ? target_vm : MIN_VOLT_MV;
dvfs->avs[2] = dvfs->avs[3] = armada_37xx_avs_val_match(target_vm);
+
+ /*
+ * Fix the avs value for load L1 when base CPU frequency is 1000/1200 MHz,
+ * otherwise the CPU gets stuck when switching from load L1 to load L0.
+ * Also ensure that avs value for load L1 is not higher than for L0.
+ */
+ if (dvfs->cpu_freq_max >= 1000*1000*1000) {
+ u32 avs_min_l1;
+
+ if (dvfs->cpu_freq_max >= 1200*1000*1000)
+ avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ);
+ else
+ avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ);
+
+ if (avs_min_l1 > dvfs->avs[0])
+ avs_min_l1 = dvfs->avs[0];
+
+ if (dvfs->avs[1] < avs_min_l1)
+ dvfs->avs[1] = avs_min_l1;
+ }
}
static void __init armada37xx_cpufreq_avs_setup(struct regmap *base,
> >
> > Basically there is absolutely no information about 1.2 GHz variant and
> > this issue...
> >
> > > > I need to know if current settings are fine for 200, 300 and 600 MHz
> > > > frequencies and the only 600 --> 1200 needs to be fixed.
> > > >
> >