Re: [TEGRA194_CPUFREQ Patch 2/3] cpufreq: Add Tegra194 cpufreq driver

From: Sumit Gupta
Date: Thu Apr 09 2020 - 07:20:57 EST




On 09/04/20 1:14 PM, Viresh Kumar wrote:

On 08-04-20, 16:54, sumitg wrote:


On 08/04/20 11:23 AM, Viresh Kumar wrote:

On 07-04-20, 23:48, sumitg wrote:
On 06/04/20 8:25 AM, Viresh Kumar wrote:
On 05-04-20, 00:08, sumitg wrote:
On 26/03/20 5:20 PM, Viresh Kumar wrote:
On 03-12-19, 23:02, Sumit Gupta wrote:
diff --git a/drivers/cpufreq/tegra194-cpufreq.c b/drivers/cpufreq/tegra194-cpufreq.c
+static unsigned int tegra194_get_speed_common(u32 cpu, u32 delay)
+{
+ struct read_counters_work read_counters_work;
+ struct tegra_cpu_ctr c;
+ u32 delta_refcnt;
+ u32 delta_ccnt;
+ u32 rate_mhz;
+
+ read_counters_work.c.cpu = cpu;
+ read_counters_work.c.delay = delay;
+ INIT_WORK_ONSTACK(&read_counters_work.work, tegra_read_counters);

Initialize the work only once from init routine.

We are using "read_counters_work" as a local variable, so every invocation of the function has its own copy of the counters for the corresponding CPU. That's why we are doing INIT_WORK_ONSTACK here.

+ queue_work_on(cpu, read_counters_wq, &read_counters_work.work);
+ flush_work(&read_counters_work.work);

Why can't this be done in current context ?

We used a workqueue instead of smp_call_function_single() so that we can use a long delay.

Please explain completely, you have raised more questions than you
answered :)

Why do you want to have long delays ?

A long delay value is used so that the observation window is long enough to
reconstruct the CPU frequency correctly in the presence of noise.
In the next patch version, the delay value is changed to 500us, which our
tests showed to be reliable.
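
For readers following the thread, here is a minimal userspace sketch of how a
frequency can be reconstructed from two counter deltas taken over an
observation window. It is not the code from the patch: the helper names
counter_delta() and estimate_rate_mhz() are hypothetical, and the reference
clock rate is passed in as a parameter because its value is an assumption
here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): difference between two
 * samples of a free-running 32-bit counter. Unsigned subtraction gives
 * the right answer even if the counter wrapped once between samples.
 */
static uint32_t counter_delta(uint32_t before, uint32_t after)
{
	return (uint32_t)(after - before);
}

/*
 * Hypothetical helper (not from the patch): estimate the core clock in
 * MHz from the core-cycle delta and the reference-cycle delta observed
 * over the same window. ref_clk_mhz is an assumed input.
 */
static uint32_t estimate_rate_mhz(uint32_t delta_ccnt, uint32_t delta_refcnt,
				  uint32_t ref_clk_mhz)
{
	if (delta_refcnt == 0)
		return 0; /* no reference ticks observed, nothing to estimate */

	/* Widen to 64 bits so the multiplication cannot overflow. */
	return (uint32_t)(((uint64_t)delta_ccnt * ref_clk_mhz) / delta_refcnt);
}
```

The window length matters because each delta can be off by about one count.
Assuming, purely for illustration, a 408 MHz reference clock, a 500us window
spans about 204000 reference ticks, so a one-tick error changes the estimate
by only a few parts per million; a window of a few ticks would not average
that error out.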

I understand that you need to put a udelay() while reading the freq from
hardware, that is fine, but why do you need a workqueue for that? Why can't you
just read the values directly from the same context ?

The register used to read the frequency is per core and is not accessible
from other cores. So we have to execute the function remotely, because the
target core whose frequency we read might be different from the current one.
The functions available for that are smp_call_function_single() and
queue_work_on(). We used queue_work_on() to avoid a long delay inside IPI
interrupt context with interrupts disabled.

Okay, I understand this now, finally :)

But if the interrupts are disabled during some call, won't workqueues face the
same problem ?

Yes, we are trying to minimize that case.

--
viresh